I have seen how to remove specific chars from a string but I am not sure how to do it with a file open or if you can even do that. Basically a file will be open with anything in it, my goal is to remove all the letters a-z, special characters, and whitespace that may appear so that all that is left is my numbers. Can you easily remove all chars rather than specifying a,b,c etc when the file is open or would I have to convert it to a string? Also would it be better to do this in memory?

My code so far is as follows:

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main() {

    string filename;
    cout << "Enter the name of the data file to open" << endl;
    cin >> filename >> endl;

    ofstream myfile;
    myfile.open(filename);

    if (myfile.is_open()) { //if file is open then

        while(!myfile.eof()){ //while not end of file
                                //remove all chars, special and whitespace
        }
    }
    else{
        cout << "Error in opening file" << endl;
    }
        return 0;
}
  • Start with this: http://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong. – Fantastic Mr Fox Aug 31 '16 at 19:52
  • Why not extract only the numbers using [isdigit()](http://www.cplusplus.com/reference/cctype/isdigit/) and ignore everything else, instead? – Khalil Khalaf Aug 31 '16 at 19:53
  • Then, once you have read into a string just get all the numeric characters and write them back into a separate file. – Fantastic Mr Fox Aug 31 '16 at 19:53
  • @FirstStep wouldn't extracting only the numbers mess up the format? That is why I decided to do it this way. Because one line could be 272 and the next one 345, I don't want them to mush into 272345 – I'm here for Winter Hats Aug 31 '16 at 19:55
  • @user5468794 But you said you wanted to remove all whitespace ??? – Fantastic Mr Fox Aug 31 '16 at 19:56
  • You will have trouble with 1.0e2 –  Aug 31 '16 at 19:56
  • @user5468794 it depends on the problem. I don't know because you did not provide sample inputs and the desired outputs, or how you would like to populate the numbers: each line is a whole number, etc. – Khalil Khalaf Aug 31 '16 at 19:57

2 Answers

Preliminary remarks

If I understand correctly, you want to keep only the numbers. It is probably easier to keep the chars that are ASCII digits and discard everything else than to eliminate many other character classes and hope that only numbers remain.

Also, never loop on eof() to read a file. Loop on the stream itself instead: eof() only becomes true after a read has already failed.

Finally, you should read from an ifstream and write to an ofstream. Opening the input file with an ofstream, as in the question, truncates it.

First approach: reading strings

You can read and write the file line by line. You need enough memory to hold the longest line, but you benefit from the stream's buffering.

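if (myfile.is_open()) { // if the file is open
    string line;
    while (getline(myfile, line)) { // while a line was read successfully
        // erase-remove idiom: drop every char that is not a digit
        // (needs <algorithm> and <cctype>; isdigit expects an unsigned char value)
        line.erase(remove_if(line.begin(), line.end(), [](unsigned char c) { return !isdigit(c); }), line.end());
        ... // then write the line to the output file
    }
}
else ...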

Second approach: reading chars

You can read and write char by char, which gives you a very flexible way of handling individual characters (toggling flags, etc.). You also benefit from buffering, but you pay function-call overhead for every single char.

if (myfile) { // if the stream is in a good state
    int c;
    while ((c = myfile.get()) != EOF) { // while a char was read successfully
        // keep digits and newlines, drop letters, special chars and other whitespace
        if (isdigit(c) || c == '\n')
            ... .put(c); // then write the char to the output file
    }
}
else ...

Other approaches

You could also read into a large fixed-size buffer and process it as with the strings (but do not strip the LF in that case). The advantage is that memory use is not affected by very long lines in the file.

You could also determine the file size, and try to read the full file at once (or in very large chunks). You'd then maximize performance at the cost of memory consumption.

Christophe

This is just an example of how to extract the chars you want from a file with a dedicated filter:

std::string get_purged_file(const std::string& filename) {
  std::string strbuffer;
  std::ifstream infile;

  infile.open(filename, std::ios_base::in);
  if (infile.fail()) {
    // throw an error
  }

  char c;
  while ((infile >> c).eof() == false) {
    if (std::isdigit(c) || c == '.') {
      strbuffer.push_back(c);
    }
  }

  infile.close();
  return strbuffer;
}

Note: this is just an example and it leaves room for optimization. To give you an idea:

  • Read more than one char at a time (with a proper buffer).
  • Reserve memory in the string.

Once you have the "purged" buffer, you can overwrite your file or save the content to another file.

BiagioF
  • @FirstStep I cannot get it. The answer says *"Because iostream::eof will only return true after reading the end of the stream. It does not indicate, that the next read will be the end of the stream."* But that does not affect my code, so what do you mean? – BiagioF Aug 31 '16 at 20:14
  • @FirstStep please enlighten me. – BiagioF Aug 31 '16 at 20:17
  • Worst case of testing for EOF (which is usually wrong): `while ((infile >> c).eof() == false)` –  Aug 31 '16 at 20:24
  • @DieterLücking I didn't get it. What do you mean? – BiagioF Aug 31 '16 at 20:26