-2

I have a problem in C++ where I need to parse a file with multiple lines (random strings of random length), convert the uppercase characters to lowercase, and store the lines in a vector of strings. I'm trying to parse the file character by character, but I don't know how to identify the end of a line.

Ovidiu Firescu
  • 4
    Use `std::string` and `std::getline` to read and store lines of text. – NathanOliver Jul 11 '19 at 20:02
  • Just use `std::getline` or alternatively check for `'\n'` character. – ruohola Jul 11 '19 at 20:04
  • Doing this is the same as storing the file in a vector of strings and then parsing the vector of strings line by line, char by char, and transforming the uppercase to lowercase. I want to be as efficient as possible by parsing the whole file only once and doing the transformation while parsing the file. Edit: I tried checking for '\n', but it doesn't work. – Ovidiu Firescu Jul 11 '19 at 20:04
  • For sure, `'\n'` is the end of line character. If that isn't working for you, you're doing it wrong. – Khouri Giordano Jul 11 '19 at 20:27
  • ` ifstream test("test.txt"); char character; while (test >> character) { if (character == '\n') cout << "Worked"; }` This is how I tested with \n and it never wrote Worked in console. – Ovidiu Firescu Jul 11 '19 at 20:29
  • Note that checking for `'\n'` is harder than it looks. `'\n'` is a whitespace character and is usually discarded when performing formatted reads. – user4581301 Jul 11 '19 at 20:48
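
The point made in the comments above, that formatted extraction discards `'\n'`, can be illustrated with a small sketch. The helper names `countNewlinesFormatted` and `countNewlinesUnformatted` are hypothetical and exist only for this illustration; both take any `std::istream`, so a `std::istringstream` can stand in for the file.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: counts '\n' using formatted extraction (operator>>).
// operator>> skips whitespace by default, so the comparison never succeeds.
int countNewlinesFormatted(std::istream& in) {
    int count = 0;
    char c;
    while (in >> c) {
        if (c == '\n') ++count;
    }
    return count;
}

// Hypothetical helper: counts '\n' using unformatted get(),
// which delivers every character, including whitespace.
int countNewlinesUnformatted(std::istream& in) {
    int count = 0;
    char c;
    while (in.get(c)) {
        if (c == '\n') ++count;
    }
    return count;
}
```

For the input `"Ab\nCd\n"` the first function sees no newline at all, while the second one sees both. Another option, if formatted extraction has to stay, is to disable whitespace skipping with `in >> std::noskipws`.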

2 Answers

1

If you really want to parse a line character by character, you have a lot of work to do, and the result depends somewhat on your environment: lines can be terminated with '\n', '\r', or "\r\n".

I would really recommend using a function that has been designed to read a complete line: std::getline. If your lines do not contain whitespace, you could also read a string directly with the extraction operator, like this: std::string s; ifstreamVariable >> s;

To be independent of such behavior, we can implement a proxy class that reads a complete line and stores it in a std::string.

The file can then be read into a vector using the proxy class and the vector's range constructor.

To convert to lowercase, we will use std::transform, which is very simple.

Please see the following example code.

#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Test data standing in for the input file
std::istringstream testDataFile(
R"#(Line 0 Random length asdasdasfd
Line 1 Random length asdasdasfd sdfgsdfgs sdfg
Line 2 Random length asdasdasfd sdfgs sdfg
Line 3 Random length a sddfgsdfgs sdfg
Line 4 Random length as sds sg
)#");


class CompleteLine {    // Proxy for the input iterator
public:
    // Overload the extractor: read a complete line
    friend std::istream& operator>>(std::istream& is, CompleteLine& cl) { std::getline(is, cl.completeLine); return is; }
    // Convert the type 'CompleteLine' to std::string
    operator std::string() const { return completeLine; }
protected:
    // Holds the most recently read line
    std::string completeLine{};
};

int main()
{
    // Read the complete source into the vector by defining the variable and using its range constructor
    std::vector<std::string> strings{ std::istream_iterator<CompleteLine>(testDataFile), std::istream_iterator<CompleteLine>() };

    // Convert all strings in the vector to lowercase.
    // The inner lambda takes unsigned char because passing a negative char to tolower is undefined behavior.
    std::for_each(strings.begin(), strings.end(), [](std::string& s) {
        std::transform(s.begin(), s.end(), s.begin(),
            [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    });

    // Debug output: copy all data to std::cout
    std::copy(strings.begin(), strings.end(), std::ostream_iterator<std::string>(std::cout, "\n"));

    return 0;
}

This is the more idiomatic C++ way of solving such a problem.

By the way, you can replace the istringstream with a std::ifstream and read from a file; the rest of the code stays the same.

Armin Montigny
  • As I understand it, you store the data from the file in the vector of strings and then use transform to convert the uppercase characters to lowercase. This is a way to avoid dealing with the end of line, but isn't this method less efficient than reading the whole file only once and doing the transformation while reading? I also could have used copy to read the data into the vector of strings. I didn't mention it in the description, but the lines don't contain spaces. – Ovidiu Firescu Jul 11 '19 at 20:45
  • 1
    You could also use std::copy with a std::back_inserter, but that's one more statement. This method reads the file only once into the vector, that is, into memory. But yes, you are right, I then need an additional iteration over all characters in the vector. We could also do the transformation in the proxy class. That would save the iteration over the vector, but we would still have the transform loop. I am not sure whether char-based input would save more time. It needs to be tested. – Armin Montigny Jul 11 '19 at 20:56
  • I think I will use your method. The way I did it before posting was with a copy and a transform loop. Then I thought I could parse the file character by character, but I couldn't find a way to identify when a new line is read. I wanted to make the program as efficient as possible because it will be tested on a large block of data. – Ovidiu Firescu Jul 11 '19 at 21:04
  • Found a way: you can read a file character by character with `nameFile.get(nameCharacter)`. get() also reads the end-of-line character, so that way you can tell when a line has ended. – Ovidiu Firescu Jul 11 '19 at 21:48
-2

ifStreamObject.eof() is used to check for the end of the file

if(!obj.eof())
{
   //your code
}

Using this will solve your problem.

  • My mistake, I wrote in the title end of file, but in the problem and description I explained I need the end of line. – Ovidiu Firescu Jul 11 '19 at 20:21
  • 1
    https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-i-e-while-stream-eof-cons – hegel5000 Jul 11 '19 at 20:23
  • 1
    It’s used to check that an attempted input operation **failed** because it hit the end of the file. Until an operation has failed it will return false. So it’s not a generic test for end of file. It’s usually irrelevant. A program normally reads input until it fails, and then it’s done. – Pete Becker Jul 11 '19 at 20:24
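
The behavior described in the last comment can be seen in a small sketch. The helper names `countWithEofLoop` and `countWithReadLoop` are hypothetical. The first loop tests eof() before reading, so it runs one extra time, because eof() only becomes true after a read has already failed. The second loop uses the read itself as the condition.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: loop guarded by eof(). The body runs once more
// than there are characters, because eof() is set only by a failed read.
int countWithEofLoop(std::istream& in) {
    int iterations = 0;
    char c;
    while (!in.eof()) {
        in.get(c);      // the final call fails and leaves c unchanged
        ++iterations;
    }
    return iterations;
}

// Hypothetical helper: loop guarded by the read itself.
// The body runs exactly once per character that was read.
int countWithReadLoop(std::istream& in) {
    int iterations = 0;
    char c;
    while (in.get(c)) {
        ++iterations;
    }
    return iterations;
}
```

For the two-character input `"ab"` the first function reports three iterations and the second reports two.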