I have a .csv file with the following:

leería toda la linea,
si pudiera,

I've tried to read it with the following code:

#include <iostream>
#include <fstream>
#include <string>



int main(){
    std::wstring wstr;
    std::wifstream FILE("data.csv");

    if(FILE){
        getline(FILE, wstr, L',');
        std::wcout << wstr << std::endl;
    } else {
        std::cout << "The file failed to open";
    }
    FILE.close();
    return 0;
}

And the output is:

leer

It seems to stop reading when it reaches the 'í'. Why is my code not working?

Ken White
Aydnir
  • What platform is this on? Maybe the problem is with output, rather than input. Also, what is the character encoding of the input file? UTF-8 or some form of wide character encoding? – Paul Sanders May 16 '21 at 00:27
  • @PaulSanders It's not the output. I checked while debugging in VS Code: after reading the file, the variable wstr holds L"leer". I am going to edit the question to clarify that the output is not the problem. – Aydnir May 16 '21 at 00:30
  • @PaulSanders What do you mean by platform? My OS? I am on Linux (Ubuntu). The input file is in UTF-8. – Aydnir May 16 '21 at 00:32
  • OK, that helps. Now please tell me the character encoding of the input file. How did you create it? – Paul Sanders May 16 '21 at 00:33
  • Found a better duplicate, closing. – Paul Sanders May 16 '21 at 00:36
  • The character encoding is Unicode (UTF-8). I created the file manually with a text editor. – Aydnir May 16 '21 at 00:37
  • Since it's Linux not Windows, the only thing I can think of is that the upper bit is being ignored meaning it stops at 0x8a instead of 0x0a, but that seems unlikely. And I have no idea which encoding would generate 0x8a for that character anyway. – Mark Ransom May 16 '21 at 00:38
  • Based on the dup, I think you want: `FILE.imbue(std::locale(FILE.getloc(), new std::codecvt_utf8<wchar_t, 0x10ffff, std::consume_header>));` If your input file has no BOM, remove `std::consume_header`. – Paul Sanders May 16 '21 at 00:49
  • @PaulSanders Yes!! It worked! I tried it a few seconds ago, thanks! – Aydnir May 16 '21 at 00:55
  • A more common solution: ensure that you `#include <locale>`, and insert the following line into your program before you open any file: `std::locale::global(std::locale(""));`. That line means "make the global locale settings for this program the same as the locale configured when the program was executed". On Linux, the configured locale is almost always a UTF-8 locale, so UTF-8 files will be read correctly. It's possible that whoever uses your program has files that don't match their locale, but simply assuming that the file is UTF-8 isn't correct in that case either. – rici May 16 '21 at 03:55
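
The two suggestions in the comments can be sketched as a pair of helpers. This is a minimal sketch under some assumptions, not code from the thread: the function names `read_first_field_utf8` and `read_first_field_env_locale` are hypothetical, the file is assumed to have no BOM (so `std::consume_header` is left out), and the second helper imbues only the stream instead of calling `std::locale::global`, which is a deliberate variation on rici's suggestion. Note that `std::codecvt_utf8` is deprecated since C++17.

```cpp
#include <codecvt>    // std::codecvt_utf8: deprecated since C++17, removed in C++26
#include <fstream>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the thread): returns the first
// comma-delimited field of a file, decoding its bytes as UTF-8
// no matter which locale the process runs under.
std::wstring read_first_field_utf8(const std::string& path) {
    std::wifstream in;
    // Imbue before opening so the facet is in place for the first read.
    // The locale takes ownership of the facet allocated with new.
    in.imbue(std::locale(in.getloc(), new std::codecvt_utf8<wchar_t>));
    in.open(path);

    std::wstring field;
    if (in) {
        std::getline(in, field, L',');
    }
    return field;  // empty if the file could not be opened
}

// Hypothetical helper: same read, but decoding with the locale taken from
// the environment (LANG, LC_ALL, ...). Only correct when that locale's
// encoding matches the file's encoding.
std::wstring read_first_field_env_locale(const std::string& path) {
    std::wifstream in;
    try {
        in.imbue(std::locale(""));
    } catch (const std::runtime_error&) {
        // The environment names a locale that is not installed;
        // the stream keeps the locale it already had.
    }
    in.open(path);

    std::wstring field;
    if (in) {
        std::getline(in, field, L',');
    }
    return field;
}
```

Calling `read_first_field_utf8("data.csv")` in place of the `getline` call in the question should give the whole first field rather than stopping at the 'í'. Printing the result through `std::wcout` may still need a UTF-8 locale on the output side, for example via the `std::locale::global(std::locale(""))` call from the last comment.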

0 Answers