
I'm quite new to C++, but I have some coding experience with R. A few weeks ago I started putting together a small application that copies and renames file pairs (.seq/.ab1) produced by a DNA sequencer analysis. Renaming hundreds of them manually would be a real waste of time, especially since we have lists with their new names.

Everything seemed fine, but the new (copied) files appear with a "special character" in their names, right before the file extension. It looks like a space, but it's not (I replaced it with a space, and the file opened correctly). After deleting it, the file can be opened by its associated application, but with it, the application reports the file as corrupted.

The issue seems to come from the code that uses the ostringstream::str member function, but I honestly don't know how to fix it. I wonder whether it is inserting a null character there, before I append the file extension...

Here is the part of the code responsible. It reads the old and new names from a two-column CSV file whose fields are separated by ";". The original data and the new (renamed) files are kept in different directories, which is why I need to build a string with each file path inside a for loop. I intend to compare the contents of the old and new files later, probably with memcmp, but first I need them to be correctly renamed.

I'm on an Ubuntu 14.04 (64-bit) machine with gcc 4.8.4 as the compiler. I apologize in advance for the probably poor coding and bad English; I'm not a native speaker (writer, actually).

    fNew.open(filename);
    std::ostringstream oldSeqName (std::ostringstream::ate);
    std::ostringstream newSeqName (std::ostringstream::ate);
    std::ostringstream oldAb1Name (std::ostringstream::ate);
    std::ostringstream newAb1Name (std::ostringstream::ate);

    std::fstream log;
    time_t now = time(0);

    for (std::string nOld, nNew; getline(fNew, nOld, ';') && getline(fNew, nNew); )
    {
        std::cout << "Old Name: " << nOld << " -> New Name: " << nNew << std::endl;

        // Keep a log of the name changes
        log.open("NameChangesLog.txt", std::fstream::out | std::fstream::app);
        log << ctime(&now) << " - " <<  "Old Name: " << nOld << " -> New Name: " << nNew << std::endl;
        log.close();

        // Create old seq files paths string
        oldSeqName.str(nOld);
        oldSeqName << ".seq";
        std::string osn = "./Seq/" + oldSeqName.str();

        // Create new seq files paths string
        newSeqName.str(nNew);
        newSeqName << ".seq";
        std::string nsn = "./renamed/" + newSeqName.str();

        std::ifstream ifseq(osn, std::ios::binary);
        std::ofstream ofseq(nsn, std::ios::binary);

        ofseq << ifseq.rdbuf();

        ifseq.close();
        ofseq.close();

        // Create old ab1 files paths string
        oldAb1Name.str(nOld);
        oldAb1Name << ".ab1";
        std::string oan = "./Seq/" + oldAb1Name.str();

        // Create new ab1 files paths string
        newAb1Name.str(nNew);
        newAb1Name << ".ab1";
        std::string nan = "./renamed/" + newAb1Name.str();

        std::ifstream ifab1(oan, std::ios::binary);
        std::ofstream ofab1(nan, std::ios::binary);

        ofab1 << ifab1.rdbuf();

        ifab1.close();
        ofab1.close();

    }

    fNew.close();
Rodrigo
  • By casting the char to int you could check the ASCII code of this additional character. That would tell you whether it is always the same character or something random. – Ardavel Dec 22 '15 at 13:21
  • I have no answer to your problem but just want to remark that this seems better suited to a little shell or perl script rather than C++. Fiddle with `sed` and `basename`. – Peter - Reinstate Monica Dec 22 '15 at 14:01
  • Yep, it would be nice as a programming exercise, but C++ is a huge overkill for such a problem. And if there is a language that makes it easy for novice programmers to make silly mistakes, it's likely that C++ would fit the bill nicely. And you probably don't want to make such mistakes with production data. – Maarten Bodewes Dec 22 '15 at 14:08
  • An old professor of mine once said that the answer may be simpler than we expect. @Arkadiy is completely right. The files were generated under Windows. Converting them or creating them on Ubuntu solved the issue. Thank you very much. – Rodrigo Dec 22 '15 at 14:30
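
The check suggested in the first comment (casting each char to int) could look roughly like the sketch below. The helper name `byteCodes` is hypothetical and not part of the original program; it only lists the numeric code of every byte in a string so that an invisible character becomes visible.

```cpp
#include <sstream>
#include <string>

// Hypothetical diagnostic helper (not from the original code):
// returns the decimal code of every byte in s, separated by spaces.
inline std::string byteCodes(const std::string& s) {
    std::ostringstream out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (i != 0) out << ' ';
        // Go through unsigned char so bytes above 127 are not shown as negative.
        out << static_cast<int>(static_cast<unsigned char>(s[i]));
    }
    return out.str();
}
```

Printing `byteCodes(nNew)` inside the loop would show a trailing 13 if the name ends with a carriage return (\r).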

2 Answers


Was the list file prepared on a Windows machine? In that case it would have DOS line endings (\r\n), which getline on Unix does not handle well: it splits on \n and leaves the \r at the end of the line. The character you see is likely \r. Make sure you run the dos2unix utility on the list file before feeding it to your program.

  • Or open the stream in text mode instead of binary. It makes a difference on Windows. – Zan Lynx Dec 22 '15 at 17:24
  • Text mode will not help when you run on Unix. Unix text mode does not know about \r –  Dec 22 '15 at 17:49

You probably forgot to trim the values returned from getline, so they may still contain whitespace. Trailing whitespace is hard to spot and can trip up the application.
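
A trim step of this kind could be sketched as follows. The function name `trim` and the exact set of characters treated as whitespace are assumptions made for illustration; the standard library has no built-in trim for `std::string`.

```cpp
#include <string>

// Hypothetical helper: returns a copy of s with leading and trailing
// whitespace (space, tab, \r, \n, vertical tab, form feed) removed.
inline std::string trim(const std::string& s) {
    const char* const ws = " \t\r\n\v\f";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) {
        return std::string();  // s is empty or whitespace only
    }
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```

Applied to both values read in the loop (for example `nOld = trim(nOld); nNew = trim(nNew);`), this would also remove a trailing \r, at the cost of changing names that intentionally begin or end with a space.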

Community
Maarten Bodewes