
I'm using the excellent FileHelpers library to import many CSV files, but I've run into a problem. I have a CSV file with these three example lines

id,text,number
120,"good line this one",789
121,""not good" line", 4456
122,,5446

and this (example) class

  [IgnoreFirst(1)]
  [IgnoreEmptyLines()]
  [DelimitedRecord(",")]
  public sealed class JOURNAL
  {
    public Int32 ID;

    [FieldQuoted('"', QuoteMode.AlwaysQuoted, MultilineMode.NotAllow)]
    public string TEXT;

    public Int32? NUMBER;
  }

The problem with QuoteMode.AlwaysQuoted is that the line with ID 122 fails with this error:

The field 'TEXT' not begin with the QuotedChar at line 3. You can use FieldQuoted(QuoteMode.OptionalForRead) to allow optional quoted field

Switching to QuoteMode.OptionalForRead fails with a different error, this time for ID 121:

The field TEXT is quoted but the quoted char: " not is just before the separator (You can use [FieldTrim] to avoid this error)

So how can I handle a CSV file that has empty fields with no quotes AND quoted text fields with extra quotes inside the text?

edosoft

1 Answer


That looks like a case that we don't support. Let me add a test case and make it work in both modes. For the first one we need to validate whether the semantics are correct, i.e. whether QuoteMode.AlwaysQuoted can allow ,, or must require ,"", but the second option must work :) Cheers

Marcos Meli
  • It's so great to get an answer from the developer here on StackOverflow :) Just to clarify, will you modify FileHelpers so that QuoteMode.AlwaysQuoted will allow both ,, and ,"", OR will you modify it so that QuoteMode.OptionalForRead will allow ,""bad string" he said", i.e. nested quotes? – edosoft Feb 22 '11 at 15:34
  • I will fix QuoteMode.OptionalForRead for sure, and we can analyze together whether the first scenario must be allowed too. (Thinking a bit more, the semantics of AlwaysQuoted must not allow ,, ) What do you think? – Marcos Meli Feb 22 '11 at 16:47
  • I agree that ,, should not be allowed with AlwaysQuoted. Always suggests, well, always :) – edosoft Feb 22 '11 at 18:10
  • After checking the code and adding a test case, the problem is that a " inside the quoted field must be escaped, i.e. it must appear twice, like "" inside an @"" verbatim string in .NET. In your example the line must be: 121,"""not good"" line", 4456 Anyway, we can add a new mode or parameter to allow a more relaxed check of quoted strings. A problem case could be, for example, 121,""not good", line", 4456 – Marcos Meli Feb 22 '11 at 20:39
  • Was any mod added for this? I have the same problem with the following fields - [""test L"] and [""text1" text2"] and ["text""] and ["txt "t""] – Mr Shoubs Apr 13 '12 at 14:54
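
To illustrate the escaping rule described in the comments (a quote inside a quoted field must appear twice), here is a minimal sketch of a pre-processing step that could be run on each line before handing it to the parser. It is not part of FileHelpers: QuoteFixer and EscapeInnerQuotes are hypothetical names, and the heuristic is an assumption. It treats a quote as the field opener only at the start of a field, treats a quote as the closer only when it is immediately followed by the delimiter or the end of the line, and doubles every other quote inside the field. It also assumes the input is not already escaped, and it cannot resolve the ambiguous case mentioned above (a quote followed by a comma inside the text).

```csharp
using System;
using System.Text;

// Hypothetical helper, NOT part of FileHelpers.
public static class QuoteFixer
{
    // Doubles stray quotes inside quoted fields so that a strict CSV
    // parser sees them as escaped quotes.
    // Assumptions: one record per line, input not already escaped.
    public static string EscapeInnerQuotes(string line, char delimiter = ',', char quote = '"')
    {
        var sb = new StringBuilder(line.Length + 8);
        bool inQuoted = false;     // currently inside a quoted field
        bool atFieldStart = true;  // only whitespace seen since the last delimiter

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];

            if (!inQuoted)
            {
                sb.Append(c);
                if (c == quote && atFieldStart)
                {
                    // Opening quote of a quoted field.
                    inQuoted = true;
                    atFieldStart = false;
                }
                else if (c == delimiter)
                {
                    atFieldStart = true;
                }
                else if (!char.IsWhiteSpace(c))
                {
                    atFieldStart = false;
                }
                continue;
            }

            if (c != quote)
            {
                sb.Append(c);
                continue;
            }

            // A quote inside a quoted field: closer or stray quote?
            bool isCloser = i == line.Length - 1 || line[i + 1] == delimiter;
            if (isCloser)
            {
                sb.Append(c);
                inQuoted = false;
            }
            else
            {
                // Stray inner quote: emit it twice to escape it.
                sb.Append(c).Append(c);
            }
        }

        return sb.ToString();
    }
}
```

With this sketch, the line for ID 121 becomes 121,"""not good"" line", 4456 (the form given in the comment above), while the lines for IDs 120 and 122 pass through unchanged. The problem case 121,""not good", line", 4456 would still be split at the wrong place, because the quote after "good" is followed by a comma and so looks like a closing quote.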