1

I am trying to write a regex that will let me parse the CSV files that Excel creates. I have noticed that when you export a CSV from Excel, a string field is enclosed in quotes. If that string contains quotes itself, each quote is escaped with another quote.

What I want to do is split each line that I parse into fields. In light of the above, I have to split on every comma that is not within quotes. My regex skills are terrible, so how would I do this?

I can split on a comma, but how do I say "only when it's not between quotes"?

$lines = file($toce_path);

foreach ($lines as $line) {

    $line_array = preg_split("/,/", $line);

    $test = "($line_array[0], $line_array[1], $line_array[2])";

    echo $test.'<br />';

} 

This question is exactly like mine, but its answer doesn't work with preg_split, which requires Perl-compatible regular expression syntax.

Thanks all for any help

Abs
  • Is it possible to use a proper CSV parser instead? – Michael Myers Aug 09 '10 at 14:29
  • The other question's regex seems Perl-compatible to me. I'd be surprised if it didn't work. – Dan Breen Aug 09 '10 at 14:32
  • @mmyers - I am not sure which ones are available. I did try to find something within PHP, but I read a few comments saying that fgetcsv doesn't work well with CSVs created by Excel. I am so close with this, though; I am just hopeless at regex. – Abs Aug 09 '10 at 14:33
  • @Dan - I kept getting a no delimiter error, but I just encased it with `/` and it seems to be working! Damn it. – Abs Aug 09 '10 at 14:35
  • possible duplicate of [Java: splitting a comma-separated string but ignoring commas in quotes](http://stackoverflow.com/questions/1757065/java-splitting-a-comma-separated-string-but-ignoring-commas-in-quotes) – Abs Aug 09 '10 at 14:36

3 Answers

4

Not exactly answering your question, but maybe solving your problem:

Have you tried fgetcsv() or str_getcsv()?

They're your best friends if you're dealing with CSV data.
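As a minimal sketch of what that looks like for a single line, the snippet below feeds a hypothetical Excel-style line to str_getcsv(). The sample line is made up for illustration, and passing an empty string as the escape character assumes PHP 7.4 or later; it turns off PHP's backslash escaping so that only the doubled-quote convention applies.

```php
<?php
// Hypothetical sample line: one plain field, one field containing a comma,
// and one field containing quotes escaped by doubling.
$line = 'some,"text, here","say ""hi"""';

// Delimiter ",", enclosure '"', and an empty escape character (PHP 7.4+).
$fields = str_getcsv($line, ',', '"', '');

print_r($fields);
```

The result has three fields: `some`, `text, here`, and `say "hi"`, with the surrounding quotes removed and the doubled quotes collapsed.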

timdev
1

Why don't you use PHP's built-in function?

http://php.net/manual/en/function.fgetcsv.php
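A rough sketch of the same loop as in the question, but reading with fgetcsv() instead of file() plus preg_split(). To keep it self-contained, the sample data is written to a `php://temp` stream; in practice you would fopen() the exported file instead. The sample rows are invented for illustration, and the empty escape character assumes PHP 7.4 or later.

```php
<?php
// Hypothetical sample data standing in for an Excel export (CRLF line endings).
$csv = 'name,comment' . "\r\n"
     . '"Smith, John","said ""hello"""' . "\r\n";

// Temporary in-memory stream so the example runs without a real file.
$handle = fopen('php://temp', 'r+');
fwrite($handle, $csv);
rewind($handle);

$rows = [];
// Length 0 means "no line length limit"; the empty escape string (PHP 7.4+)
// leaves quote doubling as the only escaping rule.
while (($fields = fgetcsv($handle, 0, ',', '"', '')) !== false) {
    $rows[] = $fields;
}
fclose($handle);

foreach ($rows as $fields) {
    echo '(' . implode(', ', $fields) . ")\n";
}
```

Because fgetcsv() reads from the file handle itself, it can also cope with quoted fields that span more than one line, which a line-by-line split cannot.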

ghoppe
0

This expression works with .NET, which is supposed to be Perl compatible: `(?<!\"\w*),`

For the input `some, "text, here"`, it matches only the comma after `some`.

AllenG
  • That would also split for `"multiple words, here"`, but `"wont"split,here`. There are ways to trick regex into finding tokens between quotes, but this isn't a good one, I'm afraid. – Kobi Aug 09 '10 at 16:53