1

I have a string that looks something like this:

'som,ething', another, 'thin#'g', 'her,e', gggh*

I am trying to get it to split on the commas that are NOT in the elements, like this:

'som,ething'
another
'thin#'g'
'her,e'
gggh*

I am using parse_line(q{,}, 1, $string) from Text::ParseWords, but it seems to fail when the string has single quotes in it. Is there something I'm missing?

  • @vks This is NOT a duplicate. First of all, the question is for Perl, second of all the question states that the single quotes within the string are the issue - not the commas. – werhgvfwe5r Aug 26 '15 at 11:47
  • Do you actually have mismatched quotes - `'thin'g',` - that makes the problem considerably harder (how would you tell which field the following comma belongs to?) – Sobrique Aug 26 '15 at 12:04
  • @werhgvfwe5r, all you really need is the regexp from linked question. The Perl wrapper around it is absolutely trivial. – Oleg V. Volkov Aug 26 '15 at 12:28
  • @Sobrique, yes, sadly the quotes are mismatched. However, I made a small mistake in my original question. The quotes that are in quotes are usually escaped with a special character (`#`). Updating the question – werhgvfwe5r Aug 26 '15 at 12:48

3 Answers

1
#!/usr/bin/perl
use strict;
use warnings;
my $string = q{'som,ething', another, 'thin'g', 'her,e', gggh*};
my @splitted = split(/,(?=\s+)/, $string);
print $_."\n" foreach @splitted;

Output:

'som,ething'
 another
 'thin'g'
 'her,e'
 gggh*

Chankey Pathak
  • Uh, while it "works" with this particular string, it obviously doesn't match task described in question: `q{'som , ething', another, 'thin'g', 'her,e', gggh*}` = fail. – Oleg V. Volkov Aug 26 '15 at 12:31
  • Forgot to mention that inner quotes are escaped by a hash (`#`). I assume that might make things easier. – werhgvfwe5r Aug 26 '15 at 12:49
  • This solution also assumes that the commas are all followed by a space. It's possible there is none: `a word,'another',field, here` – werhgvfwe5r Aug 26 '15 at 12:51
0

It looks like you're trying to parse comma-separated values. The answer is to use Text::CSV_XS, since it handles the various weird cases you're likely to find in the data. See "How can I parse quoted CSV in Perl with a regex?"
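
A minimal sketch of what that could look like for the string in the question. The option values are assumptions based on the comments (single quote as the quote character, `#` as the escape character, optional spaces around the commas), not something the answer itself specifies, and Text::CSV_XS has to be installed from CPAN.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

my $string = q{'som,ething', another, 'thin#'g', 'her,e', gggh*};

# Assumed settings: fields are quoted with ', an embedded quote is
# escaped with #, and whitespace around the separators is not significant.
my $csv = Text::CSV_XS->new({
    quote_char       => q{'},
    escape_char      => q{#},
    allow_whitespace => 1,
}) or die "Cannot create parser: " . Text::CSV_XS->error_diag;

$csv->parse($string)
    or die "Parse failed: " . $csv->error_diag;

my @fields = $csv->fields;
print "$_\n" for @fields;
```

Note that a CSV parser returns the field contents, so the surrounding quotes and the `#` escape should be removed from the results rather than kept as in the desired output shown in the question.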

brian d foy
-1

Using split is not the way to go. If you are sure your string is well formed, a global match is simpler. For example:

my $line = "'som,ething', another , 'thin#'g', 'her,e' , gggh*";

my @list = $line =~ /\s*('[^#']*(?:#.[^#']*)*+'|[^,]+(?<=\S))\s*/g;

print join("|", @list);

(the (?<=\S) is only here to trim items on the right)
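
As a rough illustration, the same pattern can be wrapped in a small helper and tried on the case raised in the comments, where a comma is not always followed by a space. The name `split_fields` is hypothetical, and the sketch assumes every quote inside a quoted field is escaped with `#`. The possessive quantifier `*+` needs Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer): returns the fields captured
# by the global match above. Assumes inner quotes are always escaped with '#'.
sub split_fields {
    my ($line) = @_;
    return $line =~ /\s*('[^#']*(?:#.[^#']*)*+'|[^,]+(?<=\S))\s*/g;
}

my @fields = split_fields(q{a word,'another',field, 'thin#'g', here});
print "$_\n" for @fields;
# prints:
# a word
# 'another'
# field
# 'thin#'g'
# here
```

Two limitations worth keeping in mind: empty fields (as in `a,,b`) are skipped rather than returned as empty strings, and an unescaped inner quote such as `'thin'g'` is split into two items.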

Casimir et Hippolyte