
I have this code:

var lines = this.result.split('\n');
for (var line = 0; line < lines.length; line++) {
    console.log(lines[line]);
    var sublines = lines[line].split(' ');
    for (var subline = 0; subline < sublines.length; subline++) {
        console.log(sublines[subline]);
    }
}

I hoped this would extract the tokens from the string, so that I could then parse an integer from each token, but it seems that split(" ") is not what will work here!

Here is what I see:

(0, (u'5643145391', u'11367866245'))

being logged twice, which should mean that no split is done. In my real data the list of big numbers has 150 entries, but that shouldn't matter.

How can I split that (partially unicode) string?


jsFiddle to reproduce the issue. However, the fiddle seems to print the string only once, without indicating how many times the message was printed.


The desired output would be one line per iteration:

0
5643145391
11367866245

but anything close to this would be appreciated.

gsamaras

2 Answers


This seems like something a RegExp could be useful for. The pattern breaks down as follows:

(                              // begin capture group
  -?                           // match 0 or 1 minus sign
  \d{1,}                       // match 1 to unlimited digits
)                              // end capture group

var line = "(0, (u'5643145391', u'11367866245'))", 
    regex = /(-?\d{1,})/g;

console.log( line.match(regex) );
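
Since the question also asks for an integer to be parsed from every token, here is a minimal sketch of that extra step. The helper name `extractIntegers` is hypothetical and not part of the answer above, and the sketch assumes every value fits in a JavaScript Number without loss of precision (at most 2^53 - 1), which holds for the sample IDs. It is written without arrow functions, since the environment in the question did not support them.

```javascript
// Hypothetical helper (an assumption, not the answerer's code): find every
// base-10 integer literal in a line and convert each match to a Number.
function extractIntegers(line) {
    // match() with the g flag returns null when nothing matches
    var matches = line.match(/-?\d+/g);
    return matches ? matches.map(Number) : [];
}

var sample = "(0, (u'5643145391', u'11367866245'))";
var numbers = extractIntegers(sample);

// Print one value per line, as requested in the question
numbers.forEach(function (n) {
    console.log(n);
});
```

For IDs that may exceed 2^53 - 1, keeping the matched strings instead of converting them would avoid rounding.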
jdphenix
  • I am getting an invalid syntax for `=>`, any idea? – gsamaras Sep 10 '16 at 05:22
  • It depends on what you actually want to do with it. I've fed it to `console.log` to display it. Use `line.match(regex).forEach(console.log)` to do the same outside of the Stack snippet here. – jdphenix Sep 10 '16 at 05:24
  • The invalid syntax error is caused by the environment you're running it in not supporting ES2015 arrow functions, which isn't really a problem here. I'll post a fiddle without it. – jdphenix Sep 10 '16 at 05:25
  • What environment are you running your script in? – jdphenix Sep 10 '16 at 05:27
  • As far as I know, `String.prototype.match` has been around since antiquity. That's quite odd... – jdphenix Sep 10 '16 at 05:30
  • All good, my bad. Can you explain/teach me a bit on this regex? How did you come up with it? :) – gsamaras Sep 10 '16 at 05:35
  • I've spent a fair amount of time turning evil Governmentese data files into slightly less evil Governmentese data files. It's just lots of practice with them – jdphenix Sep 10 '16 at 05:36
  • It paid off! ;) Explain only the `-?` to me, I don't get your comment at all.. :/ – gsamaras Sep 10 '16 at 05:38
  • Ahh. It means that a match will happen, *even if there is a minus sign immediately preceding a number*. This will handle negative integers. It's important to note that my regular expression assumes that by "integer", you mean "a positive or negative integer in base-10 literal format with no grouping delimiters and no extra whitespace" – jdphenix Sep 10 '16 at 05:42
  • Also, https://regex101.com/#javascript is a resource I use when putting together more complex regular expressions. – jdphenix Sep 10 '16 at 05:43
  • These babies are IDs of 15T data, all non-negative! Great, I now understand, thanks, [time to understand caching now](http://stackoverflow.com/questions/39422370/understanding-caching). ;) I have seen that, tnx! – gsamaras Sep 10 '16 at 05:44
  • I am trying to make this regex more intelligent in [Intelligent regex to understand input](http://stackoverflow.com/questions/39477126/intelligent-regex-to-understand-input), if anyone is interested... :) – gsamaras Sep 13 '16 at 18:56
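
To illustrate the `-?` discussed in the comments above, the following sketch compares the pattern with and without the optional minus sign. The sample text and variable names are made up for illustration.

```javascript
// Same pattern with and without the optional leading minus sign
var withSign = /-?\d+/g;
var withoutSign = /\d+/g;

// Hypothetical input containing negative integers
var text = "offsets: -12, 7, -300";

var signed = text.match(withSign);
var unsigned = text.match(withoutSign);

console.log(signed);   // ["-12", "7", "-300"]
console.log(unsigned); // ["12", "7", "300"]
```

One caveat of this pattern: a hyphen that directly precedes digits is always treated as a sign, so `"5-3".match(/-?\d+/g)` yields `["5", "-3"]`. For all non-negative IDs, as in the question, `/\d+/g` is sufficient.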

You can first replace every run of characters that are not digits (or dots) in your string with a single ' ' and then split on it:

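// Read the text of the element with id "demo"
var lines = document.getElementById("demo").innerText;
// Replace every run of characters other than digits and '.' with one space
lines = lines.replace(/[^0-9\.]+/g, ' ');
lines = lines.trim();
var res = lines.split(' ');
console.log(res);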

Or in one line:

lines.replace(/[^0-9\.]+/g, ' ').trim().split(' ');
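
The replace-then-split idea can also be wrapped in a small function that works without the DOM. This is only a sketch: the name `splitDigits` is hypothetical, and it assumes that only integers are wanted, so the dot is left out of the character class. It also filters out empty tokens, because splitting an empty string on ' ' returns an array containing one empty string.

```javascript
// Hypothetical helper (an assumption, not the answerer's code): turn every
// run of non-digit characters into a single space, then split on spaces.
function splitDigits(text) {
    return text
        .replace(/[^0-9]+/g, ' ')
        .trim()
        .split(' ')
        // "".split(' ') gives [""], so drop empty tokens
        .filter(function (token) {
            return token.length > 0;
        });
}

var tokens = splitDigits("(0, (u'5643145391', u'11367866245'))");
console.log(tokens); // ["0", "5643145391", "11367866245"]

var none = splitDigits("no digits here");
console.log(none); // []
```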
naortor