-2

Hello there, I am having trouble building a regular expression for these strings:

TOWN                 ALe   Alx   Aus   Bau   Bem   Bra   Clq   Crk   DLk

AlbertLea              -     -    22     -     -     -     -     -     -     -

What I want is to split each string so that every token keeps all of the spaces that follow it, rather than splitting on the spaces and discarding them. For example, the resulting arrays would look something like this:

[TOWN                 ,ALe   ,Alx   ,Aus   ,Bau   ,Bem   ,Bra   ,Clq   ,Crk   ,DLk]



[AlbertLea              ,-     ,-    ,22     ,-     ,-     ,-     ,-     ,-     ,-     ]

Thank you.

Adam lazar
  • 1
    Possibly a simple [**`[\t ]+`**](https://regex101.com/r/uDaczB/1/) would do. You need to double escape backslashes in `Java`, so that it becomes `[\\t ]+`. – Jan Oct 22 '17 at 15:29
  • 1
    @Jan Maybe `\s+` would be better, as it matches every kind of whitespace – BackSlash Oct 22 '17 at 15:34
  • The expression `[\\t ]+` will split when it encounters a tab; what I have is a random number of spaces between each token. – Adam lazar Oct 22 '17 at 15:43

2 Answers

0

Use a look ahead, with a look behind:

(?<=\s)(?=\S)

The look behind (?<=\s) matches just after a whitespace character.

The look ahead (?=\S) matches just before a non-whitespace character.

The combination matches between whitespace and non-whitespace characters.

See live regex demo, and live java demo.
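
As a minimal Java sketch of the split described above, the pattern can be passed straight to `String.split`, with the backslashes doubled inside the string literal. The class and method names are made up for illustration, and the sample line is shortened from the question:

```java
class SplitKeepSpaces {
    // Hypothetical helper: splits at every position that has whitespace
    // on its left and a non-whitespace character on its right.
    // The match is zero-width, so no characters are removed from the tokens.
    static String[] splitKeepingSpaces(String line) {
        return line.split("(?<=\\s)(?=\\S)");
    }

    public static void main(String[] args) {
        String header = "TOWN                 ALe   Alx   Aus";
        for (String token : splitKeepingSpaces(header)) {
            // Brackets make the trailing spaces of each token visible.
            System.out.println("[" + token + "]");
        }
    }
}
```

Each token keeps the whitespace that follows it, so joining the pieces back together reproduces the original line. The last token only has trailing whitespace if the line itself ends with some.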

Bohemian
  • Lookarounds are very expensive, see https://regex101.com/r/uDaczB/1 (92 steps) vs. https://regex101.com/r/uDaczB/2 (788 steps), that is more than 8 times as many (and not really needed here). – Jan Oct 22 '17 at 15:50
  • Thank you so much, that's what I was looking for. Can you please explain to me what look ahead and look behind do? – Adam lazar Oct 22 '17 at 15:51
  • @Adam a look ahead *asserts*, without consuming input, that the input that follows matches the look ahead's regex. Same for look behinds, but they assert what preceded. – Bohemian Oct 22 '17 at 15:52
  • @Jan we're talking microseconds here. You wouldn't be able to notice the difference. Get over it and move on to something else. – Bohemian Oct 22 '17 at 15:54
-1

Rather than splitting on what you don't want, you can capture what you do want more easily and efficiently:

/(\S+[ \t]+)/

Demo

Note that this captures the trailing spaces and tabs, but not the \n\n you have in your example.
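
In Java, the capture approach could be sketched roughly as follows with `Pattern` and `Matcher`. The class and method names are hypothetical, the sample row is shortened from the question, and the regex is the one above without the surrounding slashes or the capture group:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureTokens {
    // A run of non-whitespace followed by one or more spaces or tabs.
    private static final Pattern TOKEN = Pattern.compile("\\S+[ \\t]+");

    // Hypothetical helper: collects every match of TOKEN, in order.
    static List<String> captureTokens(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String row = "AlbertLea              -     -    22     -     ";
        for (String token : captureTokens(row)) {
            // Brackets make the trailing spaces of each token visible.
            System.out.println("[" + token + "]");
        }
    }
}
```

Because the pattern requires at least one trailing space or tab, a final token that ends the line with no whitespace after it is not matched. Changing `[ \\t]+` to `[ \\t]*` would include it, though that is a variation on the pattern given here.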

dawg