I have a comma-separated list as shown below. The list is actually on one line, but I have split it up to show the syntax and to make clear that each unit contains 5 elements. There is no comma at the end of the list.

ro:2581,1309531682152,A,Place,Page,
me:2642,1310989368864,A,Place,Page,
uk:2556,1309267095061,A,Place,Page,
me:2642,1310989380238,D,Place,Page,
me:2642,1334659643627,D,Place,Page,
ro:3562,1378721526696,A,Place,Page,
uk:1319,1309337246675,D,Place,Page,
ro:2581,1379500694666,D,Place,Page,
uk:1319,1309337246675,A,Place,Page

What I am trying to do is remove any unit (full line) that does not begin with uk:. In other words, the result should be:

uk:2556,1309267095061,A,Place,Page,
uk:1319,1309337246675,D,Place,Page,
uk:1319,1309337246675,A,Place,Page

If the string were on separate lines as in my example, I could do this relatively easily, but because it is all on one line, I cannot get it to work. Can anyone point me in the right direction?

Thanks

Typhoon101
  • Just to be clear, does your input look something like this?: **ro:2581,1309531682152,A,Place,Pageme:2642,1310989368864,A,Place,Page** (note: there is no comma between "page" and "me") – nafas Aug 22 '14 at 13:34
  • Why do you need regex solution and what tool/platform are you using for this? – anubhava Aug 22 '14 at 13:35
  • Lots of confusing negations in your description: 'doesn't contain' and 'remove.. that does not begin with'. You just plainly want all "rows" that begin with uk, right? – KekuSemau Aug 22 '14 at 13:37
  • @nafas. There IS a comma between "page" and "me". My actual string is ro:2581,1309531682152,A,Place,Page,me:2642,1310989368864,A,Place,Page,uk:2556,1309267095061,A,Place,Page,me:2642,1310989380238,D,Place,Page,me:2642,1334659643627,D,Place,Page,ro:3562,1378721526696,A,Place,Page,uk:1319,1309337246675,D,Place,Page,ro:2581,1379500694666,D,Place,Page,uk:1319,1309337246675,A,Place,Page – Typhoon101 Aug 22 '14 at 13:37
  • I'm not sure I understand your question correctly. But maybe you're looking for something like this: `\b(?!uk)[a-z]+:\d+,\d+,[a-z]+,[a-z]+,[a-z]+,`. [See demo](http://regex101.com/r/jS1sS8/2). – Amal Murali Aug 22 '14 at 13:38
  • possible duplicate of [Regular expression to match string not containing a word?](http://stackoverflow.com/questions/406230/regular-expression-to-match-string-not-containing-a-word) – Joe Aug 22 '14 at 13:38
  • Check this out http://regex101.com/r/vG6gW3/1 – hex494D49 Aug 22 '14 at 13:38
  • @hex494D49 OP said the string is all one line. It was split up in the question for readability. – RevanProdigalKnight Aug 22 '14 at 13:39
  • @Joe: How is that a duplicate of this question? Only the title seems to be the same. – Amal Murali Aug 22 '14 at 13:39
  • @Typhoon101 ok, then the answer by **Revan** should do the trick for you. – nafas Aug 22 '14 at 13:39

2 Answers

3

This should work:

(uk:\d+,\d+,\w,\w+,\w+)

Demo

It looks for uk: and then it's pretty much comma-counting from there on: two runs of digits, a single word character, and two words, each separated by a comma.
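As a rough illustration of how that pattern behaves, here is a minimal sketch in Python. Python is an assumption made purely for demonstration (the question does not name a language), and the variable names are hypothetical. It collects every uk: unit from the one-line sample string and joins the matches back together with commas.

```python
import re

# Sample input from the question, joined onto a single line.
data = (
    "ro:2581,1309531682152,A,Place,Page,"
    "me:2642,1310989368864,A,Place,Page,"
    "uk:2556,1309267095061,A,Place,Page,"
    "me:2642,1310989380238,D,Place,Page,"
    "me:2642,1334659643627,D,Place,Page,"
    "ro:3562,1378721526696,A,Place,Page,"
    "uk:1319,1309337246675,D,Place,Page,"
    "ro:2581,1379500694666,D,Place,Page,"
    "uk:1319,1309337246675,A,Place,Page"
)

# The pattern from the answer: "uk:" followed by five comma-separated fields.
UK_UNIT = re.compile(r"(uk:\d+,\d+,\w,\w+,\w+)")

# With a single capturing group, findall returns the text of that group.
uk_units = UK_UNIT.findall(data)

# Rebuild a comma-separated list that holds only the uk: units.
result = ",".join(uk_units)
print(result)
```

This only helps when the tool can return matches; it does not cover the case where matched text can only be removed.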

EDIT:

Since OP has now clarified that what they're using can only remove strings:

,?[^u][^k]:\d+,\d+,\w,\w+,\w+

Demo 2

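This looks for an optional comma followed by one character that is not u and then one character that is not k, then a colon (:), and then the rest of the regex is the same. Because each character is tested separately, a prefix such as us: or ak: would not be matched, so those units would not be removed.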
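The removal can be sketched in Python as follows. Python stands in for the CMS language here, which is an assumption, as are all the names used. The sketch also tries a negative-lookahead variant in the spirit of the pattern suggested in the comments on the question; whether the CMS regex flavour supports lookahead is unknown. The `sample` string with a `us:` prefix is made up to show the difference between the two patterns.

```python
import re

# Sample input from the question, joined onto a single line.
data = (
    "ro:2581,1309531682152,A,Place,Page,"
    "me:2642,1310989368864,A,Place,Page,"
    "uk:2556,1309267095061,A,Place,Page,"
    "me:2642,1310989380238,D,Place,Page,"
    "me:2642,1334659643627,D,Place,Page,"
    "ro:3562,1378721526696,A,Place,Page,"
    "uk:1319,1309337246675,D,Place,Page,"
    "ro:2581,1379500694666,D,Place,Page,"
    "uk:1319,1309337246675,A,Place,Page"
)

# Pattern from the answer: removes units whose prefix is not "uk".
REMOVE_NON_UK = re.compile(r",?[^u][^k]:\d+,\d+,\w,\w+,\w+")

# Removing the matches can leave a stray leading comma, because the first
# kept unit keeps the comma that preceded it.
raw = REMOVE_NON_UK.sub("", data)
cleaned = raw.strip(",")
print(cleaned)

# Hypothetical input with a "us:" prefix: [^u] rejects the leading "u",
# so the answer's pattern leaves this unit in place.
sample = "us:1,2,A,Place,Page,uk:3,4,D,Place,Page"
unchanged = REMOVE_NON_UK.sub("", sample)

# Assumed alternative: a negative lookahead rejects only the exact prefix "uk:".
REMOVE_NON_UK_LOOKAHEAD = re.compile(r",?\b(?!uk:)\w+:\d+,\d+,\w,\w+,\w+")
lookahead_cleaned = REMOVE_NON_UK_LOOKAHEAD.sub("", sample).strip(",")
print(lookahead_cleaned)
```

On the sample data from the question both patterns leave the same three uk: units; they differ only for prefixes that share a letter position with uk.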

RevanProdigalKnight
  • It seems a big chunk of my original question has somehow been removed, which clearly has caused a bit of confusion. @RevanProdigalKnight, this is the closest answer. What is missing from the question is, I am using a custom language of my CMS, which only allows me to "remove" matched strings. Therefore, I actually need to match anything that DOES NOT begin with uk:, so I can remove it from the original. This will leave the lines that DO begin with uk:. In short, I need the opposite of this demo. I could probably use ((ro|me):\d+,\d+,\w,\w+,\w+), but in real life, there will be other values. – Typhoon101 Aug 22 '14 at 13:51
  • @Typhoon101 I've added a regex that should handle only removing the cases that don't begin with `uk:`. – RevanProdigalKnight Aug 22 '14 at 13:56
  • That is perfect. Thanks for your help – Typhoon101 Aug 22 '14 at 13:58
0

I would suggest a simple regex like this:

(\buk:.+?,Page)(?:,|$)

and grab matched group #1

RegEx Demo
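A minimal Python sketch of "grab matched group #1", assuming a language that exposes capture groups (the choice of Python and the names below are illustrative only). Note that the pattern relies on every unit ending in the literal ,Page, which holds for the sample data in the question.

```python
import re

# Sample input from the question, joined onto a single line.
data = (
    "ro:2581,1309531682152,A,Place,Page,"
    "me:2642,1310989368864,A,Place,Page,"
    "uk:2556,1309267095061,A,Place,Page,"
    "me:2642,1310989380238,D,Place,Page,"
    "me:2642,1334659643627,D,Place,Page,"
    "ro:3562,1378721526696,A,Place,Page,"
    "uk:1319,1309337246675,D,Place,Page,"
    "ro:2581,1379500694666,D,Place,Page,"
    "uk:1319,1309337246675,A,Place,Page"
)

# Group 1 captures the unit itself; the trailing comma (or end of string)
# is matched by the non-capturing group and left out of the capture.
UK_UNIT = re.compile(r"(\buk:.+?,Page)(?:,|$)")

# Take group 1 from every match.
uk_units = [match.group(1) for match in UK_UNIT.finditer(data)]
print(",".join(uk_units))
```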

anubhava