1

I'm having trouble coming up with the regex I need to do this find/replace in Notepad++. I'm fine with needing a couple of separate searches to complete the process.

Basically I need to add a | at the beginning and end of every line from a CSV, plus replace all the , with |. Then, on any value with only 1 character, I need to put two spaces on each side of the character ("A" becomes "  A  ").

Source:

col1,col2,col3,col4,col5,col6
name,desc,something,else,here,too
another,,three,,,
single,characters,here,a,b,c
last,line,here,,almost,

Results:

|col1|col2|col3|col4|col5|col6|
|name|desc|something|else|here|too|
|another||three||||
|single|characters|here|  a  |  b  |  c  |
|last|line|here||almost||

Adding the | to the beginning and the end of the line is simple enough, and replacing , with | is obviously straightforward. But I can't come up with the regex to find |x| where x is limited to a single character. I'm sure it is simple, but I'm new to regex.

MSCF

3 Answers

4

Regex:

(?:(^)|(?!^)\G)(?:([^\r\n,]{2,})|([^\r\n,]))?(?:(,$)|(,)|($))

Replacement string:

(?{1}|)(?{2}\2)(?{3}  \3  )(?{4}||)(?{5}|)(?{6}|)

Ugly, dirty and long, but it works.

Regex Explanation:

(?:                 # Start of non-capturing group (a)
    (^)                 # Assert beginning of line (CP #1)
    |                   # Or
    (?!^)               # Not at the beginning of a line (negative lookahead)
    \G                  # Match at previous matched position
)                   # End of non-capturing group (a)

(?:                 # Start of non-capturing group (b)
    ([^\r\n,]{2,})      # Match two or more characters (any except \r, \n or `,`) (CP #2)
    |                   # Or
    ([^\r\n,])          # Match one-char string (CP #3)
)?                  # Optional - End of non-capturing group (b)

(?:                 # Start of non-capturing group (c) 
    (,$)                # Match `,$` (CP #4)
    |                   # Or
    (,)                 # Match single comma (CP #5)
    |                   # Or
    ($)                 # Assert end of line (CP #6)
)                   # End of non-capturing group (c) 
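
For readers who want to try the same one-pass idea outside Notepad++, here is a minimal Python sketch. It is not a translation of the regex above: Python's built-in `re` module has no `\G` anchor and no `(?{n}...)` conditional replacement syntax, so the sketch assumes a callback that picks the replacement depending on which named group matched. The names `TOKEN`, `_replace` and `convert_line` are made up for illustration, and the outer pipes are added with plain string concatenation rather than by the pattern.

```python
import re

# One alternative per kind of token, mirroring the capture groups above:
# a field of two or more characters, a one-character field, or a comma.
TOKEN = re.compile(r"(?P<long>[^\r\n,]{2,})|(?P<single>[^\r\n,])|(?P<comma>,)")


def _replace(match: re.Match) -> str:
    """Choose the replacement based on which named group took part in the match."""
    if match.group("single") is not None:
        return "  " + match.group("single") + "  "  # pad one-character fields
    if match.group("comma") is not None:
        return "|"                                  # commas become pipes
    return match.group("long")                      # longer fields are kept as-is


def convert_line(line: str) -> str:
    """Convert one CSV line (without its line break) to the pipe-delimited form."""
    return "|" + TOKEN.sub(_replace, line) + "|"


if __name__ == "__main__":
    for line in ["name,desc,something,else,here,too",
                 "another,,three,,,",
                 "single,characters,here,a,b,c"]:
        print(convert_line(line))
```

Because the character class is `[^\r\n,]` rather than `\w`, values such as `some-value` are treated as a single field.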
revo
  • It's 2:30 AM here and I'm not fresh enough to add an explanation, but I'll do so ASAP. – revo Oct 08 '16 at 23:10
  • +1 for being a one-step solution, though I think a multi-step solution is much more readable (and could be used as a macro). – Sebastian Proske Oct 08 '16 at 23:13
  • For the most part that works. And being one step, while confusing, is efficient. But it seems to fail if there are non-alphanumeric characters in the text. For instance, if `some-value` is in there, it stops at the `-` and moves on to the next line. I tried to work out how to modify it, but I'm not knowledgeable enough in regex to figure it out. – MSCF Oct 09 '16 at 02:27
  • @MSCF It's a matter of changing `\w` to `[^\r\n,]`. Please check the update. – revo Oct 09 '16 at 10:09
  • Works like a champ. And thank you for the explanation. I can see how it works now. I need to study up on regex. – MSCF Oct 09 '16 at 21:05
1

The first replace adds | at the beginning and at the end, and replaces commas:

Search: ^|$|,
Replace: |

The second replace adds space around single character matches:

Search: (?<=[|])([^|])(?=[|])
Replace:   $1  

Add two spaces to the left and two to the right of $1.
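
The two steps can be checked against the sample data with a short Python sketch. It makes one assumption that differs from the searches above: step 1 is done with plain string operations instead of `^|$|,`, because regex engines do not all treat empty matches the same way. Step 2 uses the same lookaround pattern. The function name `csv_line_to_pipes` is hypothetical.

```python
import re


def csv_line_to_pipes(line: str) -> str:
    """Apply both steps to a single CSV line (without its line break)."""
    # Step 1: commas become pipes, and a pipe is added at both ends.
    piped = "|" + line.replace(",", "|") + "|"
    # Step 2: pad any single character sitting directly between two pipes.
    # The lookarounds check for the pipes without consuming them.
    return re.sub(r"(?<=[|])([^|])(?=[|])", r"  \1  ", piped)


if __name__ == "__main__":
    source = """col1,col2,col3,col4,col5,col6
name,desc,something,else,here,too
another,,three,,,
single,characters,here,a,b,c
last,line,here,,almost,"""
    for line in source.splitlines():
        print(csv_line_to_pipes(line))
```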

Sergey Kalinichenko
1

Three Step Solution:

  • Pattern: ^.+$ Replacement: |$0|
  • Pattern: , Replacement: |
  • Pattern: (?<=\|)([^|\r\n])(?=\|) Replacement: `  $0  ` (two spaces on each side of $0)
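
A small Python sketch of why the last pattern uses lookarounds rather than matching the pipes directly. With a consuming pattern, the closing pipe of one match is also the opening pipe of the next field, so adjacent one-character fields get skipped. The consuming pattern and the function names here are illustrative assumptions, not part of the steps above.

```python
import re


def pad_consuming(row: str) -> str:
    """Variant that matches the surrounding pipes as part of the match."""
    return re.sub(r"\|([^|\r\n])\|", r"|  \1  |", row)


def pad_lookaround(row: str) -> str:
    """Variant from the third step: the pipes are only asserted, not consumed."""
    return re.sub(r"(?<=\|)([^|\r\n])(?=\|)", r"  \1  ", row)


if __name__ == "__main__":
    row = "|here|a|b|c|"
    print(pad_consuming(row))   # the middle field "b" is left unpadded
    print(pad_lookaround(row))  # all three one-character fields are padded
```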
Sebastian Proske