2

In PCRE how to find dashes between words

e.g.

First-file-111-222.txt
This-is-the-second-file-123-456.txt
And-the-last-one-66-77.txt

So, the dash between First and file (etc.)

Then I could replace them with a space.

With ([^a-zA-Z]\d(.+)) I could select the last part (dash+nbrs) but I don't know how to mark the other dashes.

== edit: the idea is to use a renamer tool (supporting regular expressions); the rename would then result in

First file-111-222.txt
This is the second file-123-456.txt
And the last one-66-77.txt

So the - after the last word and between the numbers should be kept in place; only the ones between words should be replaced.

  • I guess you missed the global flag in your replace. I am not sure which language you are using for the replacement. E.g. in sed we use it as follows: `sed s/find/replace/g`, i.e. `sed 's/-/ /g'`. The `g` at the end tells sed that ALL occurrences of `'-'` should be replaced with `' '`. – anishsane Jun 02 '13 at 07:11

4 Answers

1

Use look arounds:

(?i)(?<=[a-z])-(?=[a-z])

This matches dashes that have a letter preceding and a letter following.
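
As a rough check of the pattern above, here is a minimal sketch that applies it to the three file names from the question. It uses Python's `re` module as a stand-in for PCRE (an assumption; the renamer tool itself is not named), and the names `WORD_DASH`, `names` and `renamed` are only illustrative.

```python
import re

# A dash with a letter directly before it and a letter directly after it.
# (?i) makes [a-z] case-insensitive; the lookarounds do not consume the letters.
WORD_DASH = re.compile(r"(?i)(?<=[a-z])-(?=[a-z])")

names = [
    "First-file-111-222.txt",
    "This-is-the-second-file-123-456.txt",
    "And-the-last-one-66-77.txt",
]

# Replace only the dashes that sit between two letters.
renamed = [WORD_DASH.sub(" ", name) for name in names]

for new_name in renamed:
    print(new_name)
# First file-111-222.txt
# This is the second file-123-456.txt
# And the last one-66-77.txt
```

Because the letters are only asserted and never consumed, the dash after the last word (followed by a digit) and the dash between the numbers stay in place.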

Bohemian
  • Thank you again. Actually, as far as I can see, both are doing the same. Frankly, as a beginner, either one is magic to me and I need to study them both to see what is actually happening. I wouldn't have figured this out myself, though I did spend quite a lot of time trying. Anyway, thanks again! (This is my first post on stackoverflow, not sure whether I can tag 2 answers as a solution) – user2243577 Jun 03 '13 at 12:48
  • The difference is that "non-digits" is not the same as "letters". E.g. if a filename was `"abc-xyz-.txt"` and you used the other regex, you'd get `"abc xyz .txt"`, because `.` is a "non-digit", but with my regex you'd get `"abc xyz-.txt"` as desired. If you are confident that you won't get such occurrences, use the other regex because it's simpler. If you want to be safe, use mine. And no, you can't accept two answers :) – Bohemian Jun 03 '13 at 13:34
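
The difference described in the comment above can be reproduced with a short sketch. It assumes Python's `re` module as a stand-in for PCRE; the variable names are illustrative only.

```python
import re

# Letters on both sides of the dash (the lookaround answer).
letters_only = re.compile(r"(?i)(?<=[a-z])-(?=[a-z])")
# Any non-digit on both sides of the dash (the \D answer).
non_digits = re.compile(r"(?<=\D)-(?=\D)")

name = "abc-xyz-.txt"

by_letters = letters_only.sub(" ", name)
by_non_digits = non_digits.sub(" ", name)

print(by_letters)     # abc xyz-.txt
print(by_non_digits)  # abc xyz .txt
```

The second dash is followed by `.`, which is a non-digit but not a letter, so only the `\D` version replaces it.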
1

If I'm not missing anything, the following regex should work for you:

(?<=\D)-(?=\D)

It just means find a hyphen char if it is between 2 non-digit characters.

Live Demo: http://www.rubular.com/r/O2XUNaB02R

anubhava
  • Many thanks! This is MAGIC ! Really wonderful. Wouldn't be able to figure this out myself (newbie). Again, many thanks indeed!! – user2243577 Jun 02 '13 at 14:42
0

You need to enable global mode which will find and replace every occurrence in a matching text. Here is an example: http://www.regex101.com/r/hG5rX8 (note the g in the options).

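As for the actual regex, something as simple as \w-\w should suffice to find the dash. Note that \w also matches digits and that the pattern includes the characters on either side of the dash, so a replacement has to put those characters back (for example with capture groups).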
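
To see how `\w-\w` behaves in a substitution, here is a small sketch using Python's `re` module as a stand-in for PCRE (an assumption, as are the variable names). It compares replacing the whole match with a capture-group variant that restores the neighbouring characters.

```python
import re

name = "First-file-111-222.txt"

# Replacing the whole match also swallows the character on each side of the dash.
whole_match = re.sub(r"\w-\w", " ", name)

# Capturing the neighbours and writing them back keeps them intact.
capture_groups = re.sub(r"(\w)-(\w)", r"\1 \2", name)

print(whole_match)     # Firs il 1 22.txt
print(capture_groups)  # First file 111 222.txt
```

Since `\w` covers digits as well as letters, the capture-group version replaces the dashes around the numbers too, which differs from the result asked for in the question's edit.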

mart1n
0

Is it just the dashes that you want to work on? If so, the code below should do the trick, assuming you have your input in a file called foo

perl -pe "s/-/ /g" < foo

That'll give this output:

First file 111 222.txt
This is the second file 123 456.txt
And the last one 66 77.txt

The preceding s marks that the regex is to be used to make a substitution, and the trailing g says that it's a global replacement, so the interpreter should not stop after the first match that it finds.
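
For comparison, the effect of the global flag can be sketched in Python, where `re.sub` replaces every match by default and the `count` argument limits it. This is only an analogy to the Perl `s///` and `s///g` forms, not a translation of the one-liner above.

```python
import re

line = "First-file-111-222.txt"

# Stop after the first match, similar to s/-/ / without the g flag.
first_only = re.sub(r"-", " ", line, count=1)

# Replace every match, similar to s/-/ /g.
all_matches = re.sub(r"-", " ", line)

print(first_only)   # First file-111-222.txt
print(all_matches)  # First file 111 222.txt
```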

chooban