
I am looking for a regular expression in R to replace a number that sits between two alphabetical characters. For example, replace 3 with m, like this:

  • Sa3ple becomes Sample

  • Sample1.3 stays Sample1.3

    • This word stays the same because 3 is not between alphabetical characters

I tried the R code below to replace 3 with m, but it only works partially.

One issue is that when the regex matches, instead of updating the matching row, it uses the first row of col3 every time. I'm not sure what exactly is missing.

df$col3[grep('[a-zA-Z][3][a-zA-Z]|[3][a-zA-Z]',df$col3)] <- gsub('[3]+', 'm', df$col3)
Brandon Minnick
  • Please do a minimum amount of research - plain code writing requests are not well received here. See [Reference - What does this regex mean](https://stackoverflow.com/q/22937618/205233) – Filburt Sep 01 '17 at 14:09
  • I could write you the code, but that would teach you that we will just give you answers without you showing proof of having tried anything. Instead, I'll present you all the information you need to complete this task. `(` starts a capture group while `)` closes it (syntax is `(...)` where the ellipsis is obviously replaced with *something* that you want to capture). `\w` matches any *word* character and `\d` matches any *digit*. `+` is a quantifier that specifies the match should include between 1 and unlimited of the previous character selection. – ctwheels Sep 01 '17 at 14:29
  • To continue my previous comment, in regex replacement `$` followed by a digit points to a capture group. So if you have a capture group `(...)`, `$1` will point to the captured contents. If you try to write some regular expressions and you update your question, you are more likely to get proper responses, as it shows that you've **tried something**. After all, why would we try to help someone who won't even try themselves? Show proof of trial and error and you'll get your answer. – ctwheels Sep 01 '17 at 14:31

1 Answer


regex is hard

pos <- "Sa3ple"
neg <- "Sample1.3"

# [a-zA-Z], not [a-zA-z]: the range A-z also matches [ \ ] ^ _ and `
gsub("([a-zA-Z])\\d([a-zA-Z])", "\\1m\\2", pos)
# [1] "Sample"

gsub("([a-zA-Z])\\d([a-zA-Z])", "\\1m\\2", neg)
# [1] "Sample1.3"
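
As for the "wrong row" problem in the question: the left-hand side there selects only the matching rows, while the right-hand side is the full-length result of gsub, so R fills the selected rows with values taken from the start of that vector (and warns about the length mismatch). Since gsub is vectorised and leaves non-matching strings untouched, the subsetting is not needed at all. A minimal sketch, assuming col3 is a character column; the sample values below are made up, and only the names df and col3 come from the question:

```r
# Hypothetical data standing in for the asker's data frame
df <- data.frame(col3 = c("Sa3ple", "Sample1.3", "Exa3ple"),
                 stringsAsFactors = FALSE)

# gsub returns one result per row; rows without a match come back unchanged,
# so the whole column can be assigned in one step
df$col3 <- gsub("([a-zA-Z])3([a-zA-Z])", "\\1m\\2", df$col3)

df$col3
# [1] "Sample"    "Sample1.3" "Example"
```

This version matches the literal 3 rather than \\d, so other digits between letters are left alone.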

Explanation

(...) is a capture group, which is referenced in the replacement with \\1, \\2, etc.
[a-zA-Z] matches a single lowercase or uppercase letter
\\d matches any single digit (add + or {2} to match more than one digit)
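
One caveat with capture groups: the letter after the digit is consumed by the match, so two digits that share a neighbouring letter (as in "a3b3c") are not both replaced. A possible alternative is to use lookarounds, which check the neighbouring letters without consuming them. This is only a sketch, it needs perl = TRUE, and the test strings and variable names are made up for illustration:

```r
x <- c("Sa3ple", "Sample1.3", "a3b3c")

# Capture-group version: after matching "a3b", the search resumes at "3c",
# which has no unconsumed letter in front of it, so the second 3 is missed
grouped <- gsub("([a-zA-Z])3([a-zA-Z])", "\\1m\\2", x)
grouped
# [1] "Sample"    "Sample1.3" "amb3c"

# Lookaround version: (?<=...) and (?=...) only assert that a letter is
# present on each side, so only the 3 itself is matched and replaced
lookaround <- gsub("(?<=[a-zA-Z])3(?=[a-zA-Z])", "m", x, perl = TRUE)
lookaround
# [1] "Sample"    "Sample1.3" "ambmc"
```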

I use this site to learn

CPak