I'm only an occasional analyst and am hoping to get some guidance: I couldn't work out how to handle an expression that has a preceding numeric value.

My data is below. I am hoping to convert values such as "4D" and "5D" into "4 Door" and "5 Door"; a holds the current values and b the result I want.

a <- c("4D Sedan", "5D Wagon")
b <- c("4 Door Sedan", "5 Door Wagon")
dt <- cbind(a,b)

Thanks.

Jaap
Amit Verma

1 Answer

We can use gsub() here, searching for the pattern:

\\b(\\d+)D\\b

and replacing it with:

\\1 Door

Code:

> a <- c("4D Sedan", "5D Wagon", "AB4D car 5D")
> gsub("\\b(\\d+)D\\b", "\\1 Door", a)
[1] "4 Door Sedan"    "5 Door Wagon"    "AB4D car 5 Door"

Note that in the above example the 4D in AB4D car 5D does not get replaced, nor would we want it to be. The word boundaries (\\b) in \\b(\\d+)D\\b restrict the match to standalone tokens such as 4D or 5D, so digits followed by a D inside a longer word are left alone.
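To see what the word boundaries buy us, here is a small sketch that runs the same replacement with and without \\b. The variable names no_boundary and with_boundary are made up for illustration and are not part of the answer above.

```r
a <- c("4D Sedan", "5D Wagon", "AB4D car 5D")

# Without word boundaries, the 4D inside AB4D is rewritten as well
no_boundary <- gsub("(\\d+)D", "\\1 Door", a)

# With word boundaries, only standalone tokens such as 4D or 5D match
with_boundary <- gsub("\\b(\\d+)D\\b", "\\1 Door", a)

print(no_boundary)
print(with_boundary)
```

In the first result the last element should come out as "AB4 Door car 5 Door", which is exactly the unwanted replacement that the boundaries prevent.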

Tim Biegeleisen
  • Thanks. Just so that I can understand fully: in the replacement "\\1 Door", does the "1" stand for any numerical value? – Amit Verma Jun 30 '17 at 05:48
  • No, `\\1` is the first _capture group_, which is the quantity in parentheses in the pattern we used to search, i.e. `(\\d+)`. In other words, it is the _number_ in front of `D`, assuming we matched that pattern. – Tim Biegeleisen Jun 30 '17 at 05:48
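
As a rough illustration of the capture-group point in the comment above, the sketch below shows that \\1 carries whatever the first parenthesised group matched (not a literal 1), and that a second group is available as \\2. The inputs and the names multi_digit and swapped are invented for this example.

```r
# \\1 is whatever (\\d+) matched, so multi-digit numbers work too
multi_digit <- gsub("\\b(\\d+)D\\b", "\\1 Door", "12D Van")

# Two groups: \\1 is the digits, \\2 is the rest of the string
x <- c("4D Sedan", "5D Wagon")
swapped <- gsub("^(\\d+)D (.*)$", "\\2 with \\1 doors", x)

print(multi_digit)
print(swapped)
```

Groups are numbered by the order of their opening parentheses in the pattern, which is why the digits stay \\1 in both calls.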