I thought the sub() function replaces only the first match of a pattern.

However, in this example, sub() seems to have replaced all the matches. Can someone explain why?

awards <- c("Won 1 Oscar.",
  "Won 1 Oscar. Another 9 wins & 24 nominations.",
  "1 win and 2 nominations.",
  "2 wins & 3 nominations.",
  "Nominated for 2 Golden Globes. 1 more win & 2 nominations.",
  "4 wins & 1 nomination.")

sub(".*\\s([0-9]+)\\snomination.*$", "\\1", awards)

Output:

[1] "Won 1 Oscar." "24"           "2"            "3"            "2"           
[6] "1"

I expected the output to be:

[1] "Won 1 Oscar."
[2] "24"
[3] "1 win and 2 nominations."
[4] "2 wins & 3 nominations."
[5] "Nominated for 2 Golden Globes. 1 more win & 2 nominations."
[6] "4 wins & 1 nomination."

  • FYI, per https://stackoverflow.com/editing-help: a code block starts with triple backticks (not single quotes), optionally followed by a syntax hint, and then a newline (no code on the same line). It ends with triple backticks and a newline. Single quotes never work as code fences, and code can never be on the same line as the fence. I've edited your question to reflect this; please take a look and use the same formatting in future questions. It makes a difference when reading the question. Thanks! – r2evans Jun 30 '20 at 20:18
  • Your `.*` is being *greedy* (in regex terms). What is your expected output? – r2evans Jun 30 '20 at 20:20
  • What is your expected output? – akrun Jun 30 '20 at 20:21
  • Doesn't ".*" mean "any character, matched zero or more times"? – NomardicRoku Jun 30 '20 at 20:31
  • NomardicRoku, perhaps it was done hastily, but your desired outcome only changes the second element out of the six. That inconsistency is only clouding the water, so to speak. Perhaps you are needing `stringr::str_extract_all(awards, "\\b([0-9]+)(?=\\s?nominations?)")`? – r2evans Jun 30 '20 at 22:03
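
To make the points from the comments concrete, here is a minimal base-R sketch. It assumes the goal is to pull out the number that precedes "nomination(s)"; the question does not confirm that. It shows that sub() makes at most one replacement per vector element, and that the greedy `.*` at both ends lets that single match cover the whole string. The shortened `awards` vector and the names `pattern`, `res`, `whole`, `m`, and `nums` are only illustrative, and the last step is a base-R adaptation of the stringr suggestion above, not the commenter's exact code.

```r
awards <- c("Won 1 Oscar.",
            "Won 1 Oscar. Another 9 wins & 24 nominations.",
            "1 win and 2 nominations.")

# sub() is vectorized: at most ONE replacement per element,
# not one replacement for the whole vector.
sub("o", "0", c("foo", "boo"))   # "f0o" "b0o"
gsub("o", "0", c("foo", "boo"))  # "f00" "b00"

# The leading ".*" and trailing ".*$" are greedy, so the single match
# spans the entire string; replacing it with "\\1" leaves only the number.
pattern <- ".*\\s([0-9]+)\\snomination.*$"
res <- sub(pattern, "\\1", awards)
res                              # "Won 1 Oscar." "24" "2"

# Check: where the pattern matches, the match length equals the string length.
whole <- attr(regexpr(pattern, awards), "match.length") == nchar(awards)
whole                            # FALSE TRUE TRUE

# Assumed goal: extract the number, with NA where nothing matches.
m <- regexpr("[0-9]+(?=\\s?nominations?)", awards, perl = TRUE)
nums <- rep(NA_character_, length(awards))
nums[m > 0] <- regmatches(awards, m)
nums                             # NA "24" "2"
```

The first element is returned unchanged by sub() because the pattern does not match it at all, which is why "Won 1 Oscar." survives in the output.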
