I have the following string:
str <- "add2AHJJK_GLX_KLKNKMEMa13"
How can I use R to extract "GLX" from it, the word between the underscores? In the example, there are exactly two underscores, not more.
An option with gsub: match the characters that are not a _ ([^_]*) from the start (^) of the string up to and including the first _, or (|) everything from the next _ to the end of the string, and replace both matches with blank ("").
gsub("^[^_]*_|_.*", "", str)
#[1] "GLX"
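To see what each branch of the alternation removes, the two patterns can be applied one at a time with sub. This is only a sketch of the same idea; the names step1 and step2 are introduced here for illustration and are not part of the answer above.

```r
str <- "add2AHJJK_GLX_KLKNKMEMa13"

# First branch: drop everything from the start up to and including the first "_"
step1 <- sub("^[^_]*_", "", str)
step1
#[1] "GLX_KLKNKMEMa13"

# Second branch: drop the remaining "_" and everything after it
step2 <- sub("_.*", "", step1)
step2
#[1] "GLX"
```

gsub does both removals in a single call because it replaces every match of the combined pattern, not just the first.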
Or another option is extraction with regexpr/regmatches
regmatches(str, regexpr('(?<=_)\\w+(?=_)', str, perl = TRUE))
#[1] "GLX"
If it's always just the middle of three parts separated by "_", we can split the string on "_" and take the second piece.
library(stringr)
str_split(str, "_", simplify = TRUE)[, 2]
#[1] "GLX"
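The same split can probably be done in base R without loading stringr. A minimal sketch, assuming a single input string with exactly two underscores as in the question; parts and middle are illustrative names.

```r
str <- "add2AHJJK_GLX_KLKNKMEMa13"

# strsplit returns a list with one character vector per input string;
# fixed = TRUE treats "_" as a literal separator rather than a regex
parts <- strsplit(str, "_", fixed = TRUE)[[1]]

# The word between the two underscores is the second piece
middle <- parts[2]
middle
#[1] "GLX"
```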
You can use sub to extract a word between underscores.
sub('.*_(\\w+)_.*', '\\1', str)
#[1] "GLX"
Or str_match:
stringr::str_match(str, '_(\\w+)_')[, 2]