Hello, I have a data frame such as:

COL1
SEQ_1.1_0
SEQ2.2_2
AB_1_2.3_3
ACC.3_3

and I would like to use strsplit to split each value at the last underscore (the one that follows the last "number_")

and get:

COL1      COL2
SEQ_1.1   0
SEQ2.2    2
AB_1_2.3  3
ACC.3     3

So far I tried:

strsplit(df$COL1, "*.[0-9]_")

Here is the code I currently use and need to keep using:

df$shorti = do.call(rbind, strsplit(as.character(df$COL1), "*.[0-9]_"))[,1]
chippycentra

3 Answers

Using tidyr::extract:

tidyr::extract(df, COL1, c('COL1', 'COL2'), regex = '(.*)_(.*)', convert = TRUE)

#      COL1 COL2
#1  SEQ_1.1    0
#2   SEQ2.2    2
#3 AB_1_2.3    3
#4    ACC.3    3

With strsplit, using a regex with a negative lookahead (it matches only the underscore that is not followed by any later underscore):

result <- do.call(rbind, strsplit(df$COL1, '(_)(?!.*_)', perl = TRUE))
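As a minimal sketch of how the lookahead split could feed both columns at once, assuming COL1 is stored as character (not factor) and that every value contains at least one underscore. The column name shorti comes from the question; parts is a name introduced here for illustration only.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps COL1 as character
df <- data.frame(
  COL1 = c("SEQ_1.1_0", "SEQ2.2_2", "AB_1_2.3_3", "ACC.3_3"),
  stringsAsFactors = FALSE
)

# Split on the underscore that has no other underscore after it
parts <- do.call(rbind, strsplit(df$COL1, "_(?!.*_)", perl = TRUE))

df$shorti <- parts[, 1]              # text before the last underscore
df$COL2   <- as.integer(parts[, 2])  # text after it, converted to integer
```

Because each string is split exactly once, rbind should produce a two-column character matrix, so the [,1] indexing from the question keeps working unchanged.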
Ronak Shah

Using substr (this assumes the part after the last underscore is a single character):

> dat                  
        COL1
1  SEQ_1.1_0
2   SEQ2.2_2
3 AB_1_2.3_3
4    ACC.3_3
> dat$COL2 <- substr(dat$COL1, nchar(dat$COL1), nchar(dat$COL1))
> dat$COL1 <- substr(dat$COL1, 1, nchar(dat$COL1) - 2)
> dat
      COL1 COL2
1  SEQ_1.1    0
2   SEQ2.2    2
3 AB_1_2.3    3
4    ACC.3    3
Karthik S

Here's a base R solution with sub:

Data:

df <- data.frame(
  COL1 = c("SEQ_1.1_0",
  "SEQ2.2_2",
  "AB_1_2.3_3",
  "ACC.3_3")
)

Solution:

df$COL2 <- sub(".*(\\d$)", "\\1", df$COL1) 
df$COL1 <- sub("_\\d$", "", df$COL1)

Result:

df
      COL1 COL2
1  SEQ_1.1    0
2   SEQ2.2    2
3 AB_1_2.3    3
4    ACC.3    3
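The two sub calls above rely on the suffix being a single digit. If longer suffixes can occur, a variant that depends only on the position of the final underscore might look like the sketch below. The vector x and the value "SEQ_1.1_12" are made up for illustration and are not part of the question's data.

```r
# First two values are from the question; the last one is a hypothetical
# multi-digit case added for illustration
x <- c("SEQ_1.1_0", "AB_1_2.3_3", "SEQ_1.1_12")

prefix <- sub("_[^_]*$", "", x)  # drop the last underscore and everything after it
suffix <- sub(".*_", "", x)      # greedy .* consumes everything up to the last underscore
```

Here prefix and suffix play the roles of COL1 and COL2; suffix stays character, so wrap it in as.integer() if a numeric column is wanted.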
Chris Ruehlemann