
I have three dataframes that each have an rn_description column and a label column. For each dataframe, I want to read the text in the rn_description column and, depending on which strings grepl finds, set the label column accordingly.

I've tried

l <- list(df1_wlabel = df1_wlabel, df2_wlabel = df2_wlabel, df3_wlabel = df3_wlabel)

l <- lapply(l, function(df) {
  df$label <- ifelse(grepl("email|e-mail", df$rn_description), "email", df$label)
  df$label <- ifelse(grepl("phone|call|voicemail|conversation|convo",
                           df$rn_description), "phone", df$label)
  df$label <- ifelse(grepl("text", df$rn_description), "text", df$label)
  return(df)
})

I've also tried a for loop

for (i in 1:length(l)) {
  l[i]$label <- ifelse(grepl("email|e-mail", l[i]$rn_description), "email", l[i]$label)
  l[i]$label <- ifelse(grepl("phone|call|voicemail|conversation|convo",
                             l[i]$rn_description), "phone", l[i]$label)
  l[i]$label <- ifelse(grepl("text", l[i]$rn_description), "text", l[i]$label)
}

But neither approach changes the label column properly. Only writing out the changes explicitly for each dataframe works, like so:

df1_wlabel$label <- ifelse(grepl("email|e-mail", df1_wlabel$rn_description), "email", df1_wlabel$label)
df1_wlabel$label <- ifelse(grepl("phone|call|voicemail|conversation|convo",
                                 df1_wlabel$rn_description), "phone", df1_wlabel$label)
df1_wlabel$label <- ifelse(grepl("text", df1_wlabel$rn_description), "text", df1_wlabel$label)
# then do this again for each dataframe

Why do the first two approaches not work, while the last one does?

lvnwrth
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. After the code runs, are you checking the values in the list `l`? Because that's what your code is updating. You are creating (lazy) copies of the data.frames when you create the list. – MrFlick Oct 22 '20 at 20:49
  • I didn't realize that only the dataframes in the list get updated. Upon checking list `l`, the dataframes in there are indeed updated. Is there any way to implement the code so that the original ones get updated as well? – lvnwrth Oct 22 '20 at 21:11
  • A better thing to fix would be the part where you created a bunch of data frames with indexes in their names. If you have related data, you should start with them in a list in the first place. Variables with numbers in their names are usually a bad R code smell. See this answer in particular: https://stackoverflow.com/a/24376207/2372064 – MrFlick Oct 22 '20 at 21:44
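
To illustrate the points raised in the comments, here is a minimal sketch rather than a definitive fix. The sample data and the helper name `add_labels` are made up for illustration; only the column names and the patterns come from the question. It shows that `lapply` updates the copies held in the list and not the original variables, that a `for` loop needs `l[[i]]` rather than `l[i]` to reach the data frame itself, and that `list2env` is one possible way to copy the results back over the original variables.

```r
# Hypothetical sample data; only the column names follow the question.
df1_wlabel <- data.frame(
  rn_description = c("sent an email", "left a voicemail", "no contact"),
  label = NA_character_,
  stringsAsFactors = FALSE
)
df2_wlabel <- data.frame(
  rn_description = c("text reminder", "phone call"),
  label = NA_character_,
  stringsAsFactors = FALSE
)

# Hypothetical helper wrapping the three ifelse() steps from the question.
add_labels <- function(df) {
  df$label <- ifelse(grepl("email|e-mail", df$rn_description), "email", df$label)
  df$label <- ifelse(grepl("phone|call|voicemail|conversation|convo",
                           df$rn_description), "phone", df$label)
  df$label <- ifelse(grepl("text", df$rn_description), "text", df$label)
  df
}

# The list holds copies of the data frames; lapply() returns modified copies.
l <- list(df1_wlabel = df1_wlabel, df2_wlabel = df2_wlabel)
l <- lapply(l, add_labels)

l$df1_wlabel$label  # labels are set in the list element
df1_wlabel$label    # the original variable is still all NA

# In a for loop, l[[i]] extracts the data frame itself, whereas
# l[i] returns a one-element list that has no label column.
for (i in seq_along(l)) {
  l[[i]] <- add_labels(l[[i]])
}

# One way to copy the list elements back over the original variables by name.
list2env(l, envir = environment())
df1_wlabel$label    # now matches l$df1_wlabel$label
```

As the last comment suggests, keeping the data frames in the list and working with `l` directly avoids the `list2env` step altogether.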

0 Answers