
I have read an Excel file with two sheets into R; the first sheet has five columns and the second has only one. After reading the first sheet into R, the data is still not well structured: it contains an unwanted extra column that I want to get rid of.

When I read the first sheet from the Excel file into R, it looks like this:

> head(data)
    User ID   Group       Week  Spend Purchases     
1 173366631    Test 2014-10-06 546.87         4 <NA>
2 144427921    Test 2014-10-06 218.09         3 <NA>
3 213641575    Test 2014-10-06  18.75         1 <NA>
4 614549153 Control 2014-10-06  29.98        15 <NA>
5  84652272    Test 2014-10-06 628.16         4 <NA>
6  75292137    Test 2014-10-06   8.46         1 <NA>
  structure(c(NA_character_, NA_character_, NA_character_, NA_character_, 
1                                                                     <NA>
2                                                                     <NA>
3                                                                     <NA>
4                                                                     <NA>
5                                                                     <NA>
6                                                                     <NA>

How can I get rid of the column after the "Purchases" column, so that my data keeps only the five columns User ID, Group, Week, Spend, and Purchases?

amonk
  • Please note that the duplicate describes how to drop (or keep) columns both using _integer indexing_, and by using a _vector of column names_. – Henrik Feb 28 '16 at 15:53
  • This will remove columns in data.frame `DF` that are all NA: `DF[ ! apply(is.na(DF), 2, all) ]` – G. Grothendieck Feb 28 '16 at 15:58
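The two suggestions in the comments above can be sketched on a small stand-in data frame. The data frame `DF` and its all-NA column `junk` are made up for illustration and are not the asker's actual data:

```r
# Hypothetical stand-in for the imported sheet: five real columns plus
# one column that is entirely NA (named `junk` here for illustration).
DF <- data.frame(
  User_ID   = c(173366631, 144427921, 213641575),
  Group     = c("Test", "Test", "Control"),
  Week      = as.Date(rep("2014-10-06", 3)),
  Spend     = c(546.87, 218.09, 18.75),
  Purchases = c(4, 3, 1),
  junk      = NA_character_
)

# Keep columns by a vector of column names
keep <- c("User_ID", "Group", "Week", "Spend", "Purchases")
by_name <- DF[keep]

# Drop every column whose values are all NA
all_na <- colSums(is.na(DF)) == nrow(DF)
no_na_cols <- DF[!all_na]
```

Note that the all-NA filter would also drop a genuine column that happens to be completely missing, so selecting by name is the safer choice when the wanted columns are known.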

2 Answers


We can use `[` to subset the columns by position:

data <- data[1:5]

If the cutoff needs to be tied to the "Purchases" column by name:

data[seq(grep("Purchases", colnames(data)))]
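One caveat: `grep()` does pattern matching, so it would also pick up any other column whose name merely contains "Purchases". A possible variant, sketched here on a made-up data frame (the column `extra` is a hypothetical stand-in for the unwanted trailing column), uses `match()` for an exact name lookup:

```r
# Hypothetical data frame; `extra` stands in for the unwanted trailing column.
data <- data.frame(
  User_ID   = 1:3,
  Spend     = c(5.5, 2, 1.25),
  Purchases = c(4L, 3L, 1L),
  extra     = NA
)

# match() returns the position of the exactly matching name (or NA if absent).
last <- match("Purchases", names(data))
stopifnot(!is.na(last))  # fail early if the column is missing

# Keep columns 1 through the position of "Purchases".
trimmed <- data[seq_len(last)]
```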
akrun
  • Your answer helped me solve my problem. I will keep that in mind from now on. Thank you for the reminder. –  Feb 28 '16 at 19:30
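# The header needs a name for the sixth field; 'Extra' is a placeholder.
# Without it, read.table() would treat the first column as row names.
lines <- 'User_ID Group Week Spend Purchases Extra
173366631 Test 2014-10-06 546.87 4 <NA>
144427921 Test 2014-10-06 218.09 3 <NA>
213641575 Test 2014-10-06 18.75 1 <NA>
614549153 Control 2014-10-06 29.98 15 <NA>
84652272 Test 2014-10-06 628.16 4 <NA>
75292137 Test 2014-10-06 8.46 1 <NA>'

# Read the literal "<NA>" token as a missing value
df <- read.table(text = lines, header = TRUE, na.strings = "<NA>")

# Keep every column up to and including 'Purchases'
df[, seq_len(which(names(df) == 'Purchases'))]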
tagoma