
I want to read a CSV file and skip the first three lines after the header, while keeping the header names in the data.frame. I tried the following, but the header names are wrong:

> sine = read.csv(file="sine.csv",head=TRUE,sep=",", skip=3, check.names=TRUE)
> colnames(sine)
 [1] "X0"     "X0.0"   "X0.0.1" "X0.0.2" "None"   "X1.0"   "X0.0.3" "None.1" "X.."   
[10] "X0.1"   "X0.2"

When I read the dataset without skipping any lines, the header names are OK:

> sine = read.csv(file="sine.csv",head=TRUE,sep=",", skip=0, check.names=TRUE)
> colnames(sine)
 [1] "reset"                                                                                    
 [2] "angle"                                                                                    
 [3] "sine"                                                                                     
 [4] "multiStepPredictions.actual"                                                              
 [5] "multiStepPredictions.1"                                                                   
 [6] "anomalyScore"                                                                             
 [7] "multiStepBestPredictions.actual"                                                          
 [8] "multiStepBestPredictions.1"                                                               
 [9] "anomalyLabel"                                                                             
[10] "multiStepBestPredictions.multiStep.errorMetric..altMAPE..steps..1..window.1000.field.sine"
[11] "multiStepBestPredictions.multiStep.errorMetric..aae..steps..1..window.1000.field.sine"    

What am I doing wrong?

Wakan Tanka

1 Answer


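Something like this reproduces the problem. With `skip=3`, the header line is skipped too, so the first remaining line is used as the header: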

foo <- read.csv("http://www.ats.ucla.edu/stat/r/faq/test.csv", header=TRUE)
foo
#    make   model mpg weight price
# 1   amc concord  22   2930  4099
# 2   amc   oacer  17   3350  4749
# 3   amc  spirit  22   2640  3799
# 4 buick century  20   3250  4816
# 5 buick electra  15   4080  7827
colnames(foo)
# [1] "make"   "model"  "mpg"    "weight" "price" 

bar <- read.csv("http://www.ats.ucla.edu/stat/r/faq/test.csv", header=TRUE, skip=3)
bar
#     amc  spirit X22 X2640 X3799
# 1 buick century  20  3250  4816
# 2 buick electra  15  4080  7827
colnames(bar)
# [1] "amc"    "spirit" "X22"    "X2640"  "X3799" 

As Richard Scriven pointed out below, my initial answer did not work; I don't know how I missed that. I found this SO answer and based the solution below on it.

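# Read every line of the file as plain text
all_content = readLines("http://www.ats.ucla.edu/stat/r/faq/test.csv")
# Drop lines 2 to 4 (the three lines after the header) but keep line 1
skip_second = all_content[-(2:4)]
foo2 = read.csv(textConnection(skip_second), 
                header = TRUE, stringsAsFactors = FALSE)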
foo2
#    make   model mpg weight price
# 1 buick century  20   3250  4816
# 2 buick electra  15   4080  7827
colnames(foo2)
# [1] "make"   "model"  "mpg"    "weight" "price" 
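
An alternative that avoids holding the whole file in memory as text is to read the header and the data in two separate calls. The following is only a minimal sketch: the `csv_lines` content and the temporary file are made-up stand-ins for the real CSV, and the names `path`, `col_names` and `dat` are chosen just for this illustration.

```r
# Made-up stand-in for the real file: one header line, three lines
# to discard, then the rows we actually want to keep.
csv_lines <- c(
  "make,model,mpg",
  "amc,concord,22",
  "amc,pacer,17",
  "amc,spirit,22",
  "buick,century,20",
  "buick,electra,15"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# First pass: read the header plus a single row, only to get the column names.
col_names <- names(read.csv(path, nrows = 1))

# Second pass: skip the header line and the three unwanted lines (4 in total),
# tell read.csv there is no header left, and reuse the names from the first pass.
dat <- read.csv(path, skip = 4, header = FALSE, col.names = col_names,
                stringsAsFactors = FALSE)

colnames(dat)
```

Note that `skip` counts lines from the top of the file, header included, which is why it is 4 here and not 3.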
Eric Fail