How to skip the second line of a csv file while keeping the first line as column names with read_csv?

Question

Qualtrics generates csv files with variable names in the first line and variable labels in the second line. I'd like to use read_csv() to read in my data while reading the first line as column names and then skipping the next line of variable labels. Below is my failed attempt.

library(readr)
mydata <- read_csv("qualtrics_data.csv", col_names = TRUE, skip = 2) # this would actually skip both the names and label rows.

This helped me solve a similar problem: https://stackoverflow.com/questions/23543825/r-read-table-how-can-i-read-the-header-but-also-skip-lines — GlennFriesen, May 30 '17 at 23:08

austensen · Accepted Answer · 2017-09-26T17:26:30.993

You can just read the file twice: once to get the names, and again to get the data.

library(readr)
library(dplyr)

csv_file <- "mpg,cyl,disp,hp,drat,wt
mpg,cyl,disp,hp,drat,wt
21.0,6,160,110,3.90,2.875
22.8,4,108,93,3.85,2.320
21.4,6,258,110,3.08,3.215
18.7,8,360,175,3.15,3.440
18.1,6,225,105,2.76,3.460"


df_names <- read_csv(csv_file, n_max = 0) %>% names()

df_names
#> [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"

df <- read_csv(csv_file, col_names = df_names, skip = 2)

df

#> # A tibble: 5 x 6
#>     mpg   cyl  disp    hp  drat    wt
#>   <dbl> <int> <int> <int> <dbl> <dbl>
#> 1  21.0     6   160   110  3.90 2.875
#> 2  22.8     4   108    93  3.85 2.320
#> 3  21.4     6   258   110  3.08 3.215
#> 4  18.7     8   360   175  3.15 3.440
#> 5  18.1     6   225   105  2.76 3.460
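The two-step read above can be wrapped in a small helper when the same pattern is needed for several files. This is only a sketch: `read_csv_skip_labels` and its `n_skip` argument are hypothetical names (not part of readr), and the label text in the demo file is made up to stand in for a Qualtrics export.

```r
library(readr)

# Hypothetical helper, not part of readr: take the column names from
# line 1, then re-read the file while skipping the first `n_skip` lines
# (the header line plus the label line by default).
read_csv_skip_labels <- function(file, n_skip = 2, ...) {
  col_names <- names(read_csv(file, n_max = 0))
  read_csv(file, col_names = col_names, skip = n_skip, ...)
}

# Demo file: names on line 1, made-up labels on line 2, data after that.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "mpg,cyl,disp",
  "Miles per gallon,Cylinders,Displacement",
  "21.0,6,160",
  "22.8,4,108"
), tmp)

df <- read_csv_skip_labels(tmp)
df
```

Because the label row is never parsed as data, the columns should come back as numeric rather than character. Extra arguments such as `col_types` are passed through to the second `read_csv()` call only.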

score 0 · Answer 2 · answered May 31 '17 at 01:03

Use read.csv, e.g.:

df <- read.csv("example.csv")
df <- df[-1, ] # -1 drops the first data row, which here is the label row (line 1 of the file became the header)

Jimmy

Since the second row contains labels, doing this will cause all the columns to be parsed as character variables – austensen May 31 '17 at 04:42
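If you go the base R route anyway, one possible workaround for the character-column problem is to re-guess the column types after dropping the label row. A minimal sketch, assuming a small made-up file in place of the real export:

```r
# Small stand-in for a Qualtrics export; the label text is made up.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "mpg,cyl",
  "Miles per gallon,Cylinders",
  "21.0,6",
  "22.8,4"
), tmp)

# The label row forces every column to be read as character.
df <- read.csv(tmp, stringsAsFactors = FALSE)

# Drop the label row; drop = FALSE keeps a data frame even with one column.
df <- df[-1, , drop = FALSE]

# Re-guess each column's type now that only data values remain.
df[] <- lapply(df, type.convert, as.is = TRUE)

# Renumber the rows starting from 1.
rownames(df) <- NULL

str(df)
```

`type.convert()` is base R's own type guesser, so the resulting types may differ from what `read_csv()` would pick for the same values.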
