I'm using:

R version 3.0.0 (2013-04-03) -- "Masked Marvel"
Platform: x86_64-pc-linux-gnu (64-bit)

I'm trying to use read.csv to read a small CSV snippet (header plus data) directly from the terminal.

I'm encountering a problem that may be related to the questions "R skips lines from /dev/stdin" and "read.csv, header on first line, skip second line", but it is different enough (the answers there don't explain what I see here) to warrant a separate question.

R seems to skip the header line and treat the second (data) line as header:

R> d <- read.csv(file='/dev/stdin', header=TRUE) 
a,b
1,2
3,4
# hit CTRL-D twice here to end the input
# (this is also unexpected:
#  when reading a few lines interactively in bash, one CTRL-D suffices.
#  Why is doing it twice necessary in R?)

R> d
  X1 X2
1  3  4

R> colnames(d)
[1] "X1" "X2"

I found a workaround: since read.csv has blank.lines.skip = TRUE by default, I prefix the input with some blank lines. Five empty lines before the input seem to be the minimum required to get this to work as expected. BTW: a single line with 5 spaces works just as well, hinting that at least 5 bytes of whitespace padding are required:

R> d <- read.csv(file='/dev/stdin', header=TRUE)





a,b
1,2
3,4
# Enter CTRL-D twice here to mark the end of terminal input

R> d
  a b
1 1 2
2 3 4

R> colnames(d)
[1] "a" "b"

Questions:

  • Why isn't the 1st example working as expected?
  • Why are 5 blank lines or spaces needed (even 4 aren't enough) to make it work?
  • Is there a better way to read a short CSV snippet directly from the terminal? (I know about scan and readLines, but my data is already in CSV format, so I want to make it as simple to read/parse/assign as possible.)
arielf
  • I think the answer in the first link you posted may actually be applicable. R appears to create a 4 byte buffer on `/dev/stdin`. Also, as mentioned in the comment, you can use `stdin` instead, and it appears to work fine. (Although I still don't get why you have to hit Ctrl+D twice). – nograpes May 11 '13 at 05:28
  • Thanks @nograpes! Can you write a short answer with a working example? I'll gladly accept it. The 1st link shows a 4 KB buffer being "eaten", while in this case only 5 bytes are needed, so these seemed to be 2 different issues. Also: this example is much more minimal and so may be more useful. – arielf May 11 '13 at 16:09

1 Answer

I think the answer in the first link you posted may actually be applicable. R appears to create a 4 byte buffer on /dev/stdin. Also, as mentioned in the comment, you can use stdin instead, and it appears to work fine. (Although I still don't get why you have to hit Ctrl+D twice).

d <- read.csv(file='stdin', header=TRUE)
a,b
1,2
3,4
# Hit Control+D twice.
> d
  a b
1 1 2
2 3 4
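
If the snippet is short enough to paste as a string, one possible way to avoid stdin (and the Ctrl+D question) altogether is the `text` argument of read.csv. This is only a minimal sketch: the variable name `csv_text` is made up for illustration, and I'm assuming the data can be pasted into the R session as a string literal.

```r
# Sketch: parse an inline CSV snippet via the `text` argument,
# so no stdin connection is involved at all.
# `csv_text` is a hypothetical name used only for this example.
csv_text <- "a,b
1,2
3,4"

d <- read.csv(text = csv_text, header = TRUE)

print(d)            # data frame with columns a and b
print(colnames(d))  # the header line is kept as column names
```

This doesn't explain the '/dev/stdin' behavior; it only sidesteps it for data that is already at hand as text.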
nograpes
  • Thanks so much. This workaround works. I just wonder whether 'stdin' behaving differently from '/dev/stdin' is a bug in R's I/O subsystem. I have never seen '/dev/stdin' misbehave like this with other programs. The required double CTRL-D also worries me a bit. – arielf May 14 '13 at 22:20
  • @arielf Yeah, I suspect the same thing... but the R dev folks are pretty touchy when it comes to reporting bugs in `base`. So you'd better be sure you have done the research before reporting this. – nograpes May 15 '13 at 17:03