I'm writing an R package in which the R code talks to a Java application. The Java application outputs a CSV-formatted string, and I want the R code to read that string directly and convert it into a data.frame.

smci
tommy chheng
  • Could you use the rJava package instead? – Joshua Ulrich Oct 14 '10 at 18:27
  • Maybe you could fiddle around with allowEscapes (in read.table). Just make sure the java output uses \n to break lines. – Roman Luštrik Oct 14 '10 at 18:29
  • @Joshua I am using rJava to talk to my Java program. I think it's more efficient to convert my heavyweight Java objects to strings before passing them into R. – tommy chheng Oct 14 '10 at 18:54
  • Tommy, what makes you think that manual serialization is more efficient than what Simon put into rJava? Did you benchmark any of this? – Dirk Eddelbuettel Oct 14 '10 at 19:26
  • Maybe efficient is the wrong word. My input is an array of hashmap-like objects and my output is an R data.frame. I didn't see anything in rJava that lets me represent a Java object as a data.frame, so I format my objects into a string and then convert it into an R data.frame. Any more efficient suggestions for dealing with this would be appreciated. – tommy chheng Oct 15 '10 at 20:48

6 Answers

Editing a 7-year-old answer: by now, this is much simpler thanks to the text= argument, which has been added to read.csv() and the like:

R> data <- read.csv(text="flim,flam
+ 1.2,2.2
+ 77.1,3.14")
R> data
  flim flam
1  1.2 2.20
2 77.1 3.14
R> 

Yes, look at the help for textConnection(). The very powerful notion in R is that essentially all readers (e.g. read.table() and its variants) access connection objects, which may be a file, a remote URL, a pipe coming in from another app, or ... some text, as in your case.

The same trick is used for so-called here documents:

> lines <- "
+ flim,flam
+ 1.2,2.2
+ 77.1,3.14
+ "
> con <- textConnection(lines)
> data <- read.csv(con)
> close(con)
> data
  flim flam
1  1.2 2.20
2 77.1 3.14
> 
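To make the point about connections concrete, here is a minimal sketch that reads the same CSV payload through two different kinds of connection. The temporary file and the variable names are only illustrative assumptions; the calls themselves (textConnection(), file(), read.csv()) are base R.

```r
# The same CSV payload, read through two different kinds of connection.
csv_text <- "flim,flam\n1.2,2.2\n77.1,3.14"

# 1. An in-memory text connection.
con <- textConnection(csv_text)
from_text <- read.csv(con)
close(con)

# 2. A file connection (a temporary file stands in for real input).
tmp <- tempfile(fileext = ".csv")
writeLines(csv_text, tmp)
con <- file(tmp, open = "r")
from_file <- read.csv(con)
close(con)
unlink(tmp)

# The reader does not care where the bytes came from.
all.equal(from_text, from_file)
```

Because read.csv() only sees a connection, swapping the source does not require changing the parsing code.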

Note that this is a simple way to get something working, but it is also costly because all the data has to be parsed again on the R side. There are other ways to get from Java to R, but this should get you going quickly. Efficiency comes next...

fny
Dirk Eddelbuettel
  • More recent R versions have a simpler mechanism, see the answer by @Adam Bradley in this thread : http://stackoverflow.com/a/16349171/17523 – Boris Gorelik Nov 28 '13 at 05:02

Note that in current versions of R you no longer need textConnection(); you can simply pass the string through the text argument:

> states.str='"State","Abbreviation"
+ "Alabama","AL"
+ "Alaska","AK"
+ "Arizona","AZ"
+ "Arkansas","AR"
+ "California","CA"'
> read.csv(text=states.str)
       State Abbreviation
1    Alabama           AL
2     Alaska           AK
3    Arizona           AZ
4   Arkansas           AR
5 California           CA
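For the use case in the question, the text= approach can be wrapped in a small helper. This is only a sketch: csv_string_to_df is a hypothetical name, and the java_output string is a made-up stand-in for whatever the Java side actually returns (for example via rJava), whose real format is not shown in the question.

```r
# Hypothetical helper: turn one CSV-formatted string into a data.frame.
# Extra arguments in ... are forwarded to read.csv().
csv_string_to_df <- function(csv, ...) {
  stopifnot(is.character(csv), length(csv) == 1L)
  # Keep text columns as character rather than factor, regardless of R version.
  read.csv(text = csv, stringsAsFactors = FALSE, ...)
}

# Stand-in for the string returned by the Java application (assumed format).
java_output <- "id,name,score\n1,alice,3.5\n2,bob,4.25"

df <- csv_string_to_df(java_output)
str(df)
```

Setting stringsAsFactors explicitly keeps the result the same on R versions before and after 4.0.0, where the default changed.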
Adam Bradley
  • I know this itself is a little late but - it perhaps might be useful to submit this as an edit to the accepted answer, since it is unlikely the OP will now change the accepted answer, yet this now seems the better answer? – obfuscation Sep 20 '13 at 08:49
  • IMHO, the OP should unaccept the accepted answer, and accept this one... – Mischa Jul 01 '16 at 13:23

Yes. For example:

string <- "this,will,be\na,data,frame"
x <- read.csv(con <- textConnection(string), header=FALSE)
close(con)
#> x
#    V1   V2    V3
#1 this will    be
#2    a data frame
Joshua Ulrich

Suppose you have a file called tommy.csv (yes, imaginative, I know...) that has the contents of

col1 col2 \n 1 1 \n 2 2 \n 3 3

where the lines are separated by the escape sequence "\n".

This file can be read with the help of the allowEscapes argument of read.table.

> read.table("tommy.csv", header = TRUE, allowEscapes = TRUE)

  col1 col2
1 col1 col2
2    1    1
3    2    2
4    3    3

It's not perfect (modify column names...), but it's a start.

Roman Luštrik

Using a tidyverse approach, you can pass the text directly as the file argument:

library(readr)
read_csv(file = "col1, col2\nfoo, 1\nbar, 2")
# A tibble: 2 x 2
 col1   col2
 <chr>  <dbl>
1 foo       1
2 bar       2

This function wraps Dirk's answer into a convenient form. It's brilliant for answering questions on SO, where the asker has just dumped the data onscreen.

text_to_table <- function(text, ...)
{
   dfr <- read.table(tc <- textConnection(text), ...)
   close(tc)
   dfr
}

To use it, first copy the onscreen data and paste into your text editor.

foo bar baz
1 2 a
3 4 b

Now wrap it with text_to_table, quotes and any other arguments for read.table.

text_to_table("foo bar baz
1 2 a
3 4 b", header = TRUE)
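As a possible variant of the wrapper above (a sketch, not part of the original answer), on.exit() can be used so that the connection is closed even when read.table() fails partway through. The name text_to_table_safe is hypothetical.

```r
# Variant of the wrapper that always closes the text connection,
# even if read.table() raises an error.
text_to_table_safe <- function(text, ...) {
  tc <- textConnection(text)
  on.exit(close(tc), add = TRUE)
  read.table(tc, ...)
}

dfr <- text_to_table_safe("foo bar baz
1 2 a
3 4 b", header = TRUE)
dfr
```

The behaviour for well-formed input is the same as text_to_table; the difference only shows up when parsing fails, in which case no open connection is left behind.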
Richie Cotton