8

I have JSON files with data for countries. One of the files contains the following data:

"[{\"count\":1,\"subject\":{\"name\":\"Namibia\",\"alpha2\":\"NA\"}}]"

I use the following code to convert the JSON into a data.frame using the jsonlite package:

df = as.data.frame(fromJSON(jsonfile, flatten=TRUE))

I was expecting a data.frame with numbers and strings:

count subject.name subject.alpha2
1      Namibia             "NA"

Instead, the "NA" alpha2 code is automatically converted into a logical NA, and this is what I get:

str(df)
$ count         : int 1
$ subject.name  : chr "Namibia"
$ subject.alpha2: logi NA

I want alpha2 to be a string, not logical. How do I fix this?

vagabond
Armin
  • Welcome to SO. Good first question. Try adding some more sample data which people can play with. – vagabond May 05 '15 at 00:46
  • Just coerce to `character`. It's probably not necessary to do so because R will do that coercion at the first need. – IRTFM May 05 '15 at 00:54
  • @BondedDust Thanks. Yeah, R does the coercion at first need, but there are some files with just data for Namibia. Is there a way to coerce to `character` when converting the JSON to a `data.frame`? – Armin May 05 '15 at 16:38

2 Answers

1

That particular implementation of fromJSON (there are three different packages that provide a function with that name) has a simplifyVector argument which appears to prevent the coercion:

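> require(jsonlite)
> # Assumption: `test` holds the JSON string from the question
> test <- '[{"count":1,"subject":{"name":"Namibia","alpha2":"NA"}}]'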

> as.data.frame( fromJSON(test, simplifyVector=FALSE ) )
  count subject.name subject.alpha2
1     1      Namibia             NA
> str( as.data.frame( fromJSON(test, simplifyVector=FALSE ) ) )
'data.frame':   1 obs. of  3 variables:
 $ count         : int 1
 $ subject.name  : Factor w/ 1 level "Namibia": 1
 $ subject.alpha2: Factor w/ 1 level "NA": 1
> str( as.data.frame( fromJSON(test, simplifyVector=FALSE ) ,stringsAsFactors=FALSE) )
'data.frame':   1 obs. of  3 variables:
 $ count         : int 1
 $ subject.name  : chr "Namibia"
 $ subject.alpha2: chr "NA"

I tried seeing if that option worked well with the flatten argument, but was disappointed. With simplifyVector=FALSE the result stays a nested list:

> str(  fromJSON(test, simplifyVector=FALSE, flatten=TRUE) )
List of 1
 $ :List of 2
  ..$ count  : int 1
  ..$ subject:List of 2
  .. ..$ name  : chr "Namibia"
  .. ..$ alpha2: chr "NA"
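
For files with more than one record, the same idea can be applied per record and the rows bound together afterwards. This is only a minimal sketch: the second record (Norway) and the variable names `json`, `records`, and `rows` are made up for illustration and are not part of the original data.

```r
library(jsonlite)

# Hypothetical two-record input; the second record is invented for this example.
json <- '[{"count":1,"subject":{"name":"Namibia","alpha2":"NA"}},
          {"count":2,"subject":{"name":"Norway","alpha2":"NO"}}]'

# simplifyVector = FALSE keeps every record as a nested list,
# so the string "NA" is never turned into a logical NA.
records <- fromJSON(json, simplifyVector = FALSE)

# Convert each record to a one-row data.frame; nested names become
# subject.name and subject.alpha2. Then stack the rows.
rows <- lapply(records, as.data.frame, stringsAsFactors = FALSE)
df <- do.call(rbind, rows)

str(df)
```

Going record by record avoids as.data.frame spreading several records across columns of a single row, at the cost of being slower on large files.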
IRTFM
0

The accepted answer did not solve my use case. However, rjson::fromJSON keeps "NA" as a string without any extra arguments and, to my surprise, ran about 10 times faster on my data.
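
A rough sketch of what that looks like with the record from the question, assuming the rjson package is installed. The variable names `json`, `parsed`, and `df` are illustrative, and rjson is called with its namespace prefix so it does not clash with jsonlite's function of the same name.

```r
# The JSON string from the question.
json <- '[{"count":1,"subject":{"name":"Namibia","alpha2":"NA"}}]'

# rjson returns a nested list; string values are left as character.
parsed <- rjson::fromJSON(json)

# Turn the first (and here only) record into a one-row data.frame.
df <- as.data.frame(parsed[[1]], stringsAsFactors = FALSE)

str(df)
```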

Elad663