I have imported SAS data into R, using rio:
library(rio)
r<-import("S:/MyFolder/MyData.sas7bdat", catalog_file = "S:/MyFolder/formatsforr.sas7bcat")
That gives me r, which has a column r$Race (storing the atomic values 1, 2, 3, and 4) with an attribute that seems to store my Race format (0="All", 1="White", ..., 8="Unknown", ...). I want to convert r$Race to a factor, using the attributes, and I want to do this for many columns.
If I had used haven, I could have done this:
library(haven)
h <- read_sas("S:/MyFolder/MyData.sas7bdat", "S:/MyFolder/formatsforr.sas7bcat")
h$Race <- as_factor(h$Race) # as_factor is a haven function that converts the column to a factor, using the format to label the factor values.
But as_factor fails with r (with the object that rio created).
I'm hoping to use only rio, which we like more than haven for other reasons. I am trying to create simple examples for other coders in my health department to use, and I would like to minimize the number of packages they need to learn and load, so we can focus our learning.
In case this helps:
as_factor(r$Race)
returns this:
Error in UseMethod("as_factor") : no applicable method for 'as_factor' applied to an object of class "c('double', 'numeric')"
str(r$Race)
returns:
atomic [1:7776] 1 2 3 4 1 2 3 4 1 2 ...
 - attr(*, "label")= chr "Race (1=W,2=B,3=NatAm/AKNat,4=Asian/PacIs)"
 - attr(*, "format.sas")= chr "POPEST199XRACE"
 - attr(*, "labels")= Named num [1:10] 0 1 2 3 4 0 1 2 3 4
  ..- attr(*, "names")= chr [1:10] "Total" "White" "Black" "American Indian or Alaska Native" ...
str(h$Race)
returns:
Class 'labelled' atomic [1:7776] 1 2 3 4 1 2 3 4 1 2 ...
  ..- attr(*, "label")= chr "Race (1=W,2=B,3=NatAm/AKNat,4=Asian/PacIs)"
  ..- attr(*, "format.sas")= chr "POPEST199XRACE"
  ..- attr(*, "labels")= Named num [1:10] 0 1 2 3 4 0 1 2 3 4
  .. ..- attr(*, "names")= chr [1:10] "Total" "White" "Black" "American Indian or Alaska Native" ...
Right after running the import(...), running:
dput(head(r$Race)) # returns this:
c(1, 2, 3, 4, 1, 2)
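The missing attributes in this output are probably an effect of head() rather than of rio: subsetting a plain numeric vector with `[` drops every attribute except names, whereas a classed vector can keep its attributes through its own `[` method. A minimal sketch, using a hypothetical stand-in vector x (not my real data) with a shortened "labels" attribute:

```r
# Hypothetical stand-in for r$Race: a plain numeric vector with a "labels" attribute
x <- c(1, 2, 3, 4, 1, 2)
attr(x, "labels") <- c(Total = 0, White = 1, Black = 2)

# The full vector carries the attribute
names(attributes(x))

# head() subsets with `[`, which drops non-name attributes on a plain vector
attributes(head(x))

# dput() on the full vector (no head) still shows the labels
dput(x)
```

So dput(r$Race), without head(), should be the call that shows what rio actually stored.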
Right after running the read_sas(...), running:
dput(head(h$Race)) # returns this:
structure( c(1, 2, 3, 4, 1, 2),
labels = structure(c(0, 1, 2, 3, 4, 0, 1, 2, 3, 4),
.Names = c("Total", "White", "Black", "American Indian or Alaska Native",
"Asian or Pacific Islander", "Total", "White", "Black",
"American Indian or Alaska Native", "Asian or Pacific Islander")),
class = "labelled")
I did not find an online .sas7bcat file. I could post SAS code to create it and the data file.
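In the meantime, here is a small stand-in that can be run without any SAS files, together with a sketch of the kind of conversion I am after, using base R only. The Race values and labels are copied from the str() output above; the Sex and Pop columns and the helper name labels_to_factor are invented for illustration, and I am assuming the real rio object stores its labels the same way.

```r
# Stand-in for the imported data frame (Sex and Pop are made up for illustration)
race_labels <- c(Total = 0, White = 1, Black = 2,
                 "American Indian or Alaska Native" = 3,
                 "Asian or Pacific Islander" = 4)
r <- data.frame(Race = c(1, 2, 3, 4, 1, 2),
                Sex  = c(1, 2, 1, 2, 1, 2),
                Pop  = c(10, 20, 30, 40, 50, 60))
attr(r$Race, "labels") <- c(race_labels, race_labels)  # duplicated, as in my data
attr(r$Sex, "labels")  <- c(Male = 1, Female = 2)

# Hypothetical helper: build a factor from the "labels" attribute
labels_to_factor <- function(x) {
  labs <- attr(x, "labels")
  if (is.null(labs)) return(x)       # leave unlabelled columns untouched
  labs <- labs[!duplicated(labs)]    # factor() rejects duplicated levels
  factor(x, levels = unname(labs), labels = names(labs))
}

# Apply to every column; r[] keeps r a data frame
r[] <- lapply(r, labels_to_factor)
str(r)
```

The duplicated() step matters for my data because the format appears twice in the "labels" attribute. Columns without a "labels" attribute, such as Pop here, pass through unchanged.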