Issue with number values importing csv files in R

Question

As usual, I am importing a .csv file from Excel. Since I will be running some econometric regressions, I'm importing not just the values but also some columns with labels.

df <- read.csv("peasantsworkalot.csv", header=TRUE)

where the df looks like the following

country <- c("AT", "AT", "AT", "AT")
code <- c("AT1", "AT1", "AT2", "AT2")
c <- c("Village1", "Village1", "Village2", "Village2")
d <- c("Year1", "Year1", "Year2", "Year2")
e <- c(65322.09, 62322.01, 84561.06, 86000.02)
df <- cbind(country,code,c,d,e)
df

[1,] "AT" "AT1" "Village1" "Year1" "65322.09"
[2,] "AT" "AT1" "Village1" "Year1" "62322.01"
[3,] "AT" "AT2" "Village2" "Year2" "84561.06"
[4,] "AT" "AT2" "Village2" "Year2" "86000.02"

Whenever I try to perform any kind of operation on the values in the e column, I get the following message:

[1] NA
Warning message:
In Ops.factor( ):
  + not meaningful for factors

I suppose that, for some reason, it reads the values as non-numeric. Therefore I tried

as.numeric(df)

or

as.numeric(df[,5])

The first does not work and gives

Error: (list) object cannot be coerced to type 'double'

The second works, but it changes the values. For instance, 65322.09 becomes 259, and I don't know why. This is the first time it has happened, and it does not happen with every .csv file; some work just fine.

I don't mean to be rude, but there are a lot of things you are misunderstanding. In your example, `df` is not a `data.frame` but a character matrix: the `cbind` step coerced your numeric values into character strings. On the imported data, `as.numeric(df)` doesn't work because you can't turn a whole data.frame into a number; it's like saying "Please turn McDonalds into a healthier hamburger". `as.numeric` on a factor (as per your last step) reveals the underlying integer codes of the factor. Run `?factor` for more info on how those work – Señor O, Mar 27 '14 at 17:47
Relevant question: http://stackoverflow.com/questions/3418128/how-to-convert-a-factor-to-an-integer-numeric-without-a-loss-of-information – Señor O, Mar 27 '14 at 17:48
You MUST be rude. As long as you add relevant details, it is just an addition to what I know. Since all of that was useful, I thank you. (By the way, I just imported the .csv; I did not use the `cbind` command. In any case, I did not know about the implications of `cbind`.) – Bob, Mar 28 '14 at 08:08

score 2 · Answer 1 · answered Mar 27 '14 at 17:52


In your `read.csv` call, include `stringsAsFactors=FALSE`, i.e. `read.csv("readThis.csv", stringsAsFactors=FALSE)`. Also read the information in the comments. You should definitely brush up on your knowledge, stat.
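As a minimal sketch of what `stringsAsFactors=FALSE` changes, the example below reads a small inlined CSV instead of a file. The column names and rows are hypothetical, loosely modelled on the question's data. Note that since R 4.0.0 `stringsAsFactors` already defaults to `FALSE`, so the explicit argument mainly matters on older versions.

```r
# Hypothetical CSV content, inlined so the example runs without a file on disk.
csv_text <- "country,code,value
AT,AT1,65322.09
AT,AT1,62322.01
AT,AT2,84561.06
AT,AT2,86000.02"

# Label columns stay plain character vectors instead of becoming factors;
# a column that contains only numbers is still read as numeric.
df <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

str(df)           # shows the class of every column
class(df$country) # "character"
class(df$value)   # "numeric"
sum(df$value)     # arithmetic works on the numeric column
```

If a column that should be numeric still comes back as character, at least one entry in it is not a valid number, which is worth checking before converting anything by hand.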


stanekam


With this I get `class mode = character`. However, my mistake was much more basic, since I had some `...` in place of NA. I solved it by adding `na.strings="..."` when using `read.csv`. – Bob Mar 28 '14 at 09:04

score 1 · Answer 2 · answered Mar 28 '14 at 14:59


To convert a column to numeric you can run:

df[,5] <- as.numeric(df[,5])

However, if that column is a factor, it will lead to undesired results (see help("factor")). So if it's a factor column, the most straightforward approach is to convert it to character first, then to numeric:

df[,5] <- as.numeric(as.character(df[,5]))
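To illustrate why the direct conversion changes the values, here is a small sketch using a standalone factor built from the four numbers in the question (the variable name `f` is only for illustration). Calling `as.numeric` on a factor returns the position of each value among the sorted levels, not the number printed on screen.

```r
# A factor holding the question's values as labels.
f <- factor(c("65322.09", "62322.01", "84561.06", "86000.02"))

levels(f)                    # the labels, sorted as strings
as.numeric(f)                # 2 1 3 4: integer codes, not the values

# Going through character recovers the original numbers.
as.numeric(as.character(f))

# Equivalent alternative: convert the levels once, then index by the codes.
as.numeric(levels(f))[f]
```

In a large imported column, the same mechanism is what turns a value such as 65322.09 into a small integer like 259.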


Señor O


Thank you. However, in my case it was a simple problem of importing a .csv with missing values (NA). My fault, of course – Bob Mar 31 '14 at 08:11

score 0 · Accepted Answer · answered Mar 31 '14 at 08:10


If the .csv file marks missing values with something other than NA, for instance `...`, the `read.csv` call must include the `na.strings` argument: `read.csv("readThis.csv", na.strings="...")`. This preserves the numeric values in the .csv file. Otherwise, the whole column is read as non-numeric (character or factor).
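The effect can be reproduced with a small inlined CSV, sketched below. The data and column names are hypothetical, and the argument is spelled `na.strings`, which is its documented name in `read.table`.

```r
# Hypothetical CSV in which a missing value is written as "...".
csv_text <- "country,code,value
AT,AT1,65322.09
AT,AT1,...
AT,AT2,84561.06"

# Without na.strings, the "..." entry forces the whole column to be non-numeric.
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(raw$value)    # "character"

# Declaring "..." as the missing-value marker keeps the column numeric.
fixed <- read.csv(text = csv_text, na.strings = "...")
class(fixed$value)  # "numeric"
is.na(fixed$value)  # FALSE TRUE FALSE

# Missing values can then be skipped explicitly in calculations.
mean(fixed$value, na.rm = TRUE)
```

Several markers can be passed at once, e.g. `na.strings = c("NA", "...")`, if the file mixes conventions.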


Bob

