I have a vector like this

 [1] "72.82947"  NA          NA          NA          NA          NA          "66.00949"  NA         
  [9] NA          "0.133434"  NA          NA          "2.265083"  NA          NA          NA         
 [17] " 0"        NA          NA          NA          NA          NA          NA          NA         
 [25] "0.311346"  NA          NA          " 0"        NA          NA          NA          NA         
 [33] NA          NA          NA          NA          NA          "0.7024582" NA          NA         
 [41] NA          NA          NA          NA          NA          NA          "3.543211"  NA         
 [49] NA          "5.779669"  NA          "4.617021"  NA          "1.682751"  NA          NA         
 [57] NA          NA          NA          "0.214977"  NA          NA          NA          "1.573152" 

Following several previous questions ("How to remove all the NA from a Vector?", "R script - removing NA values from a vector", "R: removing NAs in numerical vectors") and the manuals, I used

vector.test[!is.na(exo.1.4.mad)]

and

vector.test[na.omit(exo.1.4.mad)]

But neither of them works: I always get back the same vector, NAs included. I then subset the vector manually, keeping only the positions that hold values, and tried to convert the result to numeric values:

as.numeric(as.character(exo.1.4.mad.values))

But this does not work either: NAs are introduced by coercion. At this point I think I'm missing something about the format or class of my original vector.

Any suggestions?


Here is some more information about my object:

typeof(exo.1.4.mad)
[1] "integer"

dput(exo.1.4.mad) structure(c(33L, 37L, 37L, 37L, 37L, 37L, 31L, 37L, 37L, 4L, 37L, 37L, 20L, 37L, 37L, 37L, 1L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 8L, 37L, 37L, 1L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 11L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 24L, 37L, 37L, 29L, 37L, 26L, 37L, 19L, 37L, 37L, 37L, 37L, 37L, 6L, 37L, 37L, 37L, 18L, 37L, 2L, 37L, 1L, 37L, 14L, 37L, 25L, 37L, 27L, 37L, 10L, 37L, 3L, 37L, 37L, 35L, 37L, 37L, 28L, 37L, 37L, 37L, 32L, 37L, 12L, 37L, 30L, 37L, 37L, 37L, 37L, 37L, 36L, 37L, 37L, 7L, 37L, 13L, 37L, 37L, 37L, 37L, 9L, 37L, 37L, 37L, 21L, 37L, 37L, 37L, 37L, 37L, 37L, 15L, 37L, 37L, 37L, 34L, 37L, 23L, 37L, 37L, 37L, 37L, 37L, 22L, 37L, 37L, 37L, 16L, 37L, 37L, 17L, 37L, 5L, 37L), .Label = c("\" 0\"", "\"0.044478\"", "\"0.1103672\"", "\"0.133434\"", "\"0.1893487\"", "\"0.214977\"", "\"0.2506812\"", "\"0.311346\"", "\"0.3219932\"", "\"0.409485\"", "\"0.7024582\"", "\"0.7029872\"", "\"0.7983231\"", "\"1.104537\"", "\"1.170474\"", "\"1.2355\"", "\"1.255681\"", "\"1.573152\"", "\"1.682751\"", "\"2.265083\"", "\"2.491765\"", "\"2.566038\"", "\"2.731105\"", "\"3.543211\"", "\"4.42271\"", "\"4.617021\"", "\"5.235322\"", "\"5.340412\"", "\"5.779669\"", "\"5.847934\"", "\"66.00949\"", "\"67.9525\"", "\"72.82947\"", "\"75.2123\"", "\"8.347973\"", "\"9.832462\"", "NA"), class = "factor")

This confuses me even more!

efrem
  • http://stackoverflow.com/questions/4862178/remove-rows-with-nas-in-data-frame This link might help in removing NAs. after removing NAs, `as.numeric` should help. – Manoj G Jul 02 '14 at 09:34
  • I've tried your code on your vector and it works fine. Are you sure you have a vector there? Can you please provide `str(vector.test)` ? – David Arenburg Jul 02 '14 at 09:42
  • `na.omit` might not work here. but your `is.na` command should have worked. You could try this as well... `as.numeric(vect.test[complete.cases(vect.test)])` – Manoj G Jul 02 '14 at 09:44
  • @DavidArenburg, you got it right! `> str(exo.1.4.mad) Factor w/ 37 levels "\" 0\"","\"0.044478\"",..: 33 37 37 37 37 37 31 37 37 4 ...` – efrem Jul 02 '14 at 09:46
  • @ManojG, thank you this is the result of your code: `[1] 33 37 37 37 37 37 31 37 37 4 37 37 20 37 37 37 1 37 37 37 37 37 37 37 8 37 37 1 37 37 37 37 37 [34] 37 37 37 37 11 37 37 37 37 37 37 37 37 24 37 37 29 37 26 37 19 37 37 37 37 37 6 37 37 37 18 37 2 [67] 37 1 37 14 37 25 37 27 37 10 37 3 37 37 35 37 37 28 37 37 37 32 37 12 37 30 37 37 37 37 37 36 37` – efrem Jul 02 '14 at 09:48
  • Can you provide `dput(exo.1.4.mad)` ? Put it into the question itself – David Arenburg Jul 02 '14 at 09:49

3 Answers

2

Try:

exo1 <- as.numeric(gsub("[^.0-9]+","",exo.1.4.mad))
exo1[!is.na(exo1)]
 #[1] 72.8294700 66.0094900  0.1334340  2.2650830  0.0000000  0.3113460
 #[7]  0.0000000  0.7024582  3.5432110  5.7796690  4.6170210  1.6827510
 #[13]  0.2149770  1.5731520  0.0444780  0.0000000  1.1045370  4.4227100
 #[19]  5.2353220  0.4094850  0.1103672  8.3479730  5.3404120 67.9525000
 #[25]  0.7029872  5.8479340  9.8324620  0.2506812  0.7983231  0.3219932
 #[31]  2.4917650  1.1704740 75.2123000  2.7311050  2.5660380  1.2355000
 #[37]  1.2556810  0.1893487

Explanation

 [^.0-9]+ ## matches one or more characters that are neither a digit nor a dot; gsub replaces each match with "" (i.e. removes it).
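
As a minimal sketch of the approach above, the snippet below uses a small made-up factor (not the asker's full data) whose labels contain literal quote characters and the string "NA", as in the `dput` output in the question. It also shows one caveat of this pattern, which is an observation about the regex rather than something stated in the answer: minus signs and exponent markers are stripped too.

```r
# Hypothetical miniature of the question's factor: labels carry literal
# double quotes, and missing entries are the string "NA".
x <- factor(c("\"72.82947\"", "NA", "\" 0\"", "\"0.133434\"", "NA"))

# Remove every character that is not a digit or a dot, then convert.
# The label "NA" becomes "", which as.numeric() turns into a real NA.
cleaned <- gsub("[^.0-9]+", "", as.character(x))
values <- as.numeric(cleaned)
result <- values[!is.na(values)]
print(result)

# Caveat: the same pattern also drops "-" and "e", so signed values or
# scientific notation would be silently altered.
mangled <- gsub("[^.0-9]+", "", "-1.5e-3")
print(mangled)
```

For data like the question's (non-negative, plain decimals) the caveat does not apply, but it may matter if the same cleanup is reused elsewhere.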
akrun
  • Thanks! I combined your answer with the one from @David Arenburg, and it did the job! Can you help me understand it? – efrem Jul 02 '14 at 10:22
  • `efrem` just edited the code. Also, please check the regex tutorials. – akrun Jul 02 '14 at 10:26
1

Here is something that works for me :

> myVec <- c(NA, "1", "2", NA)
> myVec
[1] NA  "1" "2" NA 
> as.numeric(myVec[!is.na(myVec)])
[1] 1 2

Does this help you ?
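
The example above works because `myVec` is a character vector. A short sketch with illustrative data (not the asker's vector) of what changes when the same values are stored as a factor, which is what the asker's `str()` output reports:

```r
# The same values stored two ways (made-up data for illustration).
chr <- c(NA, "10.5", "2", NA)
fac <- factor(chr)

# Character vector: the strings themselves are parsed.
from_chr <- as.numeric(chr[!is.na(chr)])

# Factor: as.numeric() returns the internal integer level codes,
# not the numbers written in the labels.
from_codes <- as.numeric(fac[!is.na(fac)])

# Going through as.character() first recovers the labelled values.
from_labels <- as.numeric(as.character(fac[!is.na(fac)]))

print(from_chr)
print(from_codes)
print(from_labels)
```

This would explain why the asker sees values such as 33 and 37 rather than 72.82947: those are level codes.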

Julien D.
  • Why not just `myVec[!is.na(myVec)]`? He only asked to remove `NA`s, not to transform them into `numeric` class. He also stated that this code didn't work for him – David Arenburg Jul 02 '14 at 09:36
  • Actually he does (in the title) and I thought that he didn't run the whole command. My bad if the answer was not helpful. – Julien D. Jul 02 '14 at 09:43
  • If the original vector is a `factor`, then I guess you would need to do something like `as.numeric(as.character(myVec[!is.na(myVec)]))`. In your example, `myVec` is a character vector. – talat Jul 02 '14 at 09:53
  • @JulienD. Thanks for your answer. As I wrote I followed a similar procedure. Here the result of your suggestion: `as.numeric(exo.1.4.mad[!is.na(exo.1.4.mad)]) [1] 33 37 37 37 37 37 31 37 37 4 37 37 20 37 37 37 1 37 37 37 37 37 37 37 8 37 37 1 37 37 37 37 37 [34] 37 37 37 37 11 37 37 37 37 37 37 37 37 24 37 37 29 37 26 37 19 37 37 37 37 37 6 37 37 37 18 37 2 [67] 37 1 37 14 37 25 37 27 37 10 37 3 37 37 35 37 37 28 37 37 37 32 37 12 37 30 37 37 37 37 37 36 37 [100] 37 7 37 13 37 37 37 37 9 37 37 37 21 37 37 37 37 37 37 15 37 37 37 34 37 23 37 37 37 37 37 22 37` – efrem Jul 02 '14 at 09:55
1

The problem with your data is that your "NA"s are not really NAs as R defines them, but just character strings (here, a factor level labelled "NA"). Thus is.na won't work here. Simply do

exo.1.4.mad[exo.1.4.mad != "NA"]
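
A small sketch of this filter on made-up data shaped like the question's factor. The `droplevels()` call and the quote-stripping step are additions for illustration, assuming the goal is a numeric vector; they are not part of the one-liner above.

```r
# Hypothetical miniature of the question's factor.
x <- factor(c("\"72.82947\"", "NA", "\" 0\"", "NA"))

# "NA" is an ordinary level here, so R sees no missing values at all.
has_missing <- anyNA(x)

# Keep the real entries and drop the now-unused "NA" level.
kept <- droplevels(x[x != "NA"])

# The labels still contain literal quote characters; remove them,
# then convert the labels (not the level codes) to numbers.
nums <- as.numeric(gsub("\"", "", as.character(kept), fixed = TRUE))

print(has_missing)
print(levels(kept))
print(nums)
```

The result of the filter alone is still a factor, which matches the `class()` output reported in the comments.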
David Arenburg
  • You are right! That means that also the values are considered characters. Why are they converted in NA when I apply `as.numeric`?`exo.1.4.mad[exo.1.4.mad != "NA"]->exo.1.4.mad.val` `class(exo.1.4.mad.values) [1] "factor"` – efrem Jul 02 '14 at 10:10
  • Yes, actually your object is quite a complicated case... I can't seem to convert it to numeric either. It contains backslashes, but I can't seem to remove them with `gsub` for some reason – David Arenburg Jul 02 '14 at 10:13