
I am trying to create some descriptive statistics and histograms out of ordered variables (range 0 to 10). I used the following commands:


But R starts from 1 and counts the "refusal" values as a further numeric value.

How can I let R start from 0 and ignore the "refusal" values?

Edit: I was able to let R ignore "refusal" value using the following command:

is.na(data$var1[data$var1 == "Refusal"]) <- TRUE

But when I search for a solution to the problem with the 0 values, I only find suggestions on how to ignore or remove 0 values.

Edit2: This is a sample of my data,

 [1] 5       8       8       8       Refusal 10      8       Refusal 7
[10] 7       8       7       8       8       8       8       8       8
[19] 8       0       9       Refusal 6       10      7       7       9

As you can see, the range is from 0 to 10, but when I use the "describe" command from the R library "psych", the reported range is always 1 to 11, which invalidates all the statistics.

> class(data$var1)
[1] "factor"
> describe(as.numeric(data$var1), na.rm=TRUE)
  vars    n mean   sd median trimmed  mad min max range  skew kurtosis   se
1    1 1115 8.38 1.94      9    8.57 1.48   1  11    10 -1.06     1.42 0.06

Have a look at how factors work, with ?factor, or at the example question here. In essence, each level is assigned an integer code starting at 1, so the codes end at 11 if you have 11 unique values. Converting a factor to numeric returns these codes rather than the numbers that the level labels represent. To get the underlying numbers, first convert to character, then to numeric. See the difference between these code snippets:

# create some dummy data
a <- factor(sample(c(0:10, "refusal"), 50, replace = TRUE))
class(a)
# [1] "factor"

snippet 1 - how you're doing it

describe(as.numeric(a), na.rm = TRUE)
#  n missing  unique    Mean     .05     .10     .25     .50     .75     .90     .95
# 50       0      11    6.28    2.00    2.00    4.00    6.00    8.75   10.00   11.00
#
#           1  2  3 4 5  6  7  8 9 10 11
# Frequency 2  5  5 4 2  8  6  5 3  6  4
# %         4 10 10 8 4 16 12 10 6 12  8

snippet 2 - correct way

describe(as.numeric(as.character(a)), na.rm = TRUE)
#  n missing  unique    Mean     .05     .10     .25     .50     .75     .90     .95
# 46       4      10   5.304     1.0     1.0     3.0     5.0     8.0     9.5    10.0
#
#           0  1 2 3  4  5  7 8  9 10
# Frequency 2  5 4 2  8  6  5 3  6  5
# %         4 11 9 4 17 13 11 7 13 11
# Warning message:
#   In describe(as.numeric(as.character(a)), na.rm = TRUE) :
#   NAs introduced by coercion

Note the difference in range (even if my describe function isn't the same as yours). The warning refers to the "refusal" values, which are converted to NA because they don't represent a number.
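
If it helps, here is a minimal base-R sketch of the whole workflow, including a plot that starts at 0. The vector `x` is made-up toy data (not your real `data$var1`), and I am assuming the refusals are coded exactly as "Refusal", as in your sample. Marking them as NA before converting avoids the coercion warning, and fixing the levels at 0:10 keeps empty categories on the axis.

```r
# Hypothetical toy data mirroring the question: scores 0-10 plus "Refusal"
x <- factor(c("5", "8", "Refusal", "10", "0", "7", "Refusal", "8"))

levels(x)      # the labels, sorted as text
as.integer(x)  # the internal codes (positions in levels(x)), not the scores

# Mark refusals as missing first, so the numeric conversion is silent
labels <- as.character(x)
labels[labels == "Refusal"] <- NA
scores <- as.numeric(labels)

range(scores, na.rm = TRUE)  # now runs from 0 to 10
mean(scores, na.rm = TRUE)

# Fix the categories at 0:10 so the plot starts at 0 and keeps empty bars
counts <- table(factor(scores, levels = 0:10))
barplot(counts, xlab = "Score", ylab = "Frequency")
```

For ordered 0 to 10 scores I find a bar plot of the counts easier to read than hist(), because hist() picks its own bin breaks unless you set them yourself.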

