Questions tagged [r-factor]

The factor is a data type in the R language, used to encode categorical or enumerated data.

This data type is often used in statistical models.

A factor is stored as a vector of integer codes, along with a lookup table of factor levels. The levels are represented as a vector of character strings. This representation allows easy conversion to character and efficient use in statistical computations.
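
As a rough illustration of this representation, the sketch below builds a small factor and inspects its two parts. The variable name `sizes` and its values are made up for the example; only base R functions (`factor`, `levels`, `as.integer`, `as.character`) are used.

```r
# Hypothetical example data: a character vector turned into a factor.
sizes <- factor(c("small", "large", "medium", "small"))

# The lookup table: a character vector of the distinct levels
# (sorted alphabetically by default).
lv <- levels(sizes)

# The integer codes: one per element, each an index into the levels.
codes <- as.integer(sizes)

# Converting back to character is just a lookup of each code in the levels.
back <- as.character(sizes)

print(lv)
print(codes)
print(identical(lv[codes], back))
```

Here `lv[codes]` reproduces the original strings, which is why conversion to character is cheap. Modelling functions can work with the integer codes directly.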

437 questions
570
votes
15 answers

Drop unused factor levels in a subsetted data frame

I have a data frame containing a factor. When I create a subset of this dataframe using subset or another indexing function, a new data frame is created. However, the factor variable retains all of its original levels, even when/if they do not…
medriscoll
  • 24,637
  • 16
  • 36
  • 36
112
votes
2 answers

Confusion between factor levels and factor labels

There seems to be a difference between levels and labels of a factor in R. Up to now, I always thought that levels were the 'real' name of factor levels, and labels were the names used for output (such as tables and plots). Obviously, this is not…
donodarazao
  • 2,563
  • 3
  • 22
  • 26
96
votes
7 answers

Factors in R: more than an annoyance?

One of the basic data types in R is factors. In my experience factors are basically a pain and I never use them. I always convert to characters. I feel oddly like I'm missing something. Are there some important examples of functions that use…
JD Long
  • 55,115
  • 51
  • 188
  • 278
72
votes
8 answers

Imported a csv-dataset to R but the values becomes factors

I am very new to R and I am having trouble accessing a dataset I've imported. I'm using RStudio and used the Import Dataset function when importing my csv-file and pasted the line from the console-window to the source-window. The code looks as…
Joe
  • 943
  • 2
  • 7
  • 6
72
votes
10 answers

Coerce multiple columns to factors at once

I have a sample data frame like below: data <- data.frame(matrix(sample(1:40), 4, 10, dimnames = list(1:4, LETTERS[1:10]))) I want to know how can I select multiple columns and convert them together to factors. I usually do it in the way like…
wsda
  • 1,035
  • 1
  • 11
  • 16
71
votes
1 answer

Why use as.factor() instead of just factor()

I recently saw Matt Dowle write some code with as.factor(), specifically for (col in names_factors) set(dt, j=col, value=as.factor(dt[[col]])) in a comment to this answer. I used this snippet, but I needed to explicitly set the factor levels to…
Ben
  • 15,465
  • 26
  • 90
  • 157
66
votes
7 answers

Unseen factor levels when appending new records with unseen string values to a dataframe, cause Warning and result in NA

I have a dataframe (14.5K rows by 15 columns) containing billing data from 2001 to 2007. I append new 2008 data to it with: alltime <- rbind(alltime,all2008) Unfortunately that generates a warning: > Warning message: In `[<-.factor`(`*tmp*`, ri,…
Farrel
  • 9,584
  • 19
  • 57
  • 95
63
votes
3 answers

Plotting with ggplot2: "Error: Discrete value supplied to continuous scale" on categorical y-axis

The plotting code below gives Error: Discrete value supplied to continuous scale What's wrong with this code? It works fine until I try to change the scale so the error is there... I tried to figure out solutions from similar problem but…
Rechlay
  • 1,551
  • 1
  • 11
  • 17
61
votes
10 answers

Cleaning up factor levels (collapsing multiple levels/labels)

What is the most effective (ie efficient / appropriate) way to clean up a factor containing multiple levels that need to be collapsed? That is, how to combine two or more factor levels into one. Here's an example where the two levels "Yes" and "Y"…
Ricardo Saporta
  • 51,025
  • 13
  • 129
  • 166
46
votes
8 answers

How to concatenate factors, without them being converted to integer level?

I was surprised to see that R will coerce factors into a number when concatenating vectors. This happens even when the levels are the same. For example: > facs <- as.factor(c("i", "want", "to", "be", "a", "factor", "not", "an", "integer")) >…
Keith
  • 2,686
  • 4
  • 25
  • 38
41
votes
2 answers

How do I convert certain columns of a data frame to become factors?

Possible Duplicate: identifying or coding unique factors using R I'm having some trouble with R. I have a data set similar to the following, but much longer. A B Pulse 1 2 23 2 2 24 2 2 12 2 3 25 1 1 65 1 3 45 Basically, the first 2 columns are…
math11
  • 507
  • 2
  • 6
  • 8
40
votes
4 answers

Concatenate rows of a data frame

I would like to take a data frame with characters and numbers, and concatenate all of the elements of the each row into a single string, which would be stored as a single element in a vector. As an example, I make a data frame of letters and…
Sam
  • 665
  • 2
  • 6
  • 9
39
votes
6 answers

Colouring plot by factor in R

I am making a scatter plot of two variables and would like to colour the points by a factor variable. Here is some reproducible code: data <- iris plot(data$Sepal.Length, data$Sepal.Width, col=data$Species) This is all well and good but how do I…
LoveMeow
  • 991
  • 1
  • 11
  • 24
37
votes
3 answers

Sort a factor based on value in one or more other columns

I've looked through a number of posts about ordering factors, but haven't quite found a match for my problem. Unfortunately, my knowledge of R is still pretty rudimentary. I have a subset of an archaeological artifact catalog that I'm working with.…
Scard
  • 733
  • 2
  • 7
  • 14
30
votes
1 answer

Converting a factor to numeric without losing information R (as.numeric() doesn't seem to work)

Possible Duplicate: R - How to convert a factor to an integer\numeric in R without a loss of information The following fact about the as.numeric() function has been brought to my attention > blah<-c("4","8","10","15") > blah [1] "4" "8" "10"…
Michael
  • 1,326
  • 4
  • 18
  • 38