I have a dataset which contains a participant number, a response variable, and various other unimportant variables.

I want to calculate the proportion of each answer choice on the response variable for each participant individually and save these proportions to a new dataframe. The new dataframe should contain ppnum and the proportion of each answer choice.

Made Up Test Data:

Response <- c("Disgust", "Sadness", "Disgust", "Anger", "Anger", "Neutral", "Anger", "Disgust", "Happiness") #create example data
ResponseNum <- c(1,2,1,3,3,4,3,1,5) #Response, but expressed in Numbers
ppnum <- c(1,1,1,2,2,2,3,3,3)
df2a_anger <- as.data.frame(cbind(Response, ResponseNum, ppnum)) #create dataframe
df2a_anger$ResponseNum <- as.numeric(as.character(df2a_anger$ResponseNum)) # make numeric

Now, I know how to calculate the total proportion across participants:

df3 <- as.data.frame(prop.table(table(df2a_anger$ResponseNum)))


But I can't get the same proportions split per participant with ppnum kept as a column. Does anyone have any ideas? My latest attempt was:

df3 <- group_by(df2a_anger, df2a_anger$ppnum) %>% prop.table(table(df2a_anger$ResponseNum))
Max

1 Answer

You can count the responses per participant and then calculate the proportions within each ppnum:

library(dplyr)

df2a_anger %>%
    count(ppnum, ResponseNum) %>%
    group_by(ppnum) %>%
    mutate(n = n/sum(n))

#  ppnum ResponseNum     n
#  <chr>       <dbl> <dbl>
#1 1               1 0.667
#2 1               2 0.333
#3 2               3 0.667
#4 2               4 0.333
#5 3               1 0.333
#6 3               3 0.333
#7 3               5 0.333
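
For comparison, roughly the same result can be sketched in base R by extending the prop.table() call from the question to a two-way table and normalising within rows. This is only an illustrative alternative, not part of the approach above: the names tab, props and prop_df are made up here, and the example data is rebuilt with data.frame() rather than cbind() so that the numeric columns keep their type.

```r
# Rebuild the example data from the question; data.frame() keeps column types
df2a_anger <- data.frame(
  Response    = c("Disgust", "Sadness", "Disgust", "Anger", "Anger",
                  "Neutral", "Anger", "Disgust", "Happiness"),
  ResponseNum = c(1, 2, 1, 3, 3, 4, 3, 1, 5),
  ppnum       = c(1, 1, 1, 2, 2, 2, 3, 3, 3)
)

# Cross-tabulate participant (rows) by response (columns)
tab <- table(ppnum = df2a_anger$ppnum, ResponseNum = df2a_anger$ResponseNum)

# margin = 1 divides each row by its own total, i.e. proportions per participant
props <- prop.table(tab, margin = 1)

# Convert to a long data frame and drop the combinations that never occurred
prop_df <- as.data.frame(props, responseName = "prop")
prop_df <- prop_df[prop_df$prop > 0, ]
```

Note that as.data.frame() on a table returns ppnum and ResponseNum as factors, so they may need converting back to numbers before further calculations. The intermediate props object is also the wide layout (one row per participant, one column per answer choice), if that shape is more convenient.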
Ronak Shah