
I'm somewhat of a beginner, working on a project that surpasses my expertise.
I am working with data that includes test scores (Scores) for 100 students in each high school grade level (Grade) for the past 10 years (Year).

I want to compute the mean score for each grade in each year. Example: 91.2 for Grade 9 in 2018, 89.3 for Grade 10 in 2018, 78.8 for Grade 9 in 2017, etc. Longer term (I'll worry about this later), I would like to plot the mean scores by year for each grade using facets, and also to put them all together on one chart.

I typically use dplyr for most of my group-by and select operations. I just can't figure out how to select multiple variables (year and grade) and then pipe in the mean function for just those variables. I am starting to suspect that this will be a multi-step process, which exhausts my ability.

Thanks in advance

[Image: screenshot of example data]

Phil
  • Welcome to StackOverflow. Aggregations in R have been asked and answered many times on SO. Please read: [How much research effort is expected of Stack Overflow users?](https://meta.stackoverflow.com/q/261592/1422451) – Parfait Mar 05 '20 at 23:59

1 Answer


It would be something like this:

library(tidyverse)

# Creating random example data (one score per Year/Grade combination);
# you don't need this step with your real data
mydf <- expand_grid(Year = 2017:2019, Grade = 9:12) %>% 
  mutate(Score = rnorm(n(), mean = 80, sd = 10))

mydf_summ <- mydf %>%
  group_by(Year, Grade) %>%
  summarize(Score = mean(Score)) %>%
  ungroup()

ggplot(mydf_summ, aes(x = Year, y = Score, group = Grade)) +
  geom_point() +
  geom_line() + 
  facet_wrap(vars(Grade)) +
  scale_x_continuous(breaks = 2017:2019)

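To put all the grades together on one chart, one option is to map Grade to colour instead of faceting. The sketch below is only an illustration under some assumptions: `students` is a hypothetical, simulated stand-in for your real data (100 scores per Year/Grade), and the column is assumed to be called `Score`, so adjust the names to match your data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(42)  # only so the simulated scores are reproducible

# Hypothetical student-level data: 100 simulated scores per Year/Grade,
# standing in for the real data set described in the question
students <- expand_grid(Year = 2017:2019, Grade = 9:12, Student = 1:100) %>%
  mutate(Score = rnorm(n(), mean = 80, sd = 10))

# One mean per Year/Grade combination (12 rows for this simulated data)
students_summ <- students %>%
  group_by(Year, Grade) %>%
  summarize(Score = mean(Score)) %>%
  ungroup()

# All grades on a single chart: colour by Grade instead of faceting.
# factor() makes Grade discrete so each grade gets its own colour.
p_all <- ggplot(students_summ,
                aes(x = Year, y = Score,
                    colour = factor(Grade), group = Grade)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = 2017:2019) +
  labs(colour = "Grade")

p_all
```

With the real data you would skip the simulation step and start from the `group_by()` call. The `group_by(Year, Grade)` plus `summarize()` pair does the whole aggregation in a single pipeline, so no extra steps are needed.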

Phil