My data frame has 65 columns. I want to sort it by team_id and then, within each team, count the number of consecutive wins or losses from the team_outcome column.

For example, a team with 3 wins in a row would show 1, 2, 3 down the column. If the team then went on a 3-game losing streak, the next 3 rows would be -1, -2, -3, and so on. How can I do this? This is my data:

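team_id <- c("Minnesota", "Dallas", "Minnesota", "Chicago", "Brooklyn", "Cleveland", "Washington", "Minnesota", "Dallas")
team_outcome <- c("win", "loss", "loss", "win", "win", "loss", "win", "loss", "win")
df <- data.frame(team_id, team_outcome)  # assumed: the two columns of interest in one data frame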
  • 5
    Please provide [reproducible-example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), and expected output. What have you tried so far? – zx8754 May 15 '18 at 15:15
  • 3
    This sounds like two questions, (1) sorting and (2) consecutive win loss. For (1), I'd direct you to the R-FAQ on [How to sort a data frame?](https://stackoverflow.com/q/1296646/903061) For (2), as commented above, a reproducible example will make this much more answerable. – Gregor Thomas May 15 '18 at 15:21

1 Answer

First, create a function that takes a logical vector of outcomes in game order (TRUE for a win) and returns 1, 2, 3, ... for a winning streak or -1, -2, -3, ... for a losing streak:

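calc_streak <- function(x) {
    # x: logical vector in game order, TRUE = win, FALSE = loss
    if (length(x) == 0) return(integer(0))   # guard: 1:length(x) would give c(1, 0) here
    if (all(x)) return(seq_along(x))         # a single winning streak: 1, 2, 3, ...
    if (all(!x)) return(-seq_along(x))       # a single losing streak: -1, -2, -3, ...
    breaks <- which(x[-1] != x[-length(x)]) + 1  # positions where the outcome changes
    streaks <- split(x, findInterval(seq_along(x), breaks))  # split outcomes into streaks
    unlist(lapply(streaks, calc_streak), use.names = FALSE)  # recurse on each streak
}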
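
As a cross-check on what the function above should return, here is a minimal sketch of the same idea written with base R's rle() and sequence() instead of recursion. The name streak_rle is hypothetical and not part of the original answer; like calc_streak, it assumes a logical vector with no NA values.

```r
# Hypothetical alternative (not from the original answer): run-length based streaks
streak_rle <- function(x) {
    runs <- rle(x)                          # lengths and values of consecutive runs
    counts <- sequence(runs$lengths)        # 1, 2, 3, ... restarting within each run
    signs <- rep(ifelse(runs$values, 1L, -1L), runs$lengths)  # +1 for wins, -1 for losses
    counts * signs
}

outcome <- c("win", "win", "win", "loss", "loss", "loss", "win")
streak_rle(outcome == "win")
# 1 2 3 -1 -2 -3 1
```

Both versions restart the count whenever the outcome flips, so a three-game winning streak followed by three losses gives 1, 2, 3, -1, -2, -3.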

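Finally, apply the function to each team separately. The result is grouped by team_id, with the game order within each team preserved:

library(dplyr)  # also provides the %>% pipe, so magrittr does not need to be loaded
# For each team, append a streak column computed from that team's outcomes
df <- group_by(df, team_id) %>%
          do(data.frame(., streak = calc_streak(.$team_outcome == "win")))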
LucyMLi