2

R newb. Small rep of my data.

TeamHome <- c("LAL", "HOU", "SAS", "LAL")
TeamAway <- c("IND", "SAS", "LAL", "HOU")
df <- data.frame(cbind(TeamHome, TeamAway))
df

   TeamHome TeamAway
     LAL      IND
     HOU      SAS
     SAS      LAL
     LAL      HOU

Imagine these are the first four games of a season with thousands of games. For the home team and the visitor team I want to compute the cumulative number of games played at home, on the road and total. So 3 new columns for both the home team and the visiting team. I would like to get something like this (in this case I am only calculating the new variables for the HOME TEAM):

    TeamHome TeamAway HomeTeamGamesPlayedatHome HomeTeamGamesPlayedRoad HomeTeamTotalgames
1      LAL      IND                         1                       0                  1
2      HOU      SAS                         1                       0                  1
3      SAS      LAL                         1                       1                  2
4      LAL      HOU                         2                       1                  3

To compute the first column (HomeTeamGamesPlayedatHome) I managed to do it with:

df$HomeTeamGamesPlayedatHome <- ave(df$TeamHome==df$TeamHome, df$TeamHome, FUN=cumsum)

But it feels over complicated and also I can't calculate the other columns with this approach.

I also thought of using the formula table to count the number of occurrences:

 table(df$TeamHome)

but it just computes the totals and I want the result at any given point in time. thanks!

Sburg13
  • 121
  • 5

2 Answers2

2
library(dplyr)
df <- df %>% group_by(TeamHome) %>% 
  mutate(HomeGames = seq_along(TeamHome))
lst <- list()
for(i in 1:nrow(df)) lst[[i]] <- sum(df$TeamAway[1:i] == df$TeamHome[i])
df$HomeTeamGamesPlayedRoad <- unlist(lst)
df %>% mutate(HomeTeamTotalgames = HomeGames+HomeTeamGamesPlayedRoad)
  TeamHome TeamAway HomeGames HomeTeamGamesPlayedRoad HomeGames
1      LAL      IND         1                       0         1
2      HOU      SAS         1                       0         1
3      SAS      LAL         1                       1         2
4      LAL      HOU         2                       1         3

HomeGames is created with seq_along iterated by row. HomeTeamGamesPlayedRoad is created with a loop checking the values in TeamAway up to and including the current game. The final row is the sum of the other two created.

Pierre L
  • 26,748
  • 5
  • 39
  • 59
  • It worked thanks! I was expecting something less complex for it but it does the work. – Sburg13 Aug 25 '15 at 00:35
  • hi pierre. thanks so much for the help. Imagine I have an extra third column with the PTS scored by the home team and a fourth one with the PTS scored by the visiting team. How could I extend the formula to sum the cumulative points of the home team scored at home and on the road? thanks so much – Sburg13 Aug 25 '15 at 18:29
  • 1
    It's better to ask a follow-up with another question. And add this question as link for reference. – Pierre L Aug 25 '15 at 19:14
  • Thanks Pierre. I just dit that. Please find the link here: http://stackoverflow.com/questions/32212365/cumulative-mean-with-conditionals – Sburg13 Aug 25 '15 at 19:26
  • As you can see I tried to adapt your code for a slightly different calculation but could'nt make it work. I would like to use kind of the sumif function from excel – Sburg13 Aug 25 '15 at 19:28
1

A Loop Solution:

TeamHome <- c("LAL", "HOU", "SAS", "LAL")
TeamAway <- c("IND", "SAS", "LAL", "HOU")
df <- data.frame(TeamHome,TeamAway,HomeTeamGamesPlayedatHome=ave(TeamHome==TeamHome, TeamHome, FUN=cumsum))

for (i in 1:nrow(df)) {
        curdf<-df[1:i,];v<-ave(curdf$TeamAway==as.character(curdf$TeamHome[i]), curdf$TeamAway, FUN=cumsum)
        df$HomeTeamGamesPlayedRoad[i] <- sum(v)
}
df$HomeTeamTotalgames <- df$HomeTeamGamesPlayedatHome + df$HomeTeamGamesPlayedRoad

      TeamHome TeamAway HomeTeamGamesPlayedatHome HomeTeamGamesPlayedRoad HomeTeamTotalgames
1      LAL      IND                         1                       0                  1
2      HOU      SAS                         1                       0                  1
3      SAS      LAL                         1                       1                  2
4      LAL      HOU                         2                       1                  3
user2673238
  • 2,178
  • 16
  • 22