I want to remove rows that repeat the same id and position, keeping only the row with the maximum year for each combination. My data looks like the following:
id name year position
1 Jane 1990 Sales
1 Jane 1991 Sales
1 Jane 1992 Sales
1 Jane 1993 Boss
1 Jane 1994 CEO
2 Tom 1978 HR
2 Tom 1979 Sales
2 Tom 1980 PR
2 Tom 1981 Boss
3 Jim 1981 Sales
3 Jim 1982 Sales
3 Jim 1983 PR
The desired output is:
id name year position
1 Jane 1992 Sales
1 Jane 1993 Boss
1 Jane 1994 CEO
2 Tom 1978 HR
2 Tom 1979 Sales
2 Tom 1980 PR
2 Tom 1981 Boss
3 Jim 1982 Sales
3 Jim 1983 PR
Is there a way to code this? I tried the following, but it did not work:
new<-ddply(df, df$position=="Sales", function(df) return(df[df$year==max(df$year),]))
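
One possible approach, sketched in base R rather than plyr, assuming that a "duplicate" means a repeated id and position pair (which is what the desired output suggests). The names `keep` and `result` are placeholders introduced here for illustration. The idea is to sort by year and then drop every row except the last one in each id/position group:

```r
# Example data copied from the question.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
id name year position
1 Jane 1990 Sales
1 Jane 1991 Sales
1 Jane 1992 Sales
1 Jane 1993 Boss
1 Jane 1994 CEO
2 Tom 1978 HR
2 Tom 1979 Sales
2 Tom 1980 PR
2 Tom 1981 Boss
3 Jim 1981 Sales
3 Jim 1982 Sales
3 Jim 1983 PR
")

# Sort so that, within each id, rows run from earliest to latest year.
df <- df[order(df$id, df$year), ]

# duplicated(..., fromLast = TRUE) flags every row of an id/position group
# except the last one, which after sorting is the row with the maximum year.
keep <- !duplicated(df[, c("id", "position")], fromLast = TRUE)

result <- df[keep, ]
rownames(result) <- NULL  # renumber rows after subsetting
print(result)
```

If staying with plyr is preferred, the grouping variables would likely need to be passed as `.(id, position)` instead of a logical vector, for example `ddply(df, .(id, position), function(d) d[d$year == max(d$year), ])`, though the row order of that result may differ from the desired output.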