
I have a data frame whose rows come in pairs: the two rows of a pair share the same value in one column (ID) but differ in the other columns, one of which holds a value. For each ID, I would like to keep only the row with the maximum value and discard the other.

For data like the following, the row with the minimum value for each ID should be discarded:
ID, Sub-ID, Value
1, 1, 5 => keep
1, 4, 3 => discard
2, 6, 10 => keep
2, 4, 1 => discard
3, 9, 0 => discard
3, 1, 1 => keep
..

Olivier

1 Answer


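One option is to group by 'ID' and use slice() with which.max() to keep the row where 'Value' is the maximum. which.max() returns the index of the first maximum, so exactly one row per 'ID' is kept even if there are ties.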

library(dplyr)
# df1 is the data frame from the question
df1 %>%
  group_by(ID) %>%            # one group per ID
  slice(which.max(Value))     # keep the row with the largest Value
# A tibble: 3 x 3
# Groups:   ID [3]
#     ID Sub_ID Value
#  <int>  <int> <dbl>
#1     1      1     5
#2     2      6    10
#3     3      1     1
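
For a reproducible version, here is a minimal sketch that rebuilds the example data and does the same thing with slice_max(), which is available in dplyr 1.0.0 and later. The data frame below is an assumption reconstructed from the rows in the question, with the column named Sub_ID as in the printed output; with_ties = FALSE is set so that only one row per ID is returned when values tie.

```r
library(dplyr)

# Example data rebuilt from the question (an assumption, not the asker's real data)
df1 <- data.frame(
  ID     = c(1L, 1L, 2L, 2L, 3L, 3L),
  Sub_ID = c(1L, 4L, 6L, 4L, 9L, 1L),
  Value  = c(5, 3, 10, 1, 0, 1)
)

# Keep the single row with the largest Value within each ID
res <- df1 %>%
  group_by(ID) %>%
  slice_max(Value, n = 1, with_ties = FALSE) %>%
  ungroup()

res
```

The ungroup() at the end is optional; it just returns a plain tibble so that later steps are not applied per group by accident.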

If the goal is instead to remove the row with the minimum 'Value' in each group:

df1 %>%
  group_by(ID) %>%
  slice(-which.min(Value))
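
The two approaches give the same result only when every ID has exactly two rows. As a rough illustration, the sketch below uses hypothetical data (df2 is made up for this example) in which ID 1 has three rows: keeping the maximum leaves one row per ID, while dropping the minimum removes just one row per ID.

```r
library(dplyr)

# Hypothetical data: ID 1 has three rows instead of two
df2 <- data.frame(
  ID    = c(1L, 1L, 1L, 2L, 2L),
  Value = c(5, 3, 4, 10, 1)
)

# Keeps exactly one row per ID: the row with the maximum Value
keep_max <- df2 %>%
  group_by(ID) %>%
  slice(which.max(Value)) %>%
  ungroup()

# Drops exactly one row per ID: the first row with the minimum Value,
# so ID 1 still has two rows afterwards
drop_min <- df2 %>%
  group_by(ID) %>%
  slice(-which.min(Value)) %>%
  ungroup()

nrow(keep_max)
nrow(drop_min)
```

Note also that an ID with a single row would disappear entirely with the -which.min() version, since its only row is the minimum. If there should always be one row per ID, the which.max() form is probably the safer choice.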
akrun