Okay, that title could probably be clearer, but I'm not sure how else to word it.

Here is an example of the dataframe that I'm working with.

index | run | time_step | users
1       1        1          12
2       1        2          11
3       2        1          12
4       2        2          10
5       1        3           9
6       2        3          10
7       2        4           9
8       2        5           8
9       2        6           6
10      1        4           5
11      3        1          12
12      3        2           8

What I want is to cut the dataframe down so that the only rows left are indices 9, 10, and 12. That is trivial in this example, but the full dataset is significantly larger, with tens of thousands of runs.

How would you cut rows out in a way that finds the largest value of time_step for each run and keeps that row, but none of the other rows with the same run value?

Edit: for clarification, the result would look like this:

index | run | time_step | users
9       2        6           6
10      1        4           5
12      3        2           8
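
For reference, here is a minimal sketch of one way this could be done in pandas, not necessarily the only or best one. It rebuilds the example dataframe above (the variable names `df` and `last_rows` are just placeholders), groups by `run`, and uses `idxmax` to pick the index label of the largest `time_step` in each group. It assumes the index labels are unique and that each run has a single row with the maximum `time_step`; with ties, `idxmax` keeps only the first one it finds.

```python
import pandas as pd

# Rebuild the example dataframe from the question (index labels 1..12).
df = pd.DataFrame(
    {
        "run": [1, 1, 2, 2, 1, 2, 2, 2, 2, 1, 3, 3],
        "time_step": [1, 2, 1, 2, 3, 3, 4, 5, 6, 4, 1, 2],
        "users": [12, 11, 12, 10, 9, 10, 9, 8, 6, 5, 12, 8],
    },
    index=pd.RangeIndex(1, 13, name="index"),
)

# For each run, find the index label of the row with the largest time_step.
# Assumption: index labels are unique, otherwise .loc would return extra rows.
idx_of_last_step = df.groupby("run")["time_step"].idxmax()

# Keep only those rows, restoring the original row order.
last_rows = df.loc[idx_of_last_step].sort_index()

print(last_rows)
```

On the example data this keeps the rows with indices 9, 10, and 12. Another option that should give the same rows is `df.sort_values("time_step").drop_duplicates("run", keep="last")`, followed by `sort_index()` if the original order matters.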
Y Ahmed