I want to group by data set based on some IDs, then leave the grouped data that has largest value in the column. Here is a description of my data set.
BSTN ASTN1 BSTN2 ASTN2 BSTN3 ASTN3 BSTN4 ASTN4 BSTN5 ASTN TRNID TRNID2 TRNID3 TRNID4 TRNID5 count
1 150 0 0 0 0 0 0 0 0 152 1674 0 0 0 0 1
2 150 0 0 0 0 0 0 0 0 152 1676 0 0 0 0 2
3 150 0 0 0 0 0 0 0 0 152 1678 0 0 0 0 2
4 150 0 0 0 0 0 0 0 0 152 1680 0 0 0 0 13
5 150 0 0 0 0 0 0 0 0 152 1682 0 0 0 0 3
6 150 0 0 0 0 0 0 0 0 152 1684 0 0 0 0 4
I want to group and summarise this data into a single row based on IDs the first 10 columns BSTN ASTN1 BSTN2 ASTN2 BSTN3 ASTN3 BSTN4 ASTN4 BSTN5 ASTN.
Then for the rest of the columns, TRNID TRNID2 TRNID3 TRNID4 TRNID5 I would like to replace them with the row with maximum value in column count.
What I want as my final output would look as below.
BSTN ASTN1 BSTN2 ASTN2 BSTN3 ASTN3 BSTN4 ASTN4 BSTN5 ASTN TRNID TRNID2 TRNID3 TRNID4 TRNID5 count
150 0 0 0 0 0 0 0 0 152 1680 0 0 0 0 13
How would summarise my data? I have 2,931,959 rows with more groups of BSTN, ASTNs.
dput(head(A_Routetable2))
structure(list(BSTN = c(150, 150, 150, 150, 150, 150), ASTN1 = c(0,
0, 0, 0, 0, 0), BSTN2 = c(0, 0, 0, 0, 0, 0), ASTN2 = c(0, 0,
0, 0, 0, 0), BSTN3 = c(0, 0, 0, 0, 0, 0), ASTN3 = c(0, 0, 0,
0, 0, 0), BSTN4 = c(0, 0, 0, 0, 0, 0), ASTN4 = c(0, 0, 0, 0,
0, 0), BSTN5 = c(0, 0, 0, 0, 0, 0), ASTN = c(152, 152, 152, 152,
152, 152), TRNID = c(1674, 1676, 1678, 1680, 1682, 1684), TRNID2 = c(0,
0, 0, 0, 0, 0), TRNID3 = c(0, 0, 0, 0, 0, 0), TRNID4 = c(0, 0,
0, 0, 0, 0), TRNID5 = c(0, 0, 0, 0, 0, 0), count = c(1L, 2L,
2L, 13L, 3L, 4L)), row.names = c(NA, -6L), groups = structure(list(
BSTN = c(150, 150, 150, 150, 150, 150), ASTN1 = c(0, 0, 0,
0, 0, 0), BSTN2 = c(0, 0, 0, 0, 0, 0), ASTN2 = c(0, 0, 0,
0, 0, 0), BSTN3 = c(0, 0, 0, 0, 0, 0), ASTN3 = c(0, 0, 0,
0, 0, 0), BSTN4 = c(0, 0, 0, 0, 0, 0), ASTN4 = c(0, 0, 0,
0, 0, 0), BSTN5 = c(0, 0, 0, 0, 0, 0), ASTN = c(152, 152,
152, 152, 152, 152), TRNID = c(1674, 1676, 1678, 1680, 1682,
1684), TRNID2 = c(0, 0, 0, 0, 0, 0), TRNID3 = c(0, 0, 0,
0, 0, 0), TRNID4 = c(0, 0, 0, 0, 0, 0), .rows = structure(list(
1L, 2L, 3L, 4L, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, 6L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))