I want to create an ID column that splits observations into two groups so that the sum of the values in each group is as close as possible to a target value.
Say I have this dataset:
id <- 1:5
val <- c(1, 2, 4, 5, 6)
dat <- data.frame(id, val)
I calculate the sum of val (=18) and divide by 2 (=9). I then want to create an ID column that groups observations so that the sum of val within each group is equal to 9 (or as close to it as possible). This new column would then be:
dat$group_id <- c("A", "A", "B", "B", "A")
Is there a good way to automate this process for many groups of observations, assuming that in some cases there is not an exact way to group observations to reach the target value?
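To make the goal concrete, here is a rough brute-force sketch of what I am after. It tries every way of splitting the values into two groups and keeps the split whose group A sum is closest to half the total. The function name split_to_target is made up for illustration, the first observation is arbitrarily pinned to group A to avoid mirrored solutions, and the approach is only feasible for a small number of observations because the number of splits doubles with each added row.

```r
# Hypothetical helper (name is illustrative, not from any package):
# label each value "A" or "B" so that the sum of the "A" values is as
# close as possible to half of the total.
# Brute force over 2^(n - 1) splits, so only practical for small n.
split_to_target <- function(val) {
  n <- length(val)
  stopifnot(n >= 1)
  target <- sum(val) / 2
  best_diff <- Inf
  best_mask <- rep(TRUE, n)
  for (k in 0:(2^(n - 1) - 1)) {
    # The binary digits of k decide which of observations 2..n join group A;
    # observation 1 is always in group A (assumption, to skip mirror images).
    bits <- (k %/% 2^(seq_len(n - 1) - 1)) %% 2 == 1
    mask <- c(TRUE, bits)
    d <- abs(sum(val[mask]) - target)
    if (d < best_diff) {
      best_diff <- d
      best_mask <- mask
    }
  }
  ifelse(best_mask, "A", "B")
}

id <- 1:5
val <- c(1, 2, 4, 5, 6)
dat <- data.frame(id, val)
dat$group_id <- split_to_target(dat$val)
```

For the example data this gives A, A, B, B, A (1 + 2 + 6 = 9 and 4 + 5 = 9). When no exact split exists, it returns the closest one it finds. I am looking for something that scales better than this.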