Say I have a data frame like mydf in the minimal working example (MWE) below:

set.seed(1)
ids <- rep(paste(sample(LETTERS, 10), sample(1:100, 10), sep=''), c(34,56,12,98,23,13,24,45,10,21))
feats <- paste(sample(letters, length(ids), replace=TRUE), sample(letters, length(ids), replace=TRUE), sample(1:1000, length(ids)), sep='')
perc <- sample(seq(1,100,0.01), length(ids), replace=TRUE)
mydf <- data.frame(ID=ids, FEATURE=feats, ABUNDANCE=perc)
head(mydf, 10)

The first 10 rows look like this:

> head(mydf, 10)
     ID FEATURE ABUNDANCE
1   G21   yw821     34.98
2   G21   fc599     70.80
3   G21   qx425     59.56
4   G21   dm560     47.47
5   G21   gc790     34.30
6   G21   ki168     96.82
7   G21   av971     64.94
8   G21   jh474     20.43
9   G21   wp930     36.36
10  G21   iv901     51.79

How can I subset it to keep only the top X (5, for example) most abundant FEATURE values per ID, ranked by ABUNDANCE? I feel it should be pretty easy, but I can't wrap my head around a simple way to do it. Thanks!
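To make the goal concrete, here is a minimal base R sketch of one possible approach: sort by ID and decreasing ABUNDANCE, number the rows within each ID, and keep the first n. The helper name top_n_per_id and the small toy data frame are hypothetical and only for illustration; the column names are taken from mydf above. It assumes ties can be broken arbitrarily, and an ID with fewer than n rows keeps all of its rows.

```r
# Hypothetical helper (not from the original post): keep the n rows with the
# highest ABUNDANCE within each ID.
top_n_per_id <- function(df, n = 5) {
  # Sort by ID, then by decreasing ABUNDANCE within each ID.
  ordered <- df[order(df$ID, -df$ABUNDANCE), ]
  # ave() with seq_along numbers the rows 1, 2, 3, ... inside each ID group.
  rank_in_id <- ave(seq_along(ordered$ID), ordered$ID, FUN = seq_along)
  # Keep only the first n rows of each group.
  ordered[rank_in_id <= n, ]
}

# Toy data with the same columns as mydf (values made up for illustration).
toy <- data.frame(
  ID = c("A1", "A1", "A1", "B2", "B2", "B2", "B2"),
  FEATURE = c("aa1", "bb2", "cc3", "dd4", "ee5", "ff6", "gg7"),
  ABUNDANCE = c(10.5, 80.2, 45.0, 5.1, 99.9, 60.3, 20.0)
)

top2 <- top_n_per_id(toy, n = 2)
print(top2)
```

On the real data the call would be top_n_per_id(mydf, 5). If several rows tie at the cutoff and all of them should be kept, a rank-based filter would be needed instead of simple row numbering.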

DaniCee

0 Answers