After performing $unwind in my aggregation pipeline, I have an intermediate result such as:

[
{_id:1, precision:0.91, recall:0.71, other fields...},
{_id:1, precision:0.71, recall:0.81, other fields...},
{_id:1, precision:0.61, recall:0.91, other fields...},
{_id:2, precision:0.82, recall:0.42, other fields...},
{_id:2, precision:0.72, recall:0.52, other fields...},
{_id:2, precision:0.62, recall:0.62, other fields...}
]

Now I would like to group the documents by _id, find the document with the maximum recall in each group, and return the recall, precision and _id of that document.

So the result would be:

[
    {_id:1, precisionOfDocWithMaxRecall:0.61, maxRecall:0.91},
    {_id:2, precisionOfDocWithMaxRecall:0.62, maxRecall:0.62}
]

I have managed to obtain the result using $group and $max, but without the precision field.

Neil Lunn
Marcel

1 Answer

You could run the following pipeline. It uses a $sort stage to order the documents by recall before they reach the $group stage, and then uses the $first accumulator (or $last, depending on the sort direction) to take the values from the first/last document of each group in that order:

db.collection.aggregate([
    /* previous pipeline */
    { "$sort": { "recall": -1 } },
    { 
        "$group": {
            "_id": "$_id",
            "precisionOfDocWithMaxRecall": { "$first": "$precision" },
            "maxRecall": { "$first": "$recall" }
        }
    }
])
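
To see why sorting first works, here is a minimal plain-JavaScript sketch that mimics the two stages on the sample data from the question. It does not talk to MongoDB; the helper name `maxRecallPerId` is made up for illustration, and the "other fields" are left out.

```javascript
// Sample documents copied from the question (extra fields omitted).
const docs = [
  { _id: 1, precision: 0.91, recall: 0.71 },
  { _id: 1, precision: 0.71, recall: 0.81 },
  { _id: 1, precision: 0.61, recall: 0.91 },
  { _id: 2, precision: 0.82, recall: 0.42 },
  { _id: 2, precision: 0.72, recall: 0.52 },
  { _id: 2, precision: 0.62, recall: 0.62 },
];

// Hypothetical helper: emulates { $sort: { recall: -1 } } followed by
// a $group on _id that uses $first for both output fields.
function maxRecallPerId(input) {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...input].sort((a, b) => b.recall - a.recall);

  const groups = new Map();
  for (const doc of sorted) {
    // The first document seen for an _id has the highest recall,
    // so later documents with the same _id are ignored.
    if (!groups.has(doc._id)) {
      groups.set(doc._id, {
        _id: doc._id,
        precisionOfDocWithMaxRecall: doc.precision,
        maxRecall: doc.recall,
      });
    }
  }
  return [...groups.values()];
}

const result = maxRecallPerId(docs);
console.log(result);
```

Because both fields are taken from the same "first" document, precision and recall always come from one record, which is what a plain $max on recall cannot give you. Note that $group does not guarantee the order of its output documents, so add another $sort after it if you need the groups ordered by _id.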
chridam