You want the $switch
statement in an aggregation pipeline:
db.collection.aggregate([
{ "$group": {
"_id": {
"$let": {
"vars": {
"length": {
"$cond": {
"if": { "$isArray": "$ARRAY_NODE" },
"then": { "$size": "$ARRAY_NODE" },
"else": 0
}
}
},
"in": {
"$switch": {
"branches": [
{ "case": { "$lt": [ "$$length", 50 ] }, "then": "< 50" },
{ "case": { "$lt": [ "$$length", 100 ] }, "then": ">= 50 < 100" },
{ "case": { "$lt": [ "$$length", 150 ] }, "then": ">= 100 < 150" },
{ "case": { "$lt": [ "$$length", 200 ] }, "then": ">= 150 < 200" },
],
"default": "> 200"
}
}
}
},
"count": { "$sum": 1 }
}}
])
As shown we shorten the syntax by declaring first in $let
, in order to get the $size
of the array for each count.
There is a $bucket
aggregation pipeline stage that is basically a "shortcut" for creating a similar statement, but it's usage is a "convenience" so it's not exactly the same output:
db.collection.aggregate([
{ "$bucket": {
"groupBy": {
"$cond": {
"if": { "$isArray": "$ARRAY_NODE" },
"then": { "$size": "$ARRAY_NODE" },
"else": 0
}
},
"boundaries": [0, 49, 99, 149, 199],
"default": "> 200"
}}
])
If you really need the full labels, then use the full form with $switch
. Also note that the "ranges" are of course also an convenience function and therefore follow their own strict logic. When your logic differs it is again better to write the $switch
in full.
Also note that where no such field is actually present in the document, then it is necessary to use $isArray
or $ifNull
in earlier versions ( as shown later ) in order to do a logical test to return a default value of 0
to indicate the "length". Otherwise the $size
operator, which is expecting an array will produce an error:
The argument to $size must be an Array, but was of type: EOO
Either handle logically or return an empty array to $size
as is demonstrated on the answer to : The argument to $size must be an Array, but was of type: EOO
This was always and still is possible using the $cond
operator as well, but it's just that the syntax as a "ternary" operator, is nested rather than the cleaner form with $switch
db.collection.aggregate([
{ "$group": {
"_id": {
"$let": {
"vars": {
"length": {
"$cond": {
"if": { "$ifNull": [ "$ARRAY_NODE", false ] },
"then": { "$size": "$ARRAY_NODE" },
"else": 0
}
}
},
"in": {
"$cond": {
"if": { "$lt": [ "$$length", 50 ] },
"then": "< 50",
"else": {
"$cond": {
"if": { "$lt": [ "$$length", 100 ] },
"then": "> 50 < 100",
"else": {
"$cond": {
"if": { "$lt": [ "$$length", 150 ] },
"then": ">100 < 150",
"else": {
"$cond": {
"if": { "$lt": [ "$$length", 200 ] },
"then": "> 150 < 200",
"else": "> 200"
}
}
}
}
}
}
}
}
}
}
}
])
As a demonstration. Insert some documents to a collection with different array sizes:
// Insert documents with arrays of given lengths
db.collection.insertMany(
[
5,40, // < 50 count 2
70, // >= 50 < 100 count 1
120,130, // >= 100 < 150 count 2
170, // >= 150 < 200 count 1
210 // > 200 count 1
].map( n =>
({ "ARRAY_NODE": Array.apply(null,Array(n)).map(() => ({})) }) )
)
Then run any of the aggregation statements to produce the results:
{ "_id" : "< 50", "count" : 2 }
{ "_id" : ">= 50 < 100", "count" : 1 }
{ "_id" : ">= 100 < 150", "count" : 2 }
{ "_id" : ">= 150 < 200", "count" : 1 }
{ "_id" : "> 200", "count" : 1 }