Aggregate
Aggregation operations process multiple documents and return computed results. You can use aggregation operations to:
- Group values from multiple documents together.
- Perform operations on the grouped data to return a single result.
- Analyze data changes over time.
Aggregation Pipelines
An aggregation pipeline consists of one or more stages that process documents:
- Each stage performs an operation on the input documents. For example, a stage can filter documents, group documents, and calculate values.
- The documents that are output from a stage are passed to the next stage.
- An aggregation pipeline can return results for groups of documents. For example, return the total, average, maximum, and minimum values.
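The stage-by-stage flow described above can be sketched in plain Python. This is only an illustration of the idea and not how MongoDB executes a pipeline: the stage functions (`match_rated`, `group_max_rating`, `sort_by_id`) and the `run_pipeline` helper are hypothetical names invented for this sketch, and the four sample documents are a small subset chosen for brevity.

```python
from typing import Callable, Dict, Iterable, List

Doc = Dict[str, object]
Stage = Callable[[Iterable[Doc]], Iterable[Doc]]


def match_rated(docs: Iterable[Doc]) -> Iterable[Doc]:
    """Filter stage: keep only documents that have a rating."""
    return (d for d in docs if d.get("rating") is not None)


def group_max_rating(docs: Iterable[Doc]) -> Iterable[Doc]:
    """Group stage: emit one document per country with its highest rating."""
    best: Dict[str, int] = {}
    for d in docs:
        code = d["country_code"]
        best[code] = max(best.get(code, d["rating"]), d["rating"])
    return ({"_id": code, "max_rating": rating} for code, rating in best.items())


def sort_by_id(docs: Iterable[Doc]) -> Iterable[Doc]:
    """Sort stage: order the grouped documents by their _id."""
    return sorted(docs, key=lambda d: d["_id"])


def run_pipeline(docs: Iterable[Doc], stages: List[Stage]) -> List[Doc]:
    """Feed the output of each stage into the next one, in order."""
    for stage in stages:
        docs = stage(docs)
    return list(docs)


players = [
    {"name": "Pelé", "country_code": "BRA", "rating": 98},
    {"name": "Gerd Müller", "country_code": "GER", "rating": None},
    {"name": "Miroslav Klose", "country_code": "GER", "rating": 91},
    {"name": "Neymar", "country_code": "BRA", "rating": 89},
]

result = run_pipeline(players, [match_rated, group_max_rating, sort_by_id])
print(result)
```

Each function takes the documents produced by the previous stage and returns the documents for the next one, which mirrors how the output of one pipeline stage becomes the input of the following stage.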
See the MongoDB aggregation documentation for more details about aggregation.
Database
We will start with the same database structure and the same data as in the previous sections.
Aggregate data using MongoDB Console
use test_db
db.player.aggregate([{"$group": {"_id": "$country_code", "total_player": {"$sum": 1}}}])
The returned data will be:
{_id: 'ARG', total_player: 3}
{_id: 'BRA', total_player: 3}
{_id: 'ENG', total_player: 3}
...
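To see what the `$group` stage with `{"$sum": 1}` computes, the same count can be reproduced in plain Python. This is a minimal sketch that assumes the collection holds exactly the players inserted by `create_players` in the full file on this page; the `country_codes` list is copied from that function, and `grouped` is a name chosen for this example.

```python
from collections import Counter

# Country codes in the order create_players() inserts them (assumed to be
# the full content of the collection).
country_codes = [
    "BRA", "ARG", "FRA", "BRA", "BRA", "ARG", "ARG", "FRA", "FRA", "FRA",
    "GER", "GER", "GER", "POR", "POR", "POR", "ENG", "ENG", "ENG",
]

# Counting one per document is what {"$sum": 1} does inside each group.
counts = Counter(country_codes)

# Shape the result like the documents returned by the console.
grouped = [
    {"_id": code, "total_player": total}
    for code, total in sorted(counts.items())
]

for row in grouped:
    print(row)
```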
Aggregate data using MongoDB-ODM
To run an aggregation, use the aggregate classmethod on the model.
The aggregate method returns an Iterator.
As with find, the returned data does not support index-style access (for example, result[0]). Loop over the data instead.
# Code omitted above

def number_of_player_by_country():
    pipeline = [
        {"$group": {"_id": "$country_code", "total_player": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]
    player_count = Player.aggregate(pipeline=pipeline)
    for obj in player_count:
        print(obj)
    print()

# Code omitted below
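The function above loops over the result because an iterator cannot be indexed. The sketch below shows the same behaviour with an ordinary Python generator; `fake_aggregate` is a hypothetical stand-in used only so the example runs without a database, and it is assumed (not verified here) that the iterator returned by `Player.aggregate` behaves the same way when indexed.

```python
def fake_aggregate():
    """Hypothetical stand-in that yields results lazily, like an iterator."""
    yield {"_id": "ARG", "total_player": 3}
    yield {"_id": "BRA", "total_player": 3}
    yield {"_id": "ENG", "total_player": 3}


results = fake_aggregate()

# Index access is not available on an iterator.
try:
    results[0]
except TypeError as exc:
    print("indexing failed:", exc)

# Take items one at a time, or materialize the remainder into a list.
first = next(results)
rest = list(results)

print(first)
print(rest[0])  # a list supports indexing
```

Converting to a list loads every remaining result into memory, so looping is usually the better choice for large result sets.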
Full file preview
import os
from typing import Optional

from mongodb_odm import ASCENDING, Document, IndexModel, apply_indexes, connect


class Player(Document):
    name: str
    country_code: str
    rating: Optional[int] = None

    class ODMConfig(Document.ODMConfig):
        indexes = [
            IndexModel([("rating", ASCENDING)]),
        ]


def configuration():
    connect(os.environ.get("MONGO_URL", "mongodb://localhost:27017/testdb"))
    apply_indexes()


def create_players():
    Player(name="Pelé", country_code="BRA", rating=98).create()
    Player(name="Diego Maradona", country_code="ARG", rating=97).create()
    Player(name="Zinedine Zidane", country_code="FRA", rating=94).create()
    Player(name="Ronaldo", country_code="BRA", rating=94).create()
    Player(name="Neymar", country_code="BRA", rating=89).create()
    Player(name="Lionel Messi", country_code="ARG", rating=91).create()
    Player(name="Ángel Di María", country_code="ARG", rating=84).create()
    Player(name="Karim Benzema", country_code="FRA", rating=89).create()
    Player(name="Antoine Griezmann", country_code="FRA", rating=85).create()
    Player(name="Kylian Mbappé", country_code="FRA", rating=91).create()
    Player(name="Gerd Müller", country_code="GER").create()
    Player(name="Miroslav Klose", country_code="GER", rating=91).create()
    Player(name="Thomas Müller", country_code="GER", rating=87).create()
    Player(name="Cristiano Ronaldo", country_code="POR", rating=87).create()
    Player(name="Eusébio", country_code="POR", rating=93).create()
    Player(name="Diogo Jota", country_code="POR", rating=85).create()
    Player(name="David Beckham", country_code="ENG", rating=89).create()
    Player(name="Wayne Rooney", country_code="ENG", rating=80).create()
    Player(name="Harry Kane", country_code="ENG", rating=89).create()


def number_of_player_by_country():
    pipeline = [
        {"$group": {"_id": "$country_code", "total_player": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]
    player_count = Player.aggregate(pipeline=pipeline)
    for obj in player_count:
        print(obj)
    print()


def main():
    configuration()
    create_players()
    number_of_player_by_country()


if __name__ == "__main__":
    main()
Check the Console
After executing the function, the console output should look like this:
ODMObj(_id='ARG', total_player=3)
ODMObj(_id='BRA', total_player=3)
ODMObj(_id='ENG', total_player=3)
...
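Verify that the MongoDB console aggregation and the ODM aggregate method return the same data. The ODM pipeline adds a $sort stage, so the console output may list the same groups in a different order.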
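The same pattern extends to the other accumulators mentioned earlier (total, average, maximum, and minimum). The sketch below only builds the pipeline as plain Python data, so it runs without a database; `rating_stats_pipeline` and the output field names are hypothetical names chosen for this example, and the `$match` stage that skips players without a rating is an assumption about what is wanted here.

```python
import json


def rating_stats_pipeline():
    """Build a pipeline that summarizes ratings per country."""
    return [
        # Skip players whose rating is missing or null (assumed requirement).
        {"$match": {"rating": {"$ne": None}}},
        {
            "$group": {
                "_id": "$country_code",
                "total_player": {"$sum": 1},  # counts rated players only
                "avg_rating": {"$avg": "$rating"},
                "max_rating": {"$max": "$rating"},
                "min_rating": {"$min": "$rating"},
            }
        },
        {"$sort": {"_id": 1}},
    ]


pipeline = rating_stats_pipeline()

# json.dumps renders None as null, so the printed text can be pasted
# into the MongoDB console as the argument to db.player.aggregate(...).
print(json.dumps(pipeline, indent=2))
```

The resulting list can be passed to `Player.aggregate(pipeline=pipeline)` in the same way as the pipeline in `number_of_player_by_country`.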
ODMObj type
You may notice that aggregate returns objects of a new type called ODMObj.
The result of an aggregation can have any shape, so it does not necessarily match the fields of a model.
The MongoDB-ODM aggregate method uses the PyMongo aggregate function directly, and PyMongo aggregate returns a cursor that yields dictionaries.
MongoDB-ODM converts each of these dictionaries to an ODMObj, so the data can be accessed by attribute, as on a model.
To convert an ODMObj to a Python dict, call .dict().
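The idea of wrapping a dictionary so that its keys become attributes can be sketched as follows. `AttrRecord` is a hypothetical stand-in written for this example and is not the library's ODMObj implementation; it only mimics the two behaviours described above, attribute access and conversion back to a dict.

```python
class AttrRecord:
    """Hypothetical stand-in for ODMObj; not the library's implementation."""

    def __init__(self, **fields):
        # Store every key of the source dictionary as an attribute.
        self.__dict__.update(fields)

    def dict(self):
        """Return a plain dict copy of the stored fields."""
        return dict(self.__dict__)

    def __repr__(self):
        body = ", ".join(f"{key}={value!r}" for key, value in self.__dict__.items())
        return f"AttrRecord({body})"


raw = {"_id": "ARG", "total_player": 3}
obj = AttrRecord(**raw)

print(obj)               # repr lists the fields like a model would
print(obj.total_player)  # attribute access instead of obj["total_player"]
print(obj.dict())        # back to a plain dictionary
```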