Aggregate

Aggregation operations process multiple documents and return computed results. You can use aggregation operations to:

  • Group values from multiple documents together.
  • Perform operations on the grouped data to return a single result.
  • Analyze data changes over time.

Aggregation Pipelines

An aggregation pipeline consists of one or more stages that process documents:

  • Each stage performs an operation on the input documents. For example, a stage can filter documents, group documents, and calculate values.
  • The documents that are output from a stage are passed to the next stage.
  • An aggregation pipeline can return results for groups of documents. For example, return the total, average, maximum, and minimum values.
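As a rough sketch of how stages chain together, a pipeline can be written as a plain Python list of stage dictionaries. The field names country_code and rating are assumed from the Player model used in this tutorial, and the output field names (avg_rating, max_rating, min_rating) are illustrative only. $match, $group, $sort and the $sum, $avg, $max, $min accumulators are standard MongoDB operators.

```python
# Each dict is one stage; documents flow from the first stage to the last.
pipeline = [
    # Stage 1: keep only documents whose rating is set (not null).
    {"$match": {"rating": {"$ne": None}}},
    # Stage 2: group by country and compute per-group statistics.
    {
        "$group": {
            "_id": "$country_code",
            "total_player": {"$sum": 1},
            "avg_rating": {"$avg": "$rating"},  # illustrative field name
            "max_rating": {"$max": "$rating"},  # illustrative field name
            "min_rating": {"$min": "$rating"},  # illustrative field name
        }
    },
    # Stage 3: order the groups by country code, ascending.
    {"$sort": {"_id": 1}},
]

# The single key of each stage dict names the operation it performs.
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

The list itself is only data; nothing runs until it is passed to an aggregate call.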

Read MongoDB aggregation for more details about aggregation.

Database

We will reuse the same database structure and the same data as before.

Aggregate data using MongoDB Console

use test_db
db.player.aggregate([{"$group": {"_id": "$country_code", "total_player": {"$sum": 1}}}])

The returned data will be:

{_id: 'ARG', total_player: 3}
{_id: 'BRA', total_player: 3}
{_id: 'ENG', total_player: 3}
...

Aggregate data using MongoDB-ODM

To run an aggregation, we use the aggregate class method.

The aggregate method returns an Iterator.

As with find, we can't use index-based access on the returned data; we need to loop over it.
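The following sketch illustrates the point with an ordinary Python iterator standing in for the value returned by aggregate (it does not touch the database). Indexing fails, while looping or materializing the items with list() works.

```python
# Stand-in for the iterator returned by aggregate; plain dicts are used
# here only for illustration.
results = iter([
    {"_id": "ARG", "total_player": 3},
    {"_id": "BRA", "total_player": 3},
])

try:
    results[0]  # iterators do not support index access
except TypeError as exc:
    print("indexing failed:", exc)

# Materialize the items into a list when index access is really needed.
rows = list(results)
print(rows[0]["_id"])
```

Keep in mind that list() loads every result into memory, so looping is usually the better choice for large result sets.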

# Code omitted above

def number_of_player_by_country():
    pipeline = [
        {"$group": {"_id": "$country_code", "total_player": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]
    player_count = Player.aggregate(pipeline=pipeline)
    for obj in player_count:
        print(obj)
    print()

# Code omitted below

Full file preview

import os
from typing import Optional

from mongodb_odm import ASCENDING, Document, IndexModel, apply_indexes, connect


class Player(Document):
    name: str
    country_code: str
    rating: Optional[int] = None

    class ODMConfig(Document.ODMConfig):
        indexes = [
            IndexModel([("rating", ASCENDING)]),
        ]


def configuration():
    connect(os.environ.get("MONGO_URL", "mongodb://localhost:27017/testdb"))
    apply_indexes()


def create_players():
    Player(name="Pelé", country_code="BRA", rating=98).create()
    Player(name="Diego Maradona", country_code="ARG", rating=97).create()
    Player(name="Zinedine Zidane", country_code="FRA", rating=94).create()
    Player(name="Ronaldo", country_code="BRA", rating=94).create()
    Player(name="Neymar", country_code="BRA", rating=89).create()
    Player(name="Lionel Messi", country_code="ARG", rating=91).create()
    Player(name="Ángel Di María", country_code="ARG", rating=84).create()
    Player(name="Karim Benzema", country_code="FRA", rating=89).create()
    Player(name="Antoine Griezmann", country_code="FRA", rating=85).create()
    Player(name="Kylian Mbappé", country_code="FRA", rating=91).create()
    Player(name="Gerd Müller", country_code="GER").create()
    Player(name="Miroslav Klose", country_code="GER", rating=91).create()
    Player(name="Thomas Müller", country_code="GER", rating=87).create()
    Player(name="Cristiano Ronaldo", country_code="POR", rating=87).create()
    Player(name="Eusébio", country_code="POR", rating=93).create()
    Player(name="Diogo Jota", country_code="POR", rating=85).create()
    Player(name="David Beckham", country_code="ENG", rating=89).create()
    Player(name="Wayne Rooney", country_code="ENG", rating=80).create()
    Player(name="Harry Kane", country_code="ENG", rating=89).create()


def number_of_player_by_country():
    pipeline = [
        {"$group": {"_id": "$country_code", "total_player": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]
    player_count = Player.aggregate(pipeline=pipeline)
    for obj in player_count:
        print(obj)
    print()


def main():
    configuration()
    create_players()

    number_of_player_by_country()


if __name__ == "__main__":
    main()

Check the Console

After executing the function, the console output should look like this:

ODMObj(_id='ARG', total_player=3)
ODMObj(_id='BRA', total_player=3)
ODMObj(_id='ENG', total_player=3)
...

Verify that the MongoDB console aggregate and the ODM aggregate method return the same data.
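One way to sanity-check the counts without a database is to reproduce the grouping in plain Python. This sketch assumes create_players ran exactly once against an empty collection; the country codes are copied from that function.

```python
from collections import Counter

# Country codes of the players created in create_players, in creation order.
country_codes = [
    "BRA", "ARG", "FRA", "BRA", "BRA", "ARG", "ARG", "FRA", "FRA", "FRA",
    "GER", "GER", "GER", "POR", "POR", "POR", "ENG", "ENG", "ENG",
]

# Counter plays the role of {"$group": ..., {"$sum": 1}};
# sorted() plays the role of {"$sort": {"_id": 1}}.
counts = Counter(country_codes)
expected = [
    {"_id": code, "total_player": total}
    for code, total in sorted(counts.items())
]

for row in expected:
    print(row)
```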

ODMObj type

You may notice that aggregate returned a new type of object called ODMObj.

When we aggregate data, the shape of each result is defined by the pipeline, so it does not necessarily match any model.

The MongoDB-ODM aggregate method uses the PyMongo aggregate function directly, and PyMongo's aggregate returns an iterator of dictionaries.

So we convert each dictionary to an ODMObj, which lets us access the data with attributes, as we would on a model.

We can easily convert an ODMObj to a Python dict by calling .dict().
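To illustrate the idea of dictionary versus attribute access without a database, the sketch below uses types.SimpleNamespace from the standard library as a stand-in. It is not the ODMObj class itself, and vars() here only plays the role that .dict() plays on an ODMObj.

```python
from types import SimpleNamespace

# A dictionary shaped like one result of the aggregation shown earlier.
raw = {"_id": "ARG", "total_player": 3}

# Stand-in for ODMObj: expose the dictionary keys as attributes.
obj = SimpleNamespace(**raw)
print(obj._id, obj.total_player)

# Convert back to a plain dict (analogous to calling .dict() on an ODMObj).
as_dict = dict(vars(obj))
print(as_dict)
```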