This chapter introduces the MongoDB aggregation pipeline, which is mainly used for statistical analysis, much like a group by query in SQL. In the MongoDB shell, aggregation is performed with the db.collection.aggregate() method.

Concept

The aggregation pipeline is an abstraction that treats data like water flowing through a pipe. The data is processed in multiple stages, and the output of each stage becomes the input of the next.

A typical use of the aggregation pipeline for statistical analysis:

  • In the first stage, a batch of document data is retrieved from the collection based on certain conditions.
  • In the second stage, the retrieved document data is grouped and aggregated.
  • In the third stage, the aggregated data from the second stage is sorted.

The entire process is like a pipeline operation where data flows in from one end of the pipeline, undergoes processing through several stages, and then outputs a final result.
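The idea can be sketched in plain JavaScript, without a database. This is only an illustration of the concept, not how MongoDB implements it: the runPipeline helper, the stage functions, and the visits data are all hypothetical names made up for this sketch.

```javascript
// A pipeline is a list of stages; each stage takes an array of
// documents and returns a new array for the next stage.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Hypothetical input documents, made up for this illustration.
const visits = [
  { page: "home", ms: 120 },
  { page: "docs", ms: 340 },
  { page: "home", ms: 80 },
  { page: "blog", ms: 15 },
];

// Stage 1: retrieve only the documents that meet a condition.
const keepSlow = (docs) => docs.filter((d) => d.ms >= 50);

// Stage 2: group by page and aggregate (sum) the ms values.
const totalPerPage = (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d.page, (totals.get(d.page) || 0) + d.ms);
  }
  return [...totals].map(([page, totalMs]) => ({ page, totalMs }));
};

// Stage 3: sort the aggregated results, largest total first.
const byTotalDesc = (docs) => [...docs].sort((a, b) => b.totalMs - a.totalMs);

const result = runPipeline(visits, [keepSlow, totalPerPage, byTotalDesc]);
console.log(result);
// [ { page: 'docs', totalMs: 340 }, { page: 'home', totalMs: 200 } ]
```

Because every stage only sees the output of the previous one, stages can be added, removed, or reordered without changing the others.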

Pipeline Operators Supported by MongoDB

Below are common pipeline operators supported by MongoDB:

Operator    Description
$match      Filters documents based on conditions, similar to the where clause in SQL.
$group      Groups documents and aggregates each group, similar to the group by clause in SQL.
$sort       Sorts documents, similar to the order by clause in SQL.

Steps for Using Aggregation Pipeline

Typically, three steps are involved:

  • Filter the target data using $match.
  • Group and aggregate the data using $group.
  • Sort the results using $sort (optional).

Example:

db.orders.aggregate([
  { $match: { status: "A" } },
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } }
])

Equivalent SQL:

select cust_id, sum(amount) as total from orders
    where status = 'A'
    group by cust_id
    order by total desc
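To see what this pipeline would return, the three stages can be traced by hand in plain JavaScript. The sample documents below are hypothetical (the original does not show the contents of the orders collection), and the array operations only imitate what each stage does. They are not MongoDB calls.

```javascript
// Hypothetical sample documents for the orders collection.
const orders = [
  { cust_id: "abc1", status: "A", amount: 50 },
  { cust_id: "xyz1", status: "A", amount: 100 },
  { cust_id: "xyz1", status: "D", amount: 25 },
  { cust_id: "abc1", status: "A", amount: 25 },
  { cust_id: "xyz1", status: "A", amount: 25 },
];

// Imitates { $match: { status: "A" } }: keep only status "A".
const matched = orders.filter((o) => o.status === "A");

// Imitates { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }:
// one output document per cust_id, with the amounts summed.
const totals = {};
for (const o of matched) {
  totals[o.cust_id] = (totals[o.cust_id] || 0) + o.amount;
}
const grouped = Object.entries(totals).map(([_id, total]) => ({ _id, total }));

// Imitates { $sort: { total: -1 } }: largest total first.
const sorted = [...grouped].sort((a, b) => b.total - a.total);

console.log(sorted);
// [ { _id: 'xyz1', total: 125 }, { _id: 'abc1', total: 75 } ]
```

Note that the grouping key appears in the output as _id, which is why the result contains one document per customer.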

Note: Aggregation and statistical analysis are covered in more detail in the subsequent chapters.