likes
comments
collection
share

通过实例讲清楚MongoDB九种聚合操作

作者站长头像
站长
· 阅读数 22

1 文章概述

1.1 基本概念

MongoDB是一种非关系型数据库,数据最终存储为BSON(Binary JSON)类型。MongoDB包含三个重要逻辑概念:数据库、集合和文档,与关系型数据库相关概念映射如下图:

\

通过实例讲清楚MongoDB九种聚合操作

\

1.2 数据准备

1.2.1 逻辑设计

bookdb数据库包含library与book两个集合,相关字段说明如下图:

\

通过实例讲清楚MongoDB九种聚合操作

\

1.2.2 新增数据

// 创建数据库
use bookdb

// 新增图书馆
db.library.insertMany([
    {
        "_id": 1,
        "library_name": "图书馆_1"
    },
    {
        "_id": 2,
        "library_name": "图书馆_2"
    },
    {
        "_id": 3,
        "library_name": "图书馆_3"
    }
])

// 新增图书
db.book.insertMany([
    {
        "_id": 1,
        "book_type": 1,
        "book_name": "图书_1",
        "book_price": 10,
        "book_count": 100,
        "borrowers": [{
            "name": "张三",
            "id": 100
        }, {
            "name": "李四",
            "id": 200
        }],
        "library_id": 1
    },
    {
        "_id": 2,
        "book_type": 2,
        "book_name": "图书_2",
        "book_price": 20,
        "book_count": 100,
        "borrowers": [{
            "name": "张三",
            "id": 100
        }],
        "library_id": 2
    },
    {
        "_id": 3,
        "book_type": 1,
        "book_name": "图书_3",
        "book_price": 30,
        "book_count": 100,
        "borrowers": [{
            "name": "张三",
            "id": 100
        }, {
            "name": "王五",
            "id": 300
        }],
        "library_id": 2
    }
])

\

2 聚合概念

2.1 基本语法

MongoDB聚合语法第一个参数是管道:文档在上个管道处理完后传递给下个管道,第二个参数是选项:设置聚合操作特性

db.collection.aggregate(pipeline, options)

\

2.2 聚合管道

常用聚合管道有以下九种类型:

$project:对文档进行投影

$limit:输出管道内前N个文档

$skip:跳过管道内前N个文档

$sort:对文档进行排序

$out:输出管道中文档

$match:对文档进行筛选

$unwind:铺平文档中的数组字段

$lookup:对文档进行查询

$group:对文档进行分组

\

3 聚合实例

3.1 project

1表示展示某字段

0表示不展示某字段

借阅人编号和姓名拆分成两个数组

db.book.aggregate({
    $project: {
        "_id": 0,
        "book_name": 1,
        "borrowerIds": "$borrowers.id",
        "borrowerNames": "$borrowers.name"
    }
})

---------------------------------------------------------

// 1
{
    "book_name": "图书_1",
    "borrowerIds": [
        100,
        200
    ],
    "borrowerNames": [
        "张三",
        "李四"
    ]
}

// 2
{
    "book_name": "图书_2",
    "borrowerIds": [
        100
    ],
    "borrowerNames": [
        "张三"
    ]
}

// 3
{
    "book_name": "图书_3",
    "borrowerIds": [
        100,
        300
    ],
    "borrowerNames": [
        "张三",
        "王五"
    ]
}

\

3.2 limit

只展示一个投影结果

db.book.aggregate([
    {
        $project: {
            "_id": 0,
            "book_name": 1,
            "borrowerIds": "$borrowers.id",
            "borrowerNames": "$borrowers.name"
        }
    },
    {
        $limit: 1
    }
])

---------------------------------------------------------

// 1
{
    "book_name": "图书_1",
    "borrowerIds": [
        100,
        200
    ],
    "borrowerNames": [
        "张三",
        "李四"
    ]
}

\

3.3 skip

跳过一个且只展示一个投影结果

db.book.aggregate([
    {
        $project: {
            "_id": 0,
            "book_name": 1,
            "borrowerIds": "$borrowers.id",
            "borrowerNames": "$borrowers.name"
        }
    },
    {
        $skip: 1
    },
    {
        $limit: 1
    }
])

---------------------------------------------------------

// 1
{
    "book_name": "图书_2",
    "borrowerIds": [
        100
    ],
    "borrowerNames": [
        "张三"
    ]
}

\

3.4 sort

db.book.aggregate([
    {
        $project: {
            "_id": 1,
            "book_name": 1,
            "library_id": 1
        }
    },
    {
        $sort: {
            "library_id": - 1,     // 降序
            "_id": 1               // 升序
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 2,
    "book_name": "图书_2",
    "library_id": 2
}

// 2
{
    "_id": 3,
    "book_name": "图书_3",
    "library_id": 2
}

// 3
{
    "_id": 1,
    "book_name": "图书_1",
    "library_id": 1
}

\

MongoDB内存排序有100M限制,如果排序数据过多需要设置选项allowDiskUse=true,表示数据可以写入临时文件进行排序

db.book.aggregate([
    {
        $project: {
            "_id": 1,
            "book_name": 1,
            "library_id": 1
        }
    },
    {
        $sort: {
            "library_id": - 1,
            "_id": 1
        }
    }
], {
    allowDiskUse: true
})

\

3.5 out

投影结果输出到新集合

db.book.aggregate([
    {
        $project: {
            "_id": 0,
            "book_name": 1,
            "borrowerIds": "$borrowers.id",
            "borrowerNames": "$borrowers.name"
        }
    },
    {
        $out: "newCollection"
    }
])

db.newCollection.find()

---------------------------------------------------------

// 1
{
    "_id": ObjectId("62bec0636f9c37787b9590b9"),
    "book_name": "图书_1",
    "borrowerIds": [
        100,
        200
    ],
    "borrowerNames": [
        "张三",
        "李四"
    ]
}

// 2
{
    "_id": ObjectId("62bec0636f9c37787b9590ba"),
    "book_name": "图书_2",
    "borrowerIds": [
        100
    ],
    "borrowerNames": [
        "张三"
    ]
}

// 3
{
    "_id": ObjectId("62bec0636f9c37787b9590bb"),
    "book_name": "图书_3",
    "borrowerIds": [
        100,
        300
    ],
    "borrowerNames": [
        "张三",
        "王五"
    ]
}

\

3.6 match

where book_name = xxx

db.book.aggregate([
    {
        $match: {
            "book_name": "图书_2"
        }
    },
    {
        $project: {
            "_id": 1,
            "book_name": 1,
            "library_id": 1
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 2,
    "book_name": "图书_2",
    "library_id": 2
}

\

where library_id = 2 and price > 15

db.book.aggregate([
    {
        $match: {
            $and: [
                {
                    "library_id": 2
                },
                {
                    "book_price": {
                        $gt: 25
                    }
                }
            ]
        }
    },
    {
        $project: {
            "_id": 1,
            "book_name": 1,
            "library_id": 1,
            "book_price": 1
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 3,
    "book_name": "图书_3",
    "book_price": 30,
    "library_id": 2
}

\

3.7 unwind

文档按照借阅人数组铺平

includeArrayIndex表示索引

db.book.aggregate([
    {
        $unwind: {
            path: "$borrowers",
            includeArrayIndex: "idx"
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 1,
    "book_type": 1,
    "book_name": "图书_1",
    "book_price": 10,
    "book_count": 100,
    "borrowers": {
        "name": "张三",
        "id": 100
    },
    "library_id": 1,
    "idx": NumberLong("0")
}

// 2
{
    "_id": 1,
    "book_type": 1,
    "book_name": "图书_1",
    "book_price": 10,
    "book_count": 100,
    "borrowers": {
        "name": "李四",
        "id": 200
    },
    "library_id": 1,
    "idx": NumberLong("1")
}

// 3
{
    "_id": 2,
    "book_type": 2,
    "book_name": "图书_2",
    "book_price": 20,
    "book_count": 100,
    "borrowers": {
        "name": "张三",
        "id": 100
    },
    "library_id": 2,
    "idx": NumberLong("0")
}

// 4
{
    "_id": 3,
    "book_type": 1,
    "book_name": "图书_3",
    "book_price": 30,
    "book_count": 100,
    "borrowers": {
        "name": "张三",
        "id": 100
    },
    "library_id": 2,
    "idx": NumberLong("0")
}

// 5
{
    "_id": 3,
    "book_type": 1,
    "book_name": "图书_3",
    "book_price": 30,
    "book_count": 100,
    "borrowers": {
        "name": "王五",
        "id": 300
    },
    "library_id": 2,
    "idx": NumberLong("1")
}

\

3.8 lookup

查询图书馆有哪些图书

lookup可以实现连表查询

MongoDB 3.4之前聚合语法:

  • from:待关联集合【book】
  • localField: 本集合关联键【library】
  • foreignField:待关联键【book】
  • as:待关联集合数据【book】
db.library.aggregate([
    {
        $lookup: 
        {
            from: "book",
            localField: "_id",
            foreignField: "library_id",
            as: "books_info"
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 1,
    "library_name": "图书馆_1",
    "books_info": [
        {
            "_id": 1,
            "book_type": 1,
            "book_name": "图书_1",
            "book_price": 10,
            "book_count": 100,
            "borrowers": [
                {
                    "name": "张三",
                    "id": 100
                },
                {
                    "name": "李四",
                    "id": 200
                }
            ],
            "library_id": 1
        }
    ]
}

// 2
{
    "_id": 2,
    "library_name": "图书馆_2",
    "books_info": [
        {
            "_id": 2,
            "book_type": 2,
            "book_name": "图书_2",
            "book_price": 20,
            "book_count": 100,
            "borrowers": [
                {
                    "name": "张三",
                    "id": 100
                }
            ],
            "library_id": 2
        },
        {
            "_id": 3,
            "book_type": 1,
            "book_name": "图书_3",
            "book_price": 30,
            "book_count": 100,
            "borrowers": [
                {
                    "name": "张三",
                    "id": 100
                },
                {
                    "name": "王五",
                    "id": 300
                }
            ],
            "library_id": 2
        }
    ]
}

// 3
{
    "_id": 3,
    "library_name": "图书馆_3",
    "books_info": [ ]
}

\

MongoDB 3.4之后聚合语法:

  • from:待关联集合【book】
  • let:声明本集合字段在管道使用
  • pipeline:操作管道
db.library.aggregate([
    {
        $lookup: 
        {
            from: "book",
            let: {
                "lid": "$_id"
            },
            pipeline: [
                {
                    $match: {
                        $expr: {
                            $and: [
                                {
                                    $eq: ["$$lid", "$library_id"]
                                }
                            ]
                        }
                    }
                }
            ],
            as: "books_info"
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 1,
    "library_name": "图书馆_1",
    "books_info": [
        {
            "_id": 1,
            "book_type": 1,
            "book_name": "图书_1",
            "book_price": 10,
            "book_count": 100,
            "borrowers": [
                {
                    "name": "张三",
                    "id": 100
                },
                {
                    "name": "李四",
                    "id": 200
                }
            ],
            "library_id": 1
        }
    ]
}

// 2
{
    "_id": 2,
    "library_name": "图书馆_2",
    "books_info": [
        {
            "_id": 2,
            "book_type": 2,
            "book_name": "图书_2",
            "book_price": 20,
            "book_count": 100,
            "borrowers": [
                {
                    "name": "张三",
                    "id": 100
                }
            ],
            "library_id": 2
        },
        {
            "_id": 3,
            "book_type": 1,
            "book_name": "图书_3",
            "book_price": 30,
            "book_count": 100,
            "borrowers": [
                {
                    "name": "张三",
                    "id": 100
                },
                {
                    "name": "王五",
                    "id": 300
                }
            ],
            "library_id": 2
        }
    ]
}

// 3
{
    "_id": 3,
    "library_name": "图书馆_3",
    "books_info": [ ]
}

\

新增价格大于20查询条件

db.library.aggregate([
    {
        $lookup: 
        {
            from: "book",
            let: {
                "lid": "$_id"
            },
            pipeline: [
                {
                    $match: {
                        $expr: {
                            $and: [
                                {
                                    $eq: ["$$lid", "$library_id"]
                                },
                                {
                                    $gt: ["$book_price", 20]
                                }
                            ]
                        }
                    }
                }
            ],
            as: "books_info"
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 1,
    "library_name": "图书馆_1",
    "books_info": [ ]
}

// 2
{
    "_id": 2,
    "library_name": "图书馆_2",
    "books_info": [
        {
            "_id": 3,
            "book_type": 1,
            "book_name": "图书_3",
            "book_price": 30,
            "book_count": 100,
            "borrowers": [
                {
                    "name": "张三",
                    "id": 100
                },
                {
                    "name": "王五",
                    "id": 300
                }
            ],
            "library_id": 2
        }
    ]
}

// 3
{
    "_id": 3,
    "library_name": "图书馆_3",
    "books_info": [ ]
}

\

3.9 group

3.9.1 简单统计

_id:图书类型作为分组键

count:每个类型有多少种书

db.book.aggregate([
    {
        $group: {
            _id: "$book_type",
            count: {
                $sum: 1
            }
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 2,
    "count": 1
}

// 2
{
    "_id": 1,
    "count": 2
}

\

3.9.2 复杂统计

_id:图书类型作为分组键

type_count:每个类型有多少种书

type_book_count:每个类型有多少本书

minTotalPrice:每个类型总价最小值

maxTotalPrice:每个类型总价最大值

totalPrice:每个类型总价

avgPrice:每个类型平均价

db.book.aggregate([
    {
        $group: {
            _id: "$book_type",
            type_count: {
                $sum: 1
            },
            type_book_count: {
                $sum: "$book_count"
            },
            minTotalPrice: {
                $min: {
                    $multiply: ["$book_price", "$book_count"]
                }
            },
            maxTotalPrice: {
                $max: {
                    $multiply: ["$book_price", "$book_count"]
                }
            },
            totalPrice: {
                $sum: {
                    $multiply: ["$book_price", "$book_count"]
                }
            },
            avgPrice: {
                $avg: "$book_price"
            }
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": 2,
    "type_count": 1,
    "type_book_count": 100,
    "minTotalPrice": 2000,
    "maxTotalPrice": 2000,
    "totalPrice": 2000,
    "avgPrice": 20
}

// 2
{
    "_id": 1,
    "type_count": 2,
    "type_book_count": 200,
    "minTotalPrice": 1000,
    "maxTotalPrice": 3000,
    "totalPrice": 4000,
    "avgPrice": 20
}

\

3.9.3 空分组键

_id:空分组键表示统计全量数据

db.book.aggregate([
    {
        $group: {
            _id: null,
            type_count: {
                $sum: 1
            },
            type_book_count: {
                $sum: "$book_count"
            },
            minTotalPrice: {
                $min: {
                    $multiply: ["$book_price", "$book_count"]
                }
            },
            maxTotalPrice: {
                $max: {
                    $multiply: ["$book_price", "$book_count"]
                }
            },
            totalPrice: {
                $sum: {
                    $multiply: ["$book_price", "$book_count"]
                }
            },
            avgPrice: {
                $avg: "$book_price"
            }
        }
    }
])

---------------------------------------------------------

// 1
{
    "_id": null,
    "type_count": 3,
    "type_book_count": 300,
    "minTotalPrice": 1000,
    "maxTotalPrice": 3000,
    "totalPrice": 6000,
    "avgPrice": 20
}

\

4 文章总结

第一介绍了MongoDB与关系型数据库关系,并且准备本文需要的测试数据,第二介绍了聚合语法和聚合管道相关概念,第三通过实例介绍了如何使用聚合操作,希望本文对大家有所帮助。