分组

134
2023/07/29 14:48:24

本章将讨论 PostgreSQL 的 GROUP BY 子句:将SELECT 语句返回的结果进行分组;对于每个分组,可以使用聚合函数。同时讨论与之相关的 HAVINGCUBEROLLUP 等。

GROUP BY

基本语法:

SELECT
	字段1,
	字段2,
	...,
	聚合函数(字段3)
FROM
	表名
GROUP BY
	字段1,
	字段2,
	...
HAVING
	条件;

注意:数据库服务器执行的顺序是:

FROM -> WHERE -> GROUP BY -> HAVING -> SELECT -> DISTINCT -> ORDER BY -> LIMIT

-- 没用使用聚合函数:相当于 SELECT DISTINCT
SELECT customer_id FROM payment GROUP BY customer_id;

-- 使用聚合函数 SUM():求和
SELECT customer_id, SUM(amount) FROM payment GROUP BY customer_id;

-- 使用聚合函数 SUM()求和并排序
SELECT customer_id, SUM(amount) FROM payment GROUP BY customer_id ORDER BY SUM(amount) DESC;

-- 与 INNER JOIN 一起使用
SELECT
	first_name || ' ' || last_name AS full_name,
	SUM (amount) AS amount
FROM
	payment
INNER JOIN customer USING (customer_id)    	
GROUP BY
	full_name
ORDER BY amount DESC;

-- 使用聚合函数:COUNT()统计
SELECT staff_id, COUNT(payment_id) FROM payment GROUP BY staff_id;

如果两张表连接的字段名相同,可以使用 USING 来简化 ON 的写法

HAVING

HAVINGSELECT之前执行,所以不能在 HAVING 中使用 SELECT 指定的字段别名。

WHERE 用于数据行,HAVING 用于分组行。

-- 选择支出超过200的客户
SELECT
	customer_id,
	SUM (amount)
FROM
	payment
GROUP BY
	customer_id
HAVING
	SUM (amount) > 200;
	
-- 选择超过300名客户的门店
SELECT
	store_id,
	COUNT (customer_id)
FROM
	customer
GROUP BY
	store_id
HAVING
	COUNT (customer_id) > 300;

CUBE

CUBEGROUP BY 的子句,用于生成多个分组集。

CUBE(c1, c2, c3) 将生成所有八个可能的分组集:

(c1, c2, c3)
(c1, c2)
(c2, c3)
(c1,c3)
(c1)
(c2)
(c3)
()

示例数据:

DROP TABLE IF EXISTS sales;
CREATE TABLE sales (
    brand VARCHAR NOT NULL,
    segment VARCHAR NOT NULL,
    quantity INT NOT NULL,
    PRIMARY KEY (brand, segment)
);

INSERT INTO sales (brand, segment, quantity)
VALUES
    ('ABC', 'Premium', 100),
    ('ABC', 'Basic', 200),
    ('XYZ', 'Premium', 100),
    ('XYZ', 'Basic', 300);

示例:

-- 示例1
SELECT
    brand,
    segment,
    SUM (quantity)
FROM
    sales
GROUP BY
    CUBE (brand, segment)
ORDER BY
    brand,
    segment;

-- 示例2
SELECT
    brand,
    segment,
    SUM (quantity)
FROM
    sales
GROUP BY
    brand,
    CUBE (segment)
ORDER BY
    brand,
    segment;

ROLLUP

CUBE 不同,ROLLUP 不会根据指定的字段生成所有可能的分组集,它只是其中一个子集。

它假定输入字段之间存在层次结构,并生成考虑层次结构有意义的所有分组集。所以它经常用来生成报表的小计/总计。

ROLLUP(c1, c2, c3) 仅生成4个分组集,并假设层次结构是:c1 > c2 > c3

(c1, c2, c3)
(c1, c2)
(c1)
()

示例数据:

DROP TABLE IF EXISTS sales;
CREATE TABLE sales (
    brand VARCHAR NOT NULL,
    segment VARCHAR NOT NULL,
    quantity INT NOT NULL,
    PRIMARY KEY (brand, segment)
);

INSERT INTO sales (brand, segment, quantity)
VALUES
    ('ABC', 'Premium', 100),
    ('ABC', 'Basic', 200),
    ('XYZ', 'Premium', 100),
    ('XYZ', 'Basic', 300);

示例:

-- 查找按品牌小计以及所有品牌和市场总计销售产品的数量
SELECT
    brand,
    segment,
    SUM (quantity)
FROM
    sales
GROUP BY
    ROLLUP (brand, segment)
ORDER BY
    brand,
    segment;

-- 查找每天、每月、和每年的租赁次数
SELECT
    EXTRACT (YEAR FROM rental_date) y,
    EXTRACT (MONTH FROM rental_date) M,
    EXTRACT (DAY FROM rental_date) d,
    COUNT (rental_id)
FROM
    rental
GROUP BY
    ROLLUP (
        EXTRACT (YEAR FROM rental_date),
        EXTRACT (MONTH FROM rental_date),
        EXTRACT (DAY FROM rental_date)
    );

EXTRACT () 用于从日期时间中,检索年、月、日等字段