likes
comments
collection
share

Mysql-一些SQL优化的方式

作者站长头像
站长
· 阅读数 27

本人对mysql的理解不深,前一整子公司有mysql相关的分享,学到了一些东西,加上业务上有一些慢sql优化,所以做了以下的总结

准备工作

首先准备两张表,account表与user表,两表可根据account_id联合查询,两张表各插入10000条数据

CREATE TABLE `account` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `email` varchar(100) DEFAULT NULL,
  `mobile_area_code` varchar(16) DEFAULT NULL,
  `mobile` varchar(64) DEFAULT NULL,
  `password` varchar(64) DEFAULT NULL,
  `create_time` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_password` (`password`),
  KEY `idx_key_part` (`mobile`,`mobile_area_code`),
  KEY `idx_mobile_area_code_create_time`(`mobile_area_code`, `create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;


CREATE TABLE `user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `account_id` int(11) DEFAULT NULL,
  `nick_name` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_account_id` (`account_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;
delimiter $$
CREATE PROCEDURE proc_initData4()
BEGIN
    declare i int;
    set i = 1;
    set @dt='2023-06-01 00:00:00';
    WHILE i<=1000000 DO
        if i % 2 = 0 then
        INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '86', uuid(), uuid(), date_add(@dt, interval i minute));
        else
        INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '96', uuid(), uuid(), date_add(@dt, interval i minute));
        end if;
        SET i = i+1;
    END WHILE;
end $$
CALL proc_initData4();


delimiter $$
CREATE PROCEDURE proc_initData3()
BEGIN
    declare i int;
    set i = 1;
    WHILE i<=1000000 DO
       INSERT INTO user (id, `account_id`, `nick_name`) VALUES (i, i, uuid());
        SET i = i+1;
    END WHILE;
end $$
CALL proc_initData3();

explain

先看看explain中有什么关键字

mysql> explain select * from account;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9986 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
列名作用
id在一个查询中每个select关键字都对应一个唯一的id
select_typeselect关键字对应的查询类型
table表名
type每个select查询中针对该表的访问方法
possible_keys可能用到的索引
key实际用到的索引
key_len实际用到的索引长度
ref当使用索引列等值查询时,与索引列等值匹配的对象信息
rows预计需要读取的记录数
filtered按搜索条件过滤后剩余条数的占比
Extra一些额外信息
id

id值如果相同,可以认为是一组,从上往下顺序执行;在所有组中,id值越大,优先级越高,越先执行

如下图,id值相同,说明是同一组,从上往下顺序执行,出现在前面的是驱动表,出现在后面的是被驱动表,查询优化器准备让account表作为驱动表,user表作为被驱动表执行查询

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys  | key            | key_len | ref       | rows   | filtered | Extra |
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
|  1 | SIMPLE      | a     | NULL       | ALL  | PRIMARY        | NULL           | NULL    | NULL      | 990675 |   100.00 | NULL  |
|  1 | SIMPLE      | u     | NULL       | ref  | idx_account_id | idx_account_id | 5       | test.a.id |      1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
2 rows in set, 1 warning (0.07 sec)

如下图,语句包含子查询,id值不同,值越大,优先级越高,越先执行,所以是先执行子查询,再执行account表相关的查询

mysql> explain select * from account where id in (select account_id from user where id > 9800) or password > 'a4c9429f-d81d';
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys      | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
|  1 | PRIMARY     | account | NULL       | ALL   | idx_password       | NULL    | NULL    | NULL | 9986 |   100.00 | Using where |
|  2 | SUBQUERY    | user    | NULL       | range | PRIMARY,account_id | PRIMARY | 4       | NULL |  200 |   100.00 | Using where |
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

如下图,不是所有的子查询id值都会不同,mysql有一个查询优化器,可能会对语句进行优化重写,转换为连接查询

mysql> explain select * from account where id in (select account_id from user where id > 9800) and password > 'a4c9429f-d81d';
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| id | select_type  | table       | partitions | type   | possible_keys          | key                 | key_len | ref             | rows   | filtered | Extra       |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
|  1 | SIMPLE       | account     | NULL       | ALL    | PRIMARY,idx_password   | NULL                | NULL    | NULL            | 990675 |    50.00 | Using where |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_distinct_key>    | <auto_distinct_key> | 5       | test.account.id |      1 |   100.00 | NULL        |
|  2 | MATERIALIZED | user        | NULL       | range  | PRIMARY,idx_account_id | PRIMARY             | 4       | NULL            | 498167 |   100.00 | Using where |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
3 rows in set, 1 warning (0.05 sec)

如下图,union 会对多个查询的结果合并起来并进行去重,所以会新建一个<union1,2>的临时表,id为null表明这个临时表是为了合并查询结果集创建的

mysql> explain select * from account where id > 9900 union select * from account where id > 9901;
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
| id | select_type  | table      | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra           |
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
|  1 | PRIMARY      | account    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where     |
|  2 | UNION        | account    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where     |
| NULL | UNION RESULT | <union1,2> | NULL       | ALL   | NULL          | NULL    | NULL    | NULL |   NULL |     NULL | Using temporary |
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
3 rows in set, 1 warning (0.04 sec)

如下图,union all不会进行去重,自然也没有<union1,2>的临时表

mysql> explain select * from account where id > 9900 union all select * from account where id > 9901;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
|  1 | PRIMARY     | account | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where |
|  2 | UNION       | account | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
select _type

下表中,上面都有例子,就不赘述了

select_type意思
SIMPLE不包含union或者子查询
PRIMARY对于包含union、union all或者子查询的大查询来说,他是由几个小查询组成的,最左边的查询的select_type就是PRIMARY
UNION对于包含union、union all的大查询,最左边的查询的select_type是PRIMARY,其余的小查询就是union
UNION RESULT使用union查询后会进行去重
SUBQUERY包含子查询,并且该子查询不会被查询优化器转化成连接查询

下图中,先看id值,id值为2的语句先执行,id值为1的有两个,说明是连接查询,注意:<subquery2>最后的2表示这个表是由id值为2的语句衍生出来的表

mysql> explain select * from account where id in (select account_id from user where id > 9900);
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| id | select_type  | table       | partitions | type   | possible_keys          | key                 | key_len | ref             | rows   | filtered | Extra       |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
|  1 | SIMPLE       | account     | NULL       | ALL    | PRIMARY                | NULL                | NULL    | NULL            | 990675 |   100.00 | NULL        |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_distinct_key>    | <auto_distinct_key> | 5       | test.account.id |      1 |   100.00 | NULL        |
|  2 | MATERIALIZED | user        | NULL       | range  | PRIMARY,idx_account_id | PRIMARY             | 4       | NULL            | 498167 |   100.00 | Using where |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
type
const

如下图,const表示通过索引一次就找到了,当主键或者唯一二级索引与常量进行等值匹配时,type为const

mysql> explain select * from account where id = 16;
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.02 sec)
eq_ref

如下图,eq_ref常见于主键或者唯一二级索引扫描,下图被驱动表扫描主键

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | account_id    | NULL    | NULL    | NULL              | 9851 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.u.account_id |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)
ref

如下图,ref常见于普通二级索引与常量进行等值匹配

mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password  | idx_password | 403     | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
ref_or_null

如下图,ref_or_null常见于在ref的基础上,新增可为null的查询

mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76' or password is null;
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
| id | select_type | table   | partitions | type        | possible_keys | key          | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | account | NULL       | ref_or_null | idx_password  | idx_password | 403     | const |    2 |   100.00 | Using index condition |
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
range

范围查询,较为简单

index和All

如下图,index常见于索引树全表扫描,与All的聚簇索引全表扫描不同,index所扫描的文件较聚簇索引文件小

mysql> explain select mobile from account where mobile_area_code = '86';
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | account | NULL       | index | NULL          | idx_key_part | 470     | NULL | 9986 |    10.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
possible_keys和key

如下图,可能用到的索引为idx_password,idx_key_part,实际用到的索引为idx_password,用什么索引是mysql的查询优化器经过成本计算后决定的,所以数据库中没有用到的索引需要删除,不然查询优化器计算成本需要更多的时间

mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys             | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password,idx_key_part | idx_password | 403     | const |    1 |     5.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
ref

如下图,在访问方法为eq_ref时,与索引列做等值匹配的是user表的account_id

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | account_id    | NULL    | NULL    | NULL              | 9851 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.u.account_id |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

如下图,等值匹配的也可以是一个函数

mysql> explain select * from account a inner join user u on a.id = upper(u.account_id);
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | NULL          | NULL    | NULL    | NULL | 9851 |   100.00 | NULL        |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | func |    1 |   100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
rows和filtered

如下图,rows表示此次查询预计需要扫描的行数,filter表示预计需要扫描的行数中有多少行符合我们的条件,下图中9986条都符合我们的条件,所以filtered值应该越高越好

平时联表查询的时候,我们应该多关注filtered值,filtered越高,说明驱动表扫描出来的行数大部分都是有用的,回表的次数就会变少,语句才会更高效

mysql> explain select * from account;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9986 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
Extra
Using where

如下图,使用全表扫描来执行语句,where条件中有针对某个字段的查询

mysql> explain select * from user where nick_name = 'a';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |    10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

如下图,使用某个索引来执行语句,where条件中包含了该索引列不存在的字段

mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys             | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password,idx_key_part | idx_password | 403     | const |    1 |     5.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Using filesort

如下图,filesort表示在没有使用索引排序的情况下,在内存中(记录较少)或者在磁盘中(记录较多)进行排序

mysql> explain select * from user order by nick_name desc limit 10;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
Using temporary

如下图,mysql在无法使用索引的情况下,需要使用内部的临时表来执行语句,值得注意的是,以下语句extra中还有Using filesort,这是因为mysql会在group by后面加上order by

mysql> explain select nick_name, count(*) from user group by nick_name;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                           |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |   100.00 | Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)
Using index

如下图,语句中查询的字段某个索引可以满足,不需要回表

mysql> explain select mobile from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_key_part  | idx_key_part | 403     | const |    1 |   100.00 | Using index |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Using index condition

没有找到合适的例子

Using join buffer

如下图,没有添加索引联表查询的情况下,会使用join buffer来提高查询速度

mysql> explain select * from account a inner join user u on u.nick_name = a.password;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9047 |   100.00 | NULL                                               |
|  1 | SIMPLE      | u     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |    10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

常见的sql优化

加索引

大部分慢sql是没有加索引或索引没有加对导致,网上关于怎么加索引的文章非常多,我就不赘述了,只提以下两点

  1. 很多文章说区分度不大的字段不宜加索引,其实有点一棍子打死的嫌疑,比如上面的account表,mobile_area_code只有86、96两个值,假如百分之90的数据是96,百分之10的数据是86,而有一个业务场景恰恰是要查86相关的数据,这个时候索引的作用就很大了,所以加索引有一些规则可以参考,但还是要看业务具体实现
  2. 多用组合索引,组合索引可以提高索引复用率,覆盖索引可以减少回表
深度分页

业务上比较常见的问题,列表翻页到比较大的页面时接口会越来越慢,sql常会变成这样

select * from account order by password limit 5000, 10;

这时可用下面这条语句优化

select * from account where id >= (select id from account order by password limit 5000, 1) limit 10;

explain一下,可以看到第一条语句全表扫描了

mysql> explain select * from account order by password limit 5000, 10;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9975 |   100.00 | Using filesort |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+

那为什么使用第二条语句就能加快查询呢?

Mysql-一些SQL优化的方式

第一条语句排序所有记录,其实我们不需要这么多记录,第二条语句只拿了我们需要的记录,时间自然就少了

分页查询还有一个容易踩的坑,用于排序的字段必须是唯一的,使用主键或者唯一的业务单号,如果排序的字段不唯一(比如创建时间),分页的过程会出现少处理或者多处理数据,严重的话会陷在死循环中出不来

强制索引

我们应该经常会遇到sql语句走错索引的情况,比如下面这条sql,本来是走idx_account_id这个索引的,最终mysql决定走主键索引

当然,这只是一个例子,并不是说下面这条sql走idx_account_id索引更快

mysql> explain select * from user where account_id between 1000 and 10002000 order by id desc limit 10;
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key     | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | user  | NULL       | index | idx_account_id | PRIMARY | 4       | NULL |   20 |    50.00 | Using where; Backward index scan |
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
1 row in set, 1 warning (0.01 sec)

如果是在测试环境,可以使用optimizer trace 功能查看索引变更的原因,首先需要开启optimizer trace

SET optimizer_trace="enabled=on";
select * from information_schema.OPTIMIZER_TRACE;

比如上面的sql就是因为排序的原因导致索引变更

{
  "reconsidering_access_paths_for_index_ordering": {
    "clause": "ORDER BY",
    "steps": [
    ],
    "index_order_summary": {
      "table": "`user`",
      "index_provides_order": true,
      "order_direction": "desc",
      "index": "PRIMARY",
      "plan_changed": true,
      "access_type": "index"
    }
  }
}

线上一般不会开启optimizer trace 功能,测试环境的数据量又不够,所以这种分析方式比较局限

在平时的业务开发中,本人偶尔会遇到这种情况,A索引明明更快,结果走了B索引,如果是明确知道B索引更快,可以使用下面的语句强制走B索引

select * from user force index(idx_account_id) where account_id between 1000 and 10002000 order by id desc limit 10;

select /*+ INDEX(user idx_account_id) */ * from user where account_id between 1000 and 10002000 order by id desc limit 10;

强制索引的弊端

  1. 硬编码,不够优雅,代码需跟随索引名变更
  2. 滞后性,慢sql治理一般是在功能上线之后,强制索引需要重新测试、打包上线,如果是非常严重的慢sql,这种方式不够敏捷
巧用union

业务开发中常有获取不同状态位数据,然后根据创建时间排序的需求,形式大概如下

select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;

explain一下,发现全表扫描了,没有用上索引

mysql> explain select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
| id | select_type | table   | partitions | type | possible_keys                    | key  | key_len | ref  | rows   | filtered | Extra                       |
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
|  1 | SIMPLE      | account | NULL       | ALL  | idx_mobile_area_code_create_time | NULL | NULL    | NULL | 991309 |   100.00 | Using where; Using filesort |
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
1 row in set, 1 warning (0.02 sec)

为了用上idx_mobile_area_code_create_time组合索引,需先固定mobile_area_code,对二级索引进行反向扫描,得出以下语句,用上了索引,执行速度也快了许多

select *
from (
  (select * from account where mobile_area_code = '86' order by mobile_area_code, create_time desc limit 10)
     union all
  (select * from account where mobile_area_code = '96' order by mobile_area_code, create_time desc limit 10)
) t
order by mobile_area_code, create_time desc
limit 10
巧用虚拟列

还是上面的account表,如果想要统计7月1日10点到10日10点,创建时间在8点和20点的国内账号,按小时汇总

大部分的同学会这样写

select DATE_FORMAT(create_time, '%H') hour, count(1) cnt from account
 where mobile_area_code= '86'
 and create_time >= '2023-07-01 10:00:00'
 and create_time <= '2023-07-10 10:00:00'
GROUP BY hour HAVING hour  in ('08','20')

explain,用上了临时表

mysql> explain select DATE_FORMAT(create_time, '%H') hour ,count(1) cnt_order  from account  where mobile_area_code= '86'  and create_time >= '2023-07-01 10:00:00'  and create_time <= '2023-07-10 10:00:00' GROUP
BY hour HAVING hour  in ('08','20');
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
| id | select_type | table   | partitions | type  | possible_keys                    | key                              | key_len | ref  | rows | filtered | Extra                                     |
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
|  1 | SIMPLE      | account | NULL       | range | idx_mobile_area_code_create_time | idx_mobile_area_code_create_time | 72      | NULL | 6481 |   100.00 | Using where; Using index; Using temporary |
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
1 row in set, 1 warning (0.00 sec)

加个虚拟列和索引

alter table account add column hour_ char(2)  GENERATED ALWAYS AS (DATE_FORMAT(create_time, '%H')) VIRTUAL;
alter table account add key idx_area_code_hour_create_time_4(mobile_area_code, hour_, create_time);

改写sql,用上虚拟列索引

select hour_ hour,count(1) cnt from account
where hour_ in ('08','20')
 and mobile_area_code = '86'
 and create_time >= '2023-07-01 10:00:00'
 and create_time <= '2023-07-10 10:00:00'
group by  hour_;

explain一下,临时表没了,性能提升了一点

mysql> explain select hour_ hour,count(1) cnt from account where hour_ in ('08','20')  and mobile_area_code = '86'  and create_time >= '2023-07-01 10:00:00'  and create_time <= '2023-07-10 10:00:00' group by  hour_;
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys                                                     | key                              | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | account | NULL       | range | idx_mobile_area_code_create_time,idx_area_code_hour_create_time_4 | idx_area_code_hour_create_time_4 | 81      | NULL |  540 |   100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
模糊转精确

业务开发中模糊查询是较难处理的,但业务上确确实实有这样的需求,很难砍,对数据库的压力又很大,本人所在的业务一般采用下面两种方式

  1. 判断字段长度转精确,字段是支持模糊查询的,但用户输入的是精确的字段,比如订单号,很多用户是复制过来的,如果符合长度,直接精确查询,不符合长度,再进行模糊查询,可以减少一定量的慢sql语句
  2. 业务上要求用户输入前面5位或后5位字符串,后端再经过中间表转换为精确字符,再进行查询

虽然但是,模糊查询的处理跟业务强相关,本人业务的方式不一定适用所有

排序优化

根据极客时间Mysql45讲第16讲的例子,order by工作方式分为两种,全字段排序、rowid排序

CREATE TABLE `t` (
 `id` int(11) NOT NULL,
 `city` varchar(16) NOT NULL,
 `name` varchar(16) NOT NULL,
 `age` int(11) NOT NULL,
 `addr` varchar(128) DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `city` (`city`)
) ENGINE=InnoDB;

比如下面的查询语句

select city,name,age from t where city='杭州' order by name limit 1000;

根据sort_buffer的大小,如果sort_buffer足够大,会把需要返回的字段全部放进去,根据排序字段排完序之后返回

Mysql-一些SQL优化的方式 如果sort_buffer不够大,会把主键id和排序字段放进去,排完序之后再回表,查询到所有业务字段之后再返回

Mysql-一些SQL优化的方式

当然,sort_buffer不够大的情况还会使用到文件排序

按照这个原理,在业务上,经常有根据创建时间,更新时间排序的需求,如果根据创建时间来排序,sort_buffer要放多一个字段,有一个比较鸡贼的办法,创建时间的大小跟id是相差无几的,我们使用id来替换创建时间,当然,这个还是有一定的误差的,不同的业务创建时间的取法是不一样的,需要根据业务来评估

代码分治

将一条复杂sql拆分为多条sql,会提高代码复杂度,消耗测试资源,最好在写需求的时候就识别出来,说得容易,人都有惰性,大家懂的

总结

  1. 单表查询的情况下,慢sql一般都是没加索引或者没用上索引,比如隐式转换、函数计算、模糊查询,使用子查询
  2. 多表查询的情况下,慢sql优化主要思想就是分治,通过代码或者sql的方式尽量缩小范围、分批处理数据
转载自:https://juejin.cn/post/7345300528702324773
评论
请登录