MYSQL之慢查询分析(Analysis of Slow MySQL Query)
MySQL之慢查询分析
阅读本文可以了解关于MYSQL慢查询分析的两个工具:mysqldumpslow 和 pt-query-digest,掌握这两个工具的使用有助于帮助我们诊断和优化数据库性能问题。
01
概述
-
mysqldumpslow 是 MySQL 自带的一个用于分析慢查询日志的工具;
-
pt-query-digest 是 Percona Toolkit 中的一个重要工具;
-
两者主要都用于分析 MySQL 数据库的慢查询日志,帮助DBA诊断执行效率低下的SQL,从而优化数据库性能。
02
用法和选项
-
mysqldumpslow 工具选项如下:
mysqldumpslow --helpUsage: mysqldumpslow [ OPTS... ] [ LOGS... ]Parse and summarize the MySQL slow query log. Options are--verbose verbose--debug debug--help write this text to standard output-v verbose-d debug-s ORDER what to sort by (al, at, ar, c, l, r, t), 'at' is defaultal: average lock timear: average rows sentat: average query timec: countl: lock timer: rows sentt: query time-r reverse the sort order (largest last instead of first)-t NUM just show the top n queries-a don't abstract all numbers to N and strings to 'S'-n NUM abstract numbers with at least n digits within names-g PATTERN grep: only consider stmts that include this string-h HOSTNAME hostname of db server for *-slow.log filename (can be wildcard),default is '*', i.e. match all-i NAME name of server instance (if using mysql.server startup script)-l don't subtract lock time from total time
-
pt-query-digest 工具的选项如下:
pt-query-digest --helppt-query-digest analyzes MySQL queries from slow, general, and binary log files.It can also analyze queries from C<SHOW PROCESSLIST> and MySQL protocol datafrom tcpdump. By default, queries are grouped by fingerprint and reported indescending order of query time (i.e. the slowest queries first). If no C<FILES>are given, the tool reads C<STDIN>. The optional C<DSN> is used for certainoptions like L<"--since"> and L<"--until">. For more details, please use the--help option, or try 'perldoc/usr/local/percona-toolkit-3.5.4/bin/pt-query-digest' for completedocumentation.Usage: pt-query-digest [OPTIONS] [FILES] [DSN]Options:--ask-pass Prompt for a password when connecting to MySQL--attribute-aliases=a List of attribute|alias, etc (default db|Schema)--attribute-value-limit=i A sanity limit for attribute values (default 0)--charset=s -A Default character set--config=A Read this comma-separated list of config files;if specified, this must be the first option onthe command line--[no]continue-on-error Continue parsing even if there is an error (default yes)--[no]create-history-table Create the --history table if it does not exist (default yes)--[no]create-review-table Create the --review table if it does not exist (default yes)--daemonize Fork to the background and detach from the shell--database=s -D Connect to this database--defaults-file=s -F Only read mysql options from the given file--embedded-attributes=a Two Perl regex patterns to capture pseudo-attributes embedded in queries--expected-range=a Explain items when there are more or fewer thanexpected (default 5,10)--explain=d Run EXPLAIN for the sample query with this DSNand print results--filter=s Discard events for which this Perl code doesn'treturn true--group-by=A Which attribute of the events to group by (default fingerprint)--help Show help and exit--history=d Save metrics for each query class in the giventable. pt-query-digest saves query metrics (query time, lock time, etc.) to this table soyou can see how query classes change over time--host=s -h Connect to host--ignore-attributes=a Do not aggregate these attributes (default arg,cmd, insert_id, ip, port, Thread_id, timestamp,exptime, flags, key, res, val, server_id,offset, end_log_pos, Xid)--inherit-attributes=a If missing, inherit these attributes from thelast event that had them (default db,ts)--interval=f How frequently to poll the processlist, inseconds (default .1)--iterations=i How many times to iterate through the collect-and-report cycle (default 1)--limit=A Limit output to the given percentage or count (default 95%:20)--log=s Print all output to this file when daemonized--max-hostname-length=i Trim host names in reports to this length. 0=Donot trim host names (default 10)--max-line-length=i Trim lines to this length. 0=Do not trim lines (default 74)--order-by=A Sort events by this attribute and aggregatefunction (default Query_time:sum)--outliers=a Report outliers by attribute:percentile:count (default Query_time:1:10)--output=s How to format and print the query analysisresults (default report)--password=s -p Password to use when connecting--pid=s Create the given PID file--port=i -P Port number to use for connection--preserve-embedded-numbers Preserve numbers in database/table names whenfingerprinting queries--processlist=d Poll this DSN's processlist for queries, with --interval sleep between--progress=a Print progress reports to STDERR (default time,30)--read-timeout=m Wait this long for an event from the input; 0 towait forever (default 0). Optional suffix s=seconds, m=minutes, h=hours, d=days; if nosuffix, s is used.--[no]report Print query analysis reports for each --group-byattribute (default yes)--report-all Report all queries, even ones that have beenreviewed--report-format=A Print these sections of the query analysisreport (default rusage,date,hostname,files,header,profile,query_report,prepared)--report-histogram=s Chart the distribution of this attribute'svalues (default Query_time)--resume=s If specified, the tool writes the last fileoffset, if there is one, to the given filename--review=d Save query classes for later review, and don'treport already reviewed classes--run-time=m How long to run for each --iterations. Optionalsuffix s=seconds, m=minutes, h=hours, d=days; ifno suffix, s is used.--run-time-mode=s Set what the value of --run-time operates on (default clock)--sample=i Filter out all but the first N occurrences ofeach query--set-vars=A Set the MySQL variables in this comma-separatedlist of variable=value pairs--show-all=H Show all values for these attributes--since=s Parse only queries newer than this value (parsequeries since this date)--slave-password=s Sets the password to be used to connect to theslaves--slave-user=s Sets the user to be used to connect to the slaves--socket=s -S Socket file to use for connection--timeline Show a timeline of events--type=A The type of input to parse (default slowlog)--until=s Parse only queries older than this value (parsequeries until this date)--user=s -u User for login if not current user--variations=A Report the number of variations in theseattributes' values--version Show version and exit--[no]version-check Check for the latest version of Percona Toolkit,MySQL, and other programs (default yes)--[no]vertical-format Output a trailing "\G" in the reported SQLqueries (default yes)--watch-server=s This option tells pt-query-digest which serverIP address and port (like "10.0.0.1:3306") towatch when parsing tcpdump (for --type tcpdump);all other servers are ignoredOption types: s=string, i=integer, f=float, h/H/a/A=comma-separated list, d=DSN, z=size, m=timeRules:This tool accepts additional command-line arguments. Refer to the SYNOPSIS and usage information for details.DSN syntax is key=value[,key=value...] Allowable DSN keys:KEY COPY MEANING=== ==== =============================================A yes Default character setD yes Default database to use when connecting to MySQLF yes Only read default options from the given fileP yes Port number to use for connectionS yes Socket file to use for connectionh yes Connect to hostp yes Password to use when connectingt no The --review or --history tableu yes User for login if not current userIf the DSN is a bareword, the word is treated as the 'h' key.Options and values after processing arguments:--ask-pass FALSE--attribute-aliases db|Schema--attribute-value-limit 0--charset (No value)--config /etc/percona-toolkit/percona-toolkit.conf,/etc/percona-toolkit/pt-query-digest.conf,/home/mysql/.percona-toolkit.conf,/home/mysql/.pt-query-digest.conf--continue-on-error TRUE--create-history-table TRUE--create-review-table TRUE--daemonize FALSE--database (No value)--defaults-file (No value)--embedded-attributes (No value)--expected-range 5,10--explain (No value)--filter (No value)--group-by fingerprint--help TRUE--history (No value)--host (No value)--ignore-attributes arg,cmd,insert_id,ip,port,Thread_id,timestamp,exptime,flags,key,res,val,server_id,offset,end_log_pos,Xid--inherit-attributes db,ts--interval .1--iterations 1--limit 95%:20--log (No value)--max-hostname-length 10--max-line-length 74--order-by Query_time:sum--outliers Query_time:1:10--output report--password (No value)--pid (No value)--port (No value)--preserve-embedded-numbers FALSE--processlist (No value)--progress time,30--read-timeout 0--report TRUE--report-all FALSE--report-format rusage,date,hostname,files,header,profile,query_report,prepared--report-histogram Query_time--resume (No value)--review (No value)--run-time (No value)--run-time-mode clock--sample (No value)--set-vars--show-all--since (No value)--slave-password (No value)--slave-user (No value)--socket (No value)--timeline FALSE--type slowlog--until (No value)--user (No value)--variations--version FALSE--version-check TRUE--vertical-format TRUE--watch-server (No value)
03
常用技巧
-
mysqldump 常用技巧如下:
-- 预处理慢查询日志,输出指定时间段的日志sed -n '/# Time: 241011 15:44:.../,/# Time: 241011 17:30:.../'p /data/mysql/slow.log-- 输出记录次数最多的10条sql语句mysqldumpslow -s -c -t 10 /data/mysql/slow.log-- 得到返回记录集最多的10条sql语句mysqldumpslow -s -r -t 10 /data/mysql/slow.log-- 得到按照时间排序的前10条里边含有左连接的查询语句mysqldumpslow -s t -t 10 -g "left join" /data/mysql/slow.log参数解析:-s参数:表示按照何种方式排序c、t、l、r 按照记录次数、时间、查询时间、返回的记录数来排序-t参数:即为返回前面多少条数据-g参数:可以写正则匹配模式,大小写不敏感-a参数:说明不合并类似的sql语句,显示具体的sql语句中的数字和字符串-- 输出结果会使用N和S代替SQL中出现的数字和字符串。
-
pt-query-digest 常用技巧如下:
-- 直接分析慢日志文件pt-query-digest [参数] /mysql/log/slow.log > slow_report.log-- 分析最近12h内的查询pt-query-digest --since=12h /mysql/log/slow.log > slow_report.log-- 分析指定时间范围内的查询pt-query-digest --since '2024-10-01 10:00:00' --until '2024-10-02 10:00:00' /mysql/log/slow.log > slow_report.log-- 分析只含有select语句的慢查询pt-query-digest --filter '($event->{fingerprint} || "") =~ m/^select/i' mysql-slow.log > slow_report.log-- 针对某个用户的慢查询pt-query-digest --filter '($event->{user} || "") =~ m/^root/i' mysql-slow.log > slow_report.log-- 查询所有的全表扫描或full join慢查询pt-query-digest --filter '((($event->{Full_scan} || "") eq "yes") || ($event->{Full_join} || "") eq "yes")' mysql-slow.log > slow_report.log-- 把查询结果保存到slowlog_review表pt-query-digest --user=root --password=root --review h=localhost,D=slowlog,t=slowlog_review --create-review-table /mysql/log/slow.log-- 通过tcpdump抓取mysql的tcp协议数据后再用工具分析tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 > mysql.tcp.txtpt-query-digest --type tcpdump mysql.tcp.txt > slow_report.log-- 分析binlogmysqlbinlog mysql-bin.000001 > mysql-bin000001.sqlpt-query-digest --type=binlog mysql-bin000001.sql > slow_report.log-- 分析general.logpt-query-digest --type=genlog general.log > slow_report.log
04
报告解读
-
mysqldumpslow 示例解读如下:
mysqldumpslow -s -c -t 1 slow.txt-- 执行返回结果如下Reading mysql slow query log from slow.txtCount: 1 Time=13.42s (13s) Lock=0.48s (0s) Rows=2.0 (2), root[root]@localhostSELECT * FROM app.bigtableCount:1 代表在解析的日志文件中总共出现了1次;Time=13.42s (13s) 代表该查询累计总耗时13.42s,括号内的13s代表平均时间;Lock=0.48s (0s) 代表这个查询在等待锁上花费了0.48秒的时间,括号内的0s表示平均等待锁的时间是0秒;Rows=20000000.0 (20000000) 代表这个查询返回了20000000条数据,括号内的20000000表示平均每次查询返回的行数也是20000000;root[root]@localhost 代表这个查询是由用户名为 root 的用户在 localhost 主机上执行的。
-
pt-query-digest 分三部分解读,示例如下:
-
第一部分:
-- 用户时间,系统时间,物理内存占用大小,虚拟内存占用大小# 5.6s user time, 280ms system time, 37.48M rss, 122.02M vsz-- 执行工具的时间# Current date: Fri Oct 11 10:17:49 2024-- 主机名# Hostname: HOST01-- 被分析的文件名字# Files: /tmp/slow_statement-20241011.log-- 语句总数量,唯一的语句数量,QPS,并发数# Overall: 10.93k total, 76 unique, 1.02 QPS, 0.02x concurrency __________-- 日志记录时间范围# Time range: 2024-10-11T06:30:43 to 2024-10-11T09:29:59-- 属性 总计 最小 最大 平均 95% 标准 中位数# Attribute total min max avg 95% stddev median# ============ ======= ======= ======= ======= ======= ======= =======# Exec time 245s 82us 30s 22ms 925us 511ms 301us# Lock time 119ms 0 22ms 10us 4us 330us 1us# Rows sent 10.67k 0 897 1.00 3.89 12.07 0.99# Rows examine 13.84M 0 167.37k 1.30k 76.28 10.49k 76.28# Query size 6.17M 42 34.24k 591.87 719.66 413.55 719.66注意:unique 唯一查询数量指的是对查询条件进行参数化以后,总共有多少个不同的查询;95% 指的是所有值从小到大排序,位置位于95%的数median 中位数指的是把所有值从小到大排序,位于中间的数
-
第二部分:
# Profile# Rank Query ID Response time Calls R/Call V/M# ==== =================================== ============== ===== ====== ===# 1 0x64EF0EA126730002088884A136067321 211.0015 86.3% 180 1.1722 12.35# 2 0xE00773AA9237B878155458CDB3CF7750 5.2254 2.1% 28 0.1866 0.00 UPDATE aa# 3 0x3BD914AE329298E799F07AF97F53D5EA 5.1410 2.1% 21 0.2448 0.02 SELECT bb# 4 0x95FAE319F439A358514176137A666198 3.1459 1.3% 7777 0.0004 0.00 SELECT cc# 5 0x934ED5560D4E81AC2C4FCCB4573C899E 3.0805 1.3% 14 0.2200 0.00 SELECT dd# 6 0xD2E066E9DDBCF5016F972A3A6896C311 2.3942 1.0% 1 2.3942 0.00 INSERT SELECT ee# 7 0xD54F51F8D170B47CF1F0BD7262EE63A2 2.3062 0.9% 9 0.2562 0.00 UPDATE ff# 8 0xF0862C304387BB4569628FA57B607BC2 1.7430 0.7% 9 0.1937 0.00 SELECT gg# MISC 0xMISC 10.6013 4.3% 2893 0.0037 0.0 <68 ITEMS>这部分对查询进行参数化并分组,然后对各类查询的执行情况进行分析,结果按总执行时长,从大到小排序。response:总响应时间time:该查询在本次分析中总的时间占比。Calls:执行次数,即本次分析总共有多少条这种类型的查询语句。R/Call:平均每次执行的响应时间。V/M:响应时间Variance-to-mean的比率。Item:查询对象。
-
第三部分:
列出第一个查询的详细统计结果,列出执行次数、最大、最小、平均、95%、标准、中位数的统计。# Query 1: 0.02 QPS, 0.02x concurrency, ID 0x64EF0EA126730002088884A136067321 at byte 8739087# Scores: V/M = 12.35# Time range: 2024-10-11T06:30:43 to 2024-10-11T09:29:47# Attribute pct total min max avg 95% stddev median# ============ === ======= ======= ======= ======= ======= ======= =======# Count 1 180# Exec time 86 211s 9ms 30s 1s 8s 4s 30ms# Lock time 31 37ms 34us 1ms 207us 881us 238us 131us# Rows sent 1 165 0 1 0.92 0.99 0.27 0.99# Rows examine 0 167 0 1 0.93 0.99 0.26 0.99# Query size 0 10.55k 60 60 60 60 0 60# String:# Databases aadb... (158/87%), bbdb (11/6%)... 2 more# Query_time distribution# 1us# 10us# 100us# 1ms ## 10ms ################################################################# 100ms ###### 1s ######## 10s+ ##throttle: 246 'index not used' warning(s) suppressed.\G...Exec time:表示查询的执行时间;Lock time:表示查询在等待锁的时间。Rows sent:表示查询返回的行数。Rows examined:表示查询扫描的行数。Rows affected:表示查询影响的行数。Bytes sent:表示查询发送的字节数。Query size:表示查询的大小。Query_time distribution:查询时间的分布,可以看到这个SQL查询执行时间都是10秒以上。Tables:该SQL查询涉及的表。EXPLAIN:查询的SQL语句。结合上述解析,我们可以解析上述这个案例:QUERY1占所有查询响应时间的86.3%,平均响应时间要1s以上;这个查询涉及到aadb和bbdb两个数据库,其中aadb占比较高;QUERY1中提示throttle: 246 'index not used' warning(s) suppressed.\G,代表检测到246次未使用索引的记录;说明数据库中存在较多未使用索引的sql,建议为对应的业务sql创建合理的索引,以提高数据库性能。
