近期我司的產品有個局點出現打開頁面刷新出數據很慢的情況網站日志在線分析工具,經過排查發現通過ps -ef查看服務器的MySQL進程CPU占用率很高,比正常情況下高出很多,因此就懷疑是MySQL數據庫的某個查詢操作存在性能問題,接下來想當然的就是去分析MySQL的慢日志了。
開啟慢日志
在MySQL命令模式下,執行下面的命令查看是否已經開啟慢日志及日志存放路徑
SHOW LIKE '%';
-- 執行結果示例,ON表示已開啟慢日志
mysql> SHOW LIKE '%';
+---------------------+-------------------------------+
| | Value |
+---------------------+-------------------------------+
| | ON |
| | /app//mysql/db/slow.log |
+---------------------+-------------------------------+
2 rows in set (0.02 sec)
下面的命令查詢響應時間閾值網站日志在線分析工具,超過設置時間的sql都會被記錄下來,默認值是5秒。
show like '';
mysql> show like '';
+-----------------+----------+
| | Value |
+-----------------+----------+
| | 5. |
+-----------------+----------+
1 row in set (0.02 sec)
在未開啟慢日志的情況下使用下面的命令開啟慢日志
SET =ON;
安裝慢日志分析工具pt-query-
MySQL記錄的慢日志信息會寫入到slow.log文件中,但是直接查看這個文件很不直觀,可以使用pt-query-工具來生成分析報告
# 1. 針對CentOS系統的安裝命令
# 2. 如果提示找不到源,可以到官網下載rpm包手動安裝,另外,還需要安裝相應的依賴
# 3. 網管鏈接:https://www.percona.com/doc/percona-toolkit/LATEST/pt-query-digest.html
yum install percona-toolkit
生成慢日志分析報告
安裝成功之后,執行pt-query- /xxx/xx/slow.log即可生成慢日志分析報告,報告類似于中的AWR報告,完整的分析報告內容參考如下:
[root@xxx-custom-node-07 tools]# pt-query-digest slow.log
...
# ============分析slow.log的執行時間等信息==============
# 730ms user time, 40ms system time, 29.61M rss, 242.69M vsz
# Current date: Mon Jan 10 09:16:35 2022
# Hostname: xxx-custom-node-07
# Files: slow.log
# ================總體統計結果=====================
# Overall: 3.83k total, 2 unique, 0.01 QPS, 0.10x concurrency ____________
# Time range: 2022-01-04T09:05:19 to 2022-01-07T09:10:09
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 25315s 6s 10s 7s 8s 968ms 6s
# Lock time 449ms 0 235us 117us 152us 26us 119us
# Rows sent 0 0 0 0 0 0 0
# Rows examine 40.83G 0 10.92M 10.92M 10.76M 178.10k 10.76M
# Query size 760.73k 6 204 203.39 202.40 4.92 202.40
# ==========查詢分組統計結果,按照查詢時間降序排列==============
# Profile
# Rank Query ID Response time Calls R/Call V
# ==== =================================== ================ ===== ====== =
# 1 0xA6DCD29CE12E34BF78DBFC7612676B48 25309.1943 10... 3829 6.6099 0.14 SELECT sa_asset sa_asset_extra# MISC 0xMISC 5.9563 0.0% 1 5.9563 0.0 <1 ITEMS>
#================每種查詢的詳細統計結果================
# Query 1: 0.01 QPS, 0.10x concurrency, ID 0xA6DCD29CE12E34BF78DBFC7612676B48 at byte 976554
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.14
# Time range: 2022-01-04T09:05:19 to 2022-01-07T09:10:09
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 99 3829
# Exec time 99 25309s 6s 10s 7s 8s 968ms 6s
# Lock time 100 449ms 57us 235us 117us 152us 26us 119us
# Rows sent 0 0 0 0 0 0 0 0
# Rows examine 100 40.83G 10.92M 10.92M 10.92M 10.76M 0 10.76M
# Query size 99 760.72k 198 204 203.44 202.40 3 202.40
# String:
# Databases siem
# Hosts 134.175.220.186
# Users taishi
# Query_time distribution 查詢時間直方圖分布
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+ #
# Tables
# SHOW TABLE STATUS FROM `xxx` LIKE 'sa_asset'\G
# SHOW CREATE TABLE `xxx`.`sa_asset`\G
# SHOW TABLE STATUS FROM `xxx` LIKE 'sa_asset_extra'\G
# SHOW CREATE TABLE `xxx`.`sa_asset_extra`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT
t1.asset_id
FROM
sa_asset t join sa_asset_extra t1 on t1.asset_id = t.assetid
WHERE
t1.param_id in (1,2,3,4,5) and isdelete=0 and
t1.param_value like
CONCAT('%','xx.xx.xx.xx','%')\G
從分析結果可以輕易地看出Query ID為這個查詢sql存在明顯的性能問題,Rows 掃描行所消耗空間有40.83G,響應時間,經過分析SQL涉及的表發現表的總量有千萬級,執行計劃是全表掃描。找到了問題的原因,優化的辦法就很直觀了,總的思路就是優化執行計劃使SQL走索引。