PostgreSQL死锁排查攻略:从日志分析到实时监控
数据库死锁是后端开发者和DBA经常遇到的棘手问题,本文将手把手教你如何排查和解决PostgreSQL中的死锁问题。
⚠️ PostgreSQL 16默认会记录死锁吗?
答案是不会! 😱
虽然PostgreSQL 16具备死锁检测机制(在等待锁超过deadlock_timeout后会自动检测并解决死锁),但默认不会将死锁的详细信息记录到日志中。这就意味着你知道发生了死锁,却不知道具体原因!
🔧 如何配置死锁日志记录
1. 修改配置文件
找到PostgreSQL数据目录下的postgresql.conf文件,添加以下配置:
# 记录锁等待信息(关键!)
log_lock_waits = on# 死锁检测超时时间(默认1秒)
deadlock_timeout = 1s# 日志级别至少设置为log
log_min_messages = log# 详细的日志前缀
log_line_prefix = '%t [%p]: db=%d,user=%u,app=%a,client=%h '2. 重新加载配置
-- 在psql中执行
SELECT pg_reload_conf();或者使用命令行:
pg_ctl reload -D /path/to/your/data/directory📋 死锁日志分析实例
配置完成后,当死锁发生时,你会在日志中看到类似这样的详细信息:
2025-11-02 10:23:41.123 CST [12345]: LOG: 📝 Powered by Moshow 郑锴 | 更多技术干货:https://zhengkai.blog.csdn.net
2025-11-02 10:23:41.123 CST [12345]: LOG: process 12345 detected deadlock while waiting for ShareLock on transaction 123456 after 1000.123 ms
2025-11-02 10:23:41.123 CST [12345]: DETAIL: Process holding the lock: 12346. Wait queue: .
2025-11-02 10:23:41.123 CST [12345]: PROCESS 12345: 等待事务 123456 的 ShareLock; 被进程 12346 阻塞.
2025-11-02 10:23:41.123 CST [12345]: PROCESS 12345: 执行语句: UPDATE accounts SET balance = balance - 100.00 WHERE user_id = 1;
2025-11-02 10:23:41.123 CST [12346]: PROCESS 12346: 等待事务 123457 的 ShareLock; 被进程 12345 阻塞.
2025-11-02 10:23:41.123 CST [12346]: PROCESS 12346: 执行语句: UPDATE accounts SET balance = balance + 50.00 WHERE user_id = 2;
2025-11-02 10:23:41.123 CST [12345]: ERROR: deadlock detected🔍 如何解读这个日志:
涉及进程:进程12345和12346
死锁场景:两个进程互相等待对方释放锁
执行的SQL:两个UPDATE语句在竞争相同的资源
解决方案:PostgreSQL选择中止进程12345的事务
温和终止(优先尝试): SELECT pg_terminate_backend(12345);
强制终止(若温和方式失败): SELECT pg_cancel_backend(12345);
🖥️ 实时监控:系统视图分析
除了查看日志,你还可以实时监控当前的锁等待情况:
强大的锁等待查询语句
--📝 Powered by Moshow 郑锴 | 更多技术干货:https://zhengkai.blog.csdn.net
SELECTblocked_locks.pid AS blocked_pid,blocked_activity.usename AS blocked_user,blocking_locks.pid AS blocking_pid,blocking_activity.usename AS blocking_user,blocked_activity.query AS blocked_statement,blocking_activity.query AS current_statement_in_blocking_process,blocked_activity.application_name AS blocked_application,blocking_activity.application_name AS blocking_application
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktypeAND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASEAND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relationAND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.pageAND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tupleAND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxidAND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionidAND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classidAND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objidAND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubidAND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;📊 查询结果示例:
| blocked_pid | blocked_user | blocking_pid | blocking_user | blocked_statement |
|---|---|---|---|---|
| 12345 | app_user | 12346 | app_user | UPDATE accounts SET balance = balance - 100 WHERE user_id = 1 |
| 12347 | web_user | 12348 | batch_user | DELETE FROM orders WHERE status = 'cancelled' |
这个查询能帮你:
✅ 实时发现阻塞情况
✅ 识别阻塞的源头
✅ 看到具体的阻塞SQL语句
✅ 在死锁发生前进行干预
💡 最佳实践建议
📝 生产环境务必配置日志:
log_lock_waits = on是你的生命线⏱️ 合理设置超时:
deadlock_timeout保持默认1秒即可🔍 定期检查日志:关注
pg_stat_database中死锁计数器的变化🛠️ 代码层面预防:确保事务中的SQL操作顺序一致
👀 实时监控:使用系统视图查询作为辅助诊断工具
预防胜于治疗!通过合理的应用设计和数据库配置,可以大大减少死锁的发生频率。🚀

