One-Click Deployment of Prometheus + Grafana + Alertmanager (Using Docker Compose)

1. Prerequisites

Make sure the following components are installed:

# Install Docker
curl -fsSL https://get.docker.com | bash

# Install Docker Compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" \
  -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
docker-compose --version

2. Create the project directory structure

mkdir -p ~/prometheus-stack
cd ~/prometheus-stack
mkdir -p prometheus
mkdir -p alertmanager

3. Create the configuration files

nano docker-compose.yml

Paste the following content:

version: '3.3'

services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus:/etc/prometheus/rules
    restart: unless-stopped

  grafana:
    image: grafana/grafana-oss
    container_name: grafana
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-storage:/var/lib/grafana
    restart: unless-stopped

  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
    restart: unless-stopped

volumes:
  grafana-storage:

nano prometheus.yml

with the following content:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # With the Compose file above, the service name "alertmanager:9093" also resolves here.
            - "xxxxx:9093"

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "/etc/prometheus/rules/*.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
        # The label name is added as a label `label_name=<label_value>` to any timeseries scraped from this config.
        labels:
          app: "prometheus"

  - job_name: "agent_windows"
    static_configs:
      - targets: ["xxxx:9182"]
        labels:
          app: "windows"
          instance: "虚拟机"  # "virtual machine"

  - job_name: "alertmanager"
    static_configs:
      # Inside the Prometheus container, localhost is the container itself,
      # so Alertmanager is addressed by its Compose service name.
      - targets: ["alertmanager:9093"]
        labels:
          app: "alertmanager"

  - job_name: "node_exporter"
    static_configs:
      - targets: ["xxxx:9100"]
        labels:
          app: "node_exporter"
          instance: "阿里云测试服务器"  # "Alibaba Cloud test server"

  - job_name: "process_exporter"
    static_configs:
      - targets: ["xxxx:9256"]
        labels:
          app: "process"
          instance: "阿里云测试服务器"  # "Alibaba Cloud test server"

  - job_name: 'http_probe'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https:XXXXXX
        labels:
          instance: "公司主页"  # "company homepage"
          app: webapp
      - targets:
          - http://xxxxxxxx:9997/docs#/
        labels:
          instance: "PDF合并"  # "PDF merge"
          app: pdfservice
    relabel_configs:
      # Pass the original target to the exporter as the ?target= parameter.
      - source_labels: [__address__]
        target_label: __param_target
      #- source_labels: [__param_target]
      #  target_label: url   # Keep the instance label from static_configs instead of overwriting it
      # The next rule stays commented out so that it does not overwrite the instance label.
      # - source_labels: [__param_target]
      #   target_label: instance
      # Send the scrape itself to the exporter listening on port 9115.
      - target_label: __address__
        replacement: xxxxxxx:9115

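The relabel rules in the http_probe job move each static target into the target URL parameter and redirect the scrape to the exporter on port 9115 (presumably a blackbox_exporter, given the /probe path and the http_2xx module). The sketch below builds the request Prometheus would effectively send and runs it by hand. The address localhost:9115 and the target https://example.com are assumptions for illustration, not values from this setup.

```shell
# Hypothetical values for illustration; substitute your exporter address and target.
BLACKBOX_ADDR="localhost:9115"
TARGET="https://example.com"
MODULE="http_2xx"

# After relabeling, each static target results in a request of this shape.
probe_url="http://${BLACKBOX_ADDR}/probe?module=${MODULE}&target=${TARGET}"
echo "$probe_url"

# Run the probe by hand; probe_success is 1 when the check passed.
curl -fsS --max-time 10 "$probe_url" | grep '^probe_success' \
  || echo "exporter not reachable at ${BLACKBOX_ADDR}"
```

A target that contains characters such as # or & (like the /docs#/ target above) needs URL encoding when tested this way, for example with curl -G --data-urlencode "target=...".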
nano alertmanager.yml
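route:
  group_by: ['alertname']       # Group alerts by the alertname label (identical alerts are merged into one notification)
  group_wait: 30s               # Wait 30 seconds before sending the first notification for a new group, so it does not fire too quickly
  group_interval: 5m            # Wait at least 5 minutes between notifications for the same group (prevents frequent notifications)
  repeat_interval: 1h           # While an alert keeps firing, repeat the notification every hour
  receiver: 'webhook_receiver'  # Name of the default receiver

receivers:
  - name: 'webhook_receiver'
    webhook_configs:
      - url: 'http://xxxx:5012/alertmanager_to_feishu'
        send_resolved: true

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']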

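Any remaining configuration files, such as alerting rule files, can be placed in the prometheus/ folder. docker-compose.yml mounts that folder at /etc/prometheus/rules, and the rule_files entry in prometheus.yml loads every *.yml file found there.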
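As a minimal sketch of such a file, the snippet below writes one alerting rule into prometheus/. The file name example_rules.yml, the group name, and the InstanceDown rule are illustrative assumptions, not part of the original setup. The severity label is what the inhibit_rules in alertmanager.yml match on.

```shell
mkdir -p prometheus

# The quoted 'EOF' keeps the shell from expanding {{ $labels.instance }}.
cat > prometheus/example_rules.yml <<'EOF'
groups:
  - name: example
    rules:
      # Fires when a scrape target has been unreachable for one minute.
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF

cat prometheus/example_rules.yml
```

If Prometheus is already running, it needs to reload its configuration before new rules take effect, for example with docker-compose restart prometheus.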

4. Start the services

docker-compose up -d
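Once the containers are up, a quick smoke test might look like the sketch below. It assumes the host ports published in docker-compose.yml (9090, 9093, and 3001 for Grafana) and that the script runs on the Docker host. The check_endpoint helper and the TestAlert labels are made up for this example.

```shell
# check_endpoint NAME URL: print "NAME OK" on an HTTP 2xx answer, "NAME DOWN" otherwise.
check_endpoint() {
  if curl -fsS --max-time 5 -o /dev/null "$2" 2>/dev/null; then
    echo "$1 OK"
  else
    echo "$1 DOWN"
  fi
}

# Health endpoints on the host ports from docker-compose.yml (Grafana is mapped to 3001).
check_endpoint prometheus   http://localhost:9090/-/healthy
check_endpoint alertmanager http://localhost:9093/-/healthy
check_endpoint grafana      http://localhost:3001/api/health

# One synthetic alert; the label and annotation values are made up for this test.
payload='[{"labels":{"alertname":"TestAlert","severity":"warning","instance":"manual-test"},"annotations":{"summary":"Manual test alert"}}]'

# Post it to the Alertmanager v2 API to exercise the webhook route.
curl -fsS --max-time 5 -X POST -H 'Content-Type: application/json' \
  -d "$payload" http://localhost:9093/api/v2/alerts \
  || echo "could not post the test alert to Alertmanager"
```

If the POST succeeds, the alert should show up in the Alertmanager UI at http://localhost:9093, and the webhook receiver should be called once group_wait (30s) has passed.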
