Docker Status Monitoring
Docker Container Monitoring
Introduction
Portainer
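Portainer is a lightweight visual management panel for Docker. Official website: Portainer - Docker Visual Management Panel. It is covered here only as a brief introduction.
Steps: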
docker pull 6053537/portainer-ce
docker-compose.yaml
version: "2"
services:
  portainer:
    image: 6053537/portainer-ce:latest
    container_name: portainer
    restart: always
    ports:
      - 9000:9000
    volumes:
      - /mydata/monitorToDocker/portainer/data:/data
      - /var/run/docker.sock:/var/run/docker.sock
Start the container, then access Portainer through port 9000.
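Before opening the panel in a browser, it can help to confirm that the published port actually accepts connections. The following is a minimal sketch using only the Python standard library. The helper name `port_is_open` and the host `127.0.0.1` are assumptions for illustration; substitute the address of your own Docker host.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 9000 is the host port published in the Portainer compose file.
# 127.0.0.1 is an assumed host address for a local Docker daemon.
print("Portainer reachable:", port_is_open("127.0.0.1", 9000))
```

The same check works for the other services in these notes by changing the port number.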
CAdvisor
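CAdvisor can monitor a container's memory, CPU, network I/O, and disk I/O. By default it keeps only 2 minutes of data, but the monitoring data can be exported to InfluxDB, Elasticsearch, and other backends. Its drawbacks are that it does not support multi-host monitoring and has no alerting capability.
Pull the image (mirror list: gcr.io/cadvisor mirrors available in China | Fast and Reliable Docker Image Resources)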
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/gcr.io/cadvisor/cadvisor:v0.52.1
docker-compose.yaml
version: "2"
services:
  cadvisor:
    image: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/gcr.io/cadvisor/cadvisor:v0.52.1
    container_name: cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    restart: always
    privileged: true
    devices:
      - /dev/kmsg:/dev/kmsg
    ports:
      - 8083:8080
Start the container and access port 8083.
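Besides its web UI, CAdvisor serves its measurements in the Prometheus text format under `/metrics`. As a rough sketch of what that output looks like and how it can be read, the snippet below parses a small hand-written sample. The sample values, container names, and the helper `parse_samples` are made up for illustration; only the metric names are ones CAdvisor exports.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text, metric):
    """Return (labels, value) pairs for one metric from Prometheus text output."""
    results = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if match is None or match.group(1) != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group(2) or ""))
        results.append((labels, float(match.group(3))))
    return results


# Illustrative sample only; real output from port 8083 is much longer.
SAMPLE = '''\
# HELP container_memory_usage_bytes Current memory usage in bytes.
# TYPE container_memory_usage_bytes gauge
container_memory_usage_bytes{id="/docker/abc",name="portainer"} 2.5e+07 1700000000000
container_memory_usage_bytes{id="/docker/def",name="cadvisor"} 6.4e+07 1700000000000
container_cpu_usage_seconds_total{id="/docker/abc",name="portainer"} 12.5 1700000000000
'''

for labels, value in parse_samples(SAMPLE, "container_memory_usage_bytes"):
    print(labels["name"], value)
```

This is the same endpoint that Prometheus scrapes, so reading it by hand is a quick way to check which metrics and labels are available.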
InfluxDB
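InfluxDB is an open-source distributed time series database that is well suited to storing monitoring logs.
Learning resource: Getting Started · InfluxDB Chinese Documentation
Pull the image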
docker pull influxdb
docker-compose.yaml
version: "2"
services:
  influxdb:
    image: influxdb:latest
    container_name: influxdb
    ports:
      - 8086:8086
    restart: always
Start the container and access port 8086.
Grafana
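Grafana is a visualization dashboard that also supports alerting.
Learning resource: [Grafana Getting Started Guide - Observability Chinese Community](https://observability.cn/project/grafana/xxng9rfwgbvnpwq4/#_top)
Pull the image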
docker pull grafana/grafana
docker-compose.yaml
version: "2"
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - 3000:3000
    restart: on-failure
Prometheus
Learning resource: Getting Started | Prometheus - Prometheus Monitoring System
Pull the image
docker pull prom/prometheus
docker-compose.yaml
version: "2"
services:
  prometheus:
    container_name: prometheus
    image: prom/prometheus:latest
    volumes:
      - /mydata/monitorToDocker/prometheus/prometheus.yaml:/etc/prometheus/prometheus.yml
    ports:
      - 9090:9090
    restart: on-failure
For more detailed configuration options, see Prometheus Configuration File Explained - kevin.Xiang - cnblogs (博客园)
prometheus.yaml
# my global config
global:
  scrape_interval: 15s # Scrape every 15 seconds. The default is every 1 minute.
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
Monitoring System
CAdvisor + Prometheus + Grafana
CAdvisor collects the Docker metrics, Prometheus scrapes those metrics from CAdvisor at a fixed interval, and Grafana visualizes the data that Prometheus has collected.
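To make the last hop of this pipeline concrete: once Prometheus holds the CAdvisor data, it is read back through PromQL queries sent to Prometheus's HTTP API, which is what a Grafana panel does behind the scenes. The sketch below only builds such a request URL for the instant-query endpoint `/api/v1/query`. The helper name `build_query_url`, the base URL `http://localhost:9090`, and the container name in the query are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build a URL for Prometheus's instant-query endpoint (/api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Per-second CPU usage over the last minute for one container,
# computed from a counter that CAdvisor exports.
# The container name "cadvisor" and the host are assumed values.
url = build_query_url(
    "http://localhost:9090",
    'rate(container_cpu_usage_seconds_total{name="cadvisor"}[1m])',
)
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, returns the query result as JSON once the stack is running.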
docker-compose.yaml
version: "2"
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - 3000:3000
    restart: on-failure
    depends_on:
      - prometheus
  cadvisor:
    image: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/gcr.io/cadvisor/cadvisor:v0.47.0
    container_name: cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    restart: always
    privileged: true
    devices:
      - /dev/kmsg:/dev/kmsg
    ports:
      - 8083:8080
  prometheus:
    container_name: prometheus
    image: prom/prometheus:latest
    volumes:
      - /mydata/monitorToDocker/prometheus/prometheus.yaml:/etc/prometheus/prometheus.yml
    ports:
      - 9090:9090
prometheus.yaml
global:
  scrape_interval: 15s # scrape every 15 seconds
  evaluation_interval: 15s # evaluate rules every 15 seconds
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080'] # cadvisor is the container name I defined
Start the stack and access port 9090, then select Status -> Targets to check the status of each target.
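The same information shown on the Targets page is also available from Prometheus's HTTP API at `/api/v1/targets`. As a hedged sketch, the snippet below summarizes a response of that shape. The JSON here is hand-written to resemble a response for the two jobs configured in these notes, not captured output, and the helper name `summarize_targets` is hypothetical.

```python
import json


def summarize_targets(payload):
    """Group active targets by job name as lists of (scrape URL, health)."""
    summary = {}
    for target in payload["data"]["activeTargets"]:
        job = target["labels"].get("job", "")
        summary.setdefault(job, []).append((target["scrapeUrl"], target["health"]))
    return summary


# Hand-written sample shaped like a /api/v1/targets response (not real output).
SAMPLE_RESPONSE = json.loads("""
{
  "status": "success",
  "data": {
    "activeTargets": [
      {"labels": {"job": "prometheus", "instance": "localhost:9090"},
       "scrapeUrl": "http://localhost:9090/metrics", "health": "up"},
      {"labels": {"job": "cadvisor", "instance": "cadvisor:8080"},
       "scrapeUrl": "http://cadvisor:8080/metrics", "health": "up"}
    ],
    "droppedTargets": []
  }
}
""")

for job, targets in summarize_targets(SAMPLE_RESPONSE).items():
    for scrape_url, health in targets:
        print(f"{job}: {scrape_url} is {health}")
```

A target whose health is not `up` usually points to a wrong host name or port in the scrape configuration.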
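Access port 3000. First configure the data source under Connections -> Data Sources, then configure Dashboards. Dashboard templates from Grafana may turn out to be incompatible with the installed version; here I use the template with ID 193.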
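The data source step can also be scripted instead of clicked through, since Grafana accepts a JSON description of a data source on its `POST /api/datasources` endpoint. The following is only a sketch of that request body. The helper name `prometheus_datasource` and the display name are assumptions, the URL relies on the `prometheus` container name from the compose file, and authentication is left out.

```python
import json


def prometheus_datasource(name, url, is_default=True):
    """Build a JSON body for Grafana's POST /api/datasources endpoint."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": "proxy",  # Grafana's backend makes the requests, not the browser
        "isDefault": is_default,
    }


# Inside the compose network, Grafana reaches Prometheus by its container name
# and container port, so the URL is not localhost here (assumed setup).
body = prometheus_datasource("Prometheus", "http://prometheus:9090")
print(json.dumps(body, indent=2))
```

The same URL is what goes into the connection field when the data source is added through the web interface.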