Modify the Prometheus configuration file
[root@devops-prometheus ecs-user]# vim /usr/local/prometheus/prometheus.yml
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 10.1.0.157:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "first_rules.yml"
  # - "second_rules.yml"

Configure the alerting rules
[root@devops-prometheus prometheus]# pwd
/usr/local/prometheus
[root@devops-prometheus prometheus]# vim first_rules.yml
groups:
  - name: Container CPU usage
    rules:
      - alert: ContainerCpuUsage
        expr: (sum(rate(container_cpu_usage_seconds_total{name!=""}[3m])) BY (instance, name) * 100) > 80
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Container CPU usage (instance {{ $labels.instance }})
          description: "Container CPU usage is above 80%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
Check the configuration and rules for errors
[root@devops-prometheus prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: 1 rule files found
SUCCESS: prometheus.yml is valid prometheus config file syntax
Checking first_rules.yml
SUCCESS: 1 rules found
Restart the service
[root@devops-prometheus prometheus]# systemctl restart prometheus
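If the configuration is invalid, Prometheus will not come back up after the restart, so it can help to chain the check and the restart together. The following is a minimal sketch, not part of Prometheus itself: `check_and_restart` is a hypothetical helper name, and it assumes the install path /usr/local/prometheus and a systemd unit named prometheus, as used in the steps above.

```shell
#!/bin/sh
# Sketch: restart Prometheus only when promtool accepts the configuration.
# check_and_restart is a hypothetical helper, not a Prometheus command.
check_and_restart() {
    # Install directory; defaults to the path used in this post.
    prom_dir="${1:-/usr/local/prometheus}"

    # promtool exits non-zero when prometheus.yml or a rule file it references is invalid.
    if ! "$prom_dir/promtool" check config "$prom_dir/prometheus.yml"; then
        echo "config check failed, not restarting" >&2
        return 1
    fi

    # Assumption: Prometheus runs as a systemd unit named "prometheus".
    systemctl restart prometheus
}
```

After loading the function into the current shell, run check_and_restart with no arguments to use the default path, or pass a different install directory as the first argument.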

- More ready-made rules: https://awesome-prometheus-alerts.grep.to/