Prometheus (3): Installing AlertManager for Alerting
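Alerting: Prometheus sends the abnormal events it detects to Alertmanager. This step does not itself send any email.
Notification: Alertmanager sends out notifications about those events (email, webhook, and so on). It also handles silencing and inhibition, and after grouping the alerts it delivers messages through email, PagerDuty, HipChat, Slack, and similar channels.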
Configure AlertManager: set up the notification method
# alert-cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: alertmanager-config
  namespace: kube-system
data:
  config.yml: |-
    global:
      smtp_smarthost: smtp.163.com:25   # SMTP server; this one is for 163 Mail
      smtp_from: username@163.com
      smtp_auth_username: username@163.com
      smtp_auth_password: "password"    # mailbox password or client authorization code
      smtp_require_tls: false
    route:
      group_by: [alertname]
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 10m
      receiver: default-receiver
    receivers:
    - name: default-receiver
      email_configs:
      - to: *************
Install AlertManager
# alert-de.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: alertmanager-deployment
  name: alertmanager
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alertmanager
  template:
    metadata:
      labels:
        app: alertmanager
    spec:
      containers:
      - name: alertmanager
        image: prom/alertmanager
        imagePullPolicy: IfNotPresent
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        args:
        - "--config.file=/etc/alertmanager/config.yml"   # path to the Alertmanager configuration file
        - "--storage.path=/alertmanager/data"            # path where Alertmanager stores its data
        - "--cluster.listen-address=$(POD_IP):6783"
        ports:
        - containerPort: 9093
          name: http
        volumeMounts:
        - mountPath: "/etc/alertmanager"
          name: alertcfg
        - mountPath: "/alertmanager"                     # backs --storage.path with the "data" volume below
          name: data
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 100m
            memory: 256Mi
      serviceAccountName: prometheus   # reuses the prometheus service account (see the Prometheus installation post)
      volumes:
      - name: alertcfg
        configMap:
          name: alertmanager-config
      - name: data
        emptyDir: {}
# alert-svc.yaml
# Service that exposes the Alertmanager port
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: alertmanager
  name: alertmanager
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 9093
    targetPort: 9093
    nodePort: 31000
  selector:
    app: alertmanager
Configure Prometheus to talk to AlertManager (add this to prome-cm.yaml from the Prometheus setup)
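rule_files:
- /etc/prometheus/rules.yml
alerting:
  alertmanagers:
  - static_configs:
    # The Service's cluster IP listens on port 9093.
    # The nodePort 31000 is only reachable through a node IP, i.e. "NODE_IP:31000".
    - targets: ["SVC_IP:9093"]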
Create the alerting rules in Prometheus (add this to prome-cm.yaml from the Prometheus setup)
rules.yml: |
  groups:
  - name: example
    rules:
    - alert: InstanceDown
      expr: up == 0
      for: 5m
      labels:
        severity: page
      annotations:
        summary: "Instance {{ $labels.instance }} down"
        description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
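Before restarting Prometheus, it can help to check that the rule file parses. The sketch below pulls rules.yml back out of the ConfigMap and runs promtool over it. It assumes promtool is installed locally, and the ConfigMap name prometheus-config is a placeholder, since the name used in prome-cm.yaml is not shown here.

```shell
# Validate the rules.yml key of a ConfigMap with promtool.
# Assumption: "prometheus-config" is a hypothetical ConfigMap name; pass the real one as $1.
check_prometheus_rules() {
    cm_name="${1:-prometheus-config}"
    tmp=$(mktemp) || return 1
    # The dot in the key name has to be escaped inside the jsonpath expression.
    kubectl get configmap "$cm_name" -n kube-system \
        -o 'jsonpath={.data.rules\.yml}' > "$tmp" &&
        promtool check rules "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Calling check_prometheus_rules with the ConfigMap name returns non-zero if either the lookup or the rule check fails.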
Create the resources
kubectl create -f alert-cm.yaml
kubectl create -f alert-de.yaml
kubectl create -f alert-svc.yaml
# prometheus
kubectl apply -f prome-cm.yaml
Then delete the Prometheus pod so that it is recreated and picks up the updated configuration.
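Once the resources are created, one way to confirm that Alertmanager is actually serving is to poll its /-/healthy endpoint through the NodePort. This is a minimal sketch: the function name is made up for illustration, NODE_IP is a placeholder, and 31000 is the nodePort from alert-svc.yaml.

```shell
# Poll Alertmanager's health endpoint until it answers or the attempts run out.
# $1 = node IP, $2 = port (default 31000), $3 = max attempts (default 30)
wait_for_alertmanager() {
    url="http://$1:${2:-31000}/-/healthy"
    attempts="${3:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if curl -fsS -o /dev/null "$url"; then
            echo "Alertmanager is healthy at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "${SLEEP_SECONDS:-2}"
    done
    echo "Alertmanager did not become healthy at $url" >&2
    return 1
}
```

For example, wait_for_alertmanager NODE_IP keeps checking roughly once every two seconds and returns 0 as soon as the endpoint responds.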
Web UI: http://node_IP:31000
The resulting email alert looks like this:
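To exercise the email path without waiting for a real instance to go down, a hand-made alert can be posted to Alertmanager's v2 API. The sketch below is only an illustration: the function names, the alert name, and the annotation text are made up, and NODE_IP is a placeholder for a node address.

```shell
# Print a one-element alert list in the JSON shape accepted by POST /api/v2/alerts.
# $1 = alert name (hypothetical default: TestAlert)
build_test_alert() {
    name="${1:-TestAlert}"
    printf '[{"labels":{"alertname":"%s","severity":"page"},"annotations":{"summary":"Manual test alert"}}]\n' "$name"
}

# Post the test alert to Alertmanager through the NodePort (31000 in alert-svc.yaml).
# $1 = node IP, $2 = optional alert name
send_test_alert() {
    build_test_alert "$2" |
        curl -fsS -X POST -H 'Content-Type: application/json' \
            --data @- "http://$1:31000/api/v2/alerts"
}
```

After send_test_alert NODE_IP, the alert should appear in the web UI. With group_wait: 30s in the route, the first email for a new alert group is held back for about 30 seconds.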