Prometheus - Alertmanager - OpsGenie Integration (+ Grafana)
Test environment
Prometheus(container)
- Image: prom/prometheus
- Version: 2.36.0
Alertmanager(container)
- Image: prom/alertmanager
- Version: 0.24.0
OpsGenie
- Team Name: MSP_Alert_Test
- Team Member: sker
- Integrations: Prometheus, Slack
1. Create a Team in OpsGenie.
- A Team is built from accounts that already exist in OpsGenie.

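2. In the "Integrations" tab on the left side of the team menu, add Prometheus and Slack.
- For Prometheus, copy the API Key; it has to be entered in the Alertmanager config file.

- For Slack, click the "Add to Slack" button; after logging in on the web, you can choose the channel where alerts are posted.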

3. Insert the Prometheus API Key into the Alertmanager config.
# alertmanager-config.yml
global:
  opsgenie_api_key: <insert the key here>
route:
  group_by: [alertname]
  group_wait: 15s
  group_interval: 1m
  receiver: opsgenie-alert
receivers:
  - name: 'opsgenie-alert'
    opsgenie_configs:
      - send_resolved: true
        message: '{{ range .Alerts }}{{ .Annotations.title }}{{ end }}'
        description: '{{ range .Alerts }}{{ .Annotations.text }}{{ end }}'
        priority: '{{ range .Alerts }}{{ .Labels.priority }}{{ end }}'
        responders:
          - id: MSP_Alert_Test
            type: team
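One thing worth checking in the templates above: `{{ range .Alerts }}...{{ end }}` emits the field once for every alert in the group, with no separator. Since the route groups by `alertname`, two instances firing the same alert end up in one notification. The Python sketch below only simulates that concatenation; it is not Alertmanager code, and `render_range` and the sample alerts are hypothetical names made up for illustration.

```python
def render_range(alerts, section, key):
    """Mimic '{{ range .Alerts }}{{ .<section>.<key> }}{{ end }}':
    one value per alert, joined without any separator."""
    return "".join(alert[section].get(key, "") for alert in alerts)


# Two alerts that share the same alertname and therefore the same group
# (sample data, not taken from a real notification).
group = [
    {"labels": {"priority": "P1"},
     "annotations": {"title": "[WARNING] monitoring-ec2 CPU 80%"}},
    {"labels": {"priority": "P1"},
     "annotations": {"title": "[WARNING] monitoring-ec2 CPU 80%"}},
]

single = render_range(group[:1], "labels", "priority")   # one alert in the group
grouped = render_range(group, "labels", "priority")      # two alerts in the group

print(single)
print(grouped)
```

With a single alert the rendered priority is one of the P1 to P5 values, but with two alerts the string is doubled. If that turns out to matter in your setup, the `.CommonLabels` and `.CommonAnnotations` fields of the notification template data may be a better fit, since they hold only the values shared by every alert in the group.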
Result
- Slack

- OpsGenie

! That covers the basic Prometheus - OpsGenie alert setup.
- The Prometheus and Alertmanager configuration files follow.
- prometheus.yml
global:
  scrape_interval: 15s # how often targets are scraped
  evaluation_interval: 15s # how often rules are evaluated
  external_labels:
    monitor: 'monitoring-test'
# Prometheus rules used for alerting
rule_files:
  - /etc/prometheus/alert-rules.yml
# Alertmanager Config
alerting:
  alertmanagers:
    - scheme: http
      static_configs:
        - targets: ['alertmanager:9093'] # with docker-compose, the host is the docker-compose service name
# exporter endpoints and labels
scrape_configs:
  - job_name: 'monitoring-ec2'
    scrape_interval: 5s
    static_configs:
      - targets: ['13.209.74.166:9100']
- alert-rules.yml
groups:
  - name: Alerts
    rules:
      - alert: CPU 80%
        # CPU usage (%) per core, from the last two idle samples inside a 15s window; alert if it stays at or above 80 for 1 minute
        expr: 100 - ((irate(node_cpu_seconds_total{mode="idle"}[15s])) * 100) >= 80
        for: 1m
        labels:
          severity: critical
          priority: P1
        annotations:
          title: "[WARNING] {{ $labels.job }} CPU 80%"
          text: |
            Name: {{ $labels.job }}
            IP: {{ $labels.instance }}
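To make the arithmetic in `expr` concrete, here is a small Python sketch of what the expression computes for one CPU core. `irate` in PromQL uses the last two samples in the range; the function names and the sample values below are made up for illustration and are not part of Prometheus.

```python
def irate(prev, last):
    """Per-second rate from the last two (timestamp_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = prev, last
    return (v1 - v0) / (t1 - t0)


def cpu_usage_percent(prev_idle, last_idle):
    """100 - idle_rate * 100, the same shape as the alert expression."""
    return 100 - irate(prev_idle, last_idle) * 100


# Hypothetical samples of node_cpu_seconds_total{mode="idle"} taken 5s apart
# (the scrape_interval of the monitoring-ec2 job): the core was idle for 0.5s
# out of 5s, so the idle rate is 0.1 and the usage is about 90%.
usage = cpu_usage_percent((0.0, 1000.0), (5.0, 1000.5))
over_threshold = usage >= 80

print(round(usage, 2), over_threshold)
```

The rule only fires once the condition has held for the whole `for: 1m` period, so a single spike like this one would leave the alert pending, not firing.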
- alertmanager-config.yml
global:
  opsgenie_api_key: <secret>
route:
  group_by: [alertname]
  group_wait: 15s
  group_interval: 1m
  receiver: opsgenie-alert
receivers:
  - name: 'opsgenie-alert'
    opsgenie_configs:
      - send_resolved: true # default: true
        message: '{{ range .Alerts }}{{ .Annotations.title }}{{ end }}' # default: '{{ template "opsgenie.default.message" . }}'
        description: '{{ range .Alerts }}{{ .Annotations.text }}{{ end }}' # default: '{{ template "opsgenie.default.description" . }}'
        priority: '{{ range .Alerts }}{{ .Labels.priority }}{{ end }}' # P1, P2, P3, P4, P5
        responders:
          - id: MSP_Alert_Test
            type: team
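To check the Alertmanager to OpsGenie path without waiting for a real CPU spike, one option is to push a hand-made alert into Alertmanager through its `POST /api/v2/alerts` endpoint. The sketch below is only an assumption about how you might do that: `build_test_alert` and `send` are hypothetical helpers, and `http://localhost:9093` assumes the port is published on the local machine.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(job, instance, priority="P1"):
    """Build a payload carrying the labels and annotations the templates read."""
    return [{
        "labels": {
            "alertname": "CPU 80%",
            "job": job,
            "instance": instance,
            "priority": priority,
        },
        "annotations": {
            "title": f"[WARNING] {job} CPU 80%",
            "text": f"Name: {job}\nIP: {instance}\n",
        },
        "startsAt": datetime.now(timezone.utc).isoformat(),
    }]


def send(alerts, base_url="http://localhost:9093"):
    """POST the alerts to Alertmanager and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Calling `send(build_test_alert("monitoring-ec2", "13.209.74.166:9100"))` against a running Alertmanager should hand the alert to the `opsgenie-alert` receiver. Keep in mind that this creates a real alert in OpsGenie and Slack.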
!! Below is the docker-compose file that brings up the services.
It also includes the grafana and mysql images.
Layout: prometheus, alertmanager, grafana, mysql (for grafana)
- docker-compose.yaml
version: "3"
services:
  # Prometheus (Port: 9090, Host: prometheus)
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    volumes:
      - /Users/zlcus/tmp/prometheus/dir/conf_prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - /Users/zlcus/tmp/prometheus/dir/conf_prometheus/alert-rules.yml:/etc/prometheus/alert-rules.yml:ro
      - /Users/zlcus/tmp/prometheus/dir/conf_prometheus/data:/data:rw
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/data' # path where Prometheus stores metrics
      - '--storage.tsdb.retention.time=37d' # data retention period
    restart: always
    ports:
      - "0.0.0.0:9090:9090"
  # Alert Manager (Port: 9093, Host: alertmanager)
  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    volumes:
      - /Users/zlcus/tmp/prometheus/dir/conf_alertmanager/config.yml:/etc/alertmanager/config.yml:ro
    command:
      - '--config.file=/etc/alertmanager/config.yml'
      - '--storage.path=/alertmanager'
    restart: always
    ports:
      - "9093:9093"
  # MySQL (Port: 3306, Host: grafanadb)
  grafanadb:
    image: mysql:5.6
    platform: linux/x86_64
    container_name: mysql
    ports:
      - "3306:3306"
    volumes:
      - /Users/zlcus/tmp/prometheus/dir/conf_mysql/data:/var/lib/mysql:rw
      - /Users/zlcus/tmp/prometheus/dir/conf_mysql/my.cnf:/etc/mysql/my.cnf:ro
      - /Users/zlcus/tmp/prometheus/dir/conf_mysql/mysqld.cnf:/etc/mysql/mysql.conf.d/mysqld.cnf:ro
    environment:
      - MYSQL_DATABASE=grafana
      - MYSQL_USER=grafanaadmin
      - MYSQL_PASSWORD=grafanaAdmin1!
      - MYSQL_RANDOM_ROOT_PASSWORD=1
    restart: always
  # Grafana (Port: 8000, Host: grafana)
  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - "0.0.0.0:8000:3000"
    volumes:
      # - /Users/zlcus/tmp/prometheus/dir/conf_grafana/ssl:/etc/ssl/certs
      - /Users/zlcus/tmp/prometheus/dir/conf_grafana/grafana.ini:/config_files/grafana.ini:ro
      - /Users/zlcus/tmp/prometheus/dir/conf_grafana/home.json:/usr/share/grafana/conf/provisioning/dashboards/home.json:ro
      - /Users/zlcus/tmp/prometheus/dir/conf_grafana/datasource.yml:/usr/share/grafana/conf/provisioning/datasources/datasource.yml:ro
    environment:
      - GF_PATHS_CONFIG=/config_files/grafana.ini
    restart: always
    depends_on:
      - grafanadb
      - prometheus
    # User Define (https://grafana.com/docs/grafana/latest/installation/docker/#migrate-to-v51-or-later)
    # user: '472'
    # ADD Permission
    privileged: true
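Once the stack is up, a quick way to confirm the containers respond is to hit their health endpoints on the published ports (9090, 9093, and 8000 in the compose file above). The sketch below is one possible check, assuming the stack runs on `localhost`; `HEALTH_ENDPOINTS`, `check`, and `check_all` are hypothetical names.

```python
import urllib.error
import urllib.request

# Host ports follow the "ports:" entries in the compose file;
# localhost is an assumption about where the stack runs.
HEALTH_ENDPOINTS = {
    "prometheus": "http://localhost:9090/-/healthy",
    "alertmanager": "http://localhost:9093/-/healthy",
    "grafana": "http://localhost:8000/api/health",
}


def check(url, timeout=3):
    """Return True when the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def check_all(endpoints=HEALTH_ENDPOINTS, probe=check):
    """Probe every endpoint and map each service name to its result."""
    return {name: probe(url) for name, url in endpoints.items()}
```

Running `print(check_all())` after `docker-compose up -d` gives one boolean per service; a `False` for a service is a hint to look at that container's logs.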