Feature
Alerts
Threshold + anomaly rules, per-severity routing, silences, maintenance windows.
A stateless evaluator ticks every 30 seconds, evaluates all rules and deduplicates incidents by fingerprint (rule_id + sample labels). The pending → firing → resolved state machine eliminates flapping. Signed webhooks, Slack, Telegram, email. A retry queue with 30s / 2m / 10m backoff. Per-severity routing with a fallback chain.
Key properties
- ✓Threshold rules + anomaly detection (avg + σ stddev)
- ✓PromQL expressions; AST rewrite injects organization_id
- ✓Per-severity routing: critical and warning to different channels
- ✓Silences (ad-hoc) + maintenance windows (RRULE)
- ✓HMAC-signed webhooks with a ±5min timestamp