Monitoring & Alerts — Docs

Documentation for the monitoring and alerting features: how alerts are produced and delivered, integration options, runbook guidance, and best practices for on-call usage.

Overview

The monitoring subsystem collects metrics and events from connected bots and components, evaluates configured alert rules, and delivers notifications to configured channels. It is designed to be lightweight and to plug into existing toolchains.

Key capabilities

Alert lifecycle & runbooks

Each alert includes metadata describing its source, the impacted service, its severity, and a short runbook. Alerts pass through three stages (fired, acknowledged, resolved) and can be auto-suppressed during maintenance windows.

Typical lifecycle

  1. Rule fires (evaluated by the monitor engine).
  2. Alert is created with context and a suggested runbook.
  3. Notifications are delivered to the channels configured for that alert.
  4. An engineer acknowledges the alert and optionally attaches incident notes or a post-mortem link.
  5. Alert is resolved automatically when the rule clears, or is closed manually.
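
The lifecycle above can be read as a small state machine. The sketch below is illustrative only: the class names, the transition table, and the assumption that a fired alert may resolve without being acknowledged (for example when the rule clears on its own) are not taken from the product.

```python
from dataclasses import dataclass, field
from enum import Enum


class AlertState(Enum):
    FIRED = "fired"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


# Allowed transitions, inferred from the lifecycle steps (an assumption,
# not the product's documented state model).
ALLOWED = {
    AlertState.FIRED: {AlertState.ACKNOWLEDGED, AlertState.RESOLVED},
    AlertState.ACKNOWLEDGED: {AlertState.RESOLVED},
    AlertState.RESOLVED: set(),
}


@dataclass
class Alert:
    alert_id: str
    rule: str
    state: AlertState = AlertState.FIRED
    notes: list = field(default_factory=list)

    def transition(self, new_state: AlertState) -> None:
        # Reject any move that the transition table does not list.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"cannot move from {self.state.value} to {new_state.value}"
            )
        self.state = new_state

    def acknowledge(self, note: str = "") -> None:
        # Step 4: acknowledge and optionally attach incident notes.
        self.transition(AlertState.ACKNOWLEDGED)
        if note:
            self.notes.append(note)

    def resolve(self) -> None:
        # Step 5: automatic or manual resolution.
        self.transition(AlertState.RESOLVED)
```

Keeping the transitions in one table makes it easy to see, and to test, which moves are legal; a resolved alert cannot be acknowledged again in this sketch.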

Integrations & delivery

Choose channel(s) per alert rule. Use webhooks for custom destinations or integrate with common incident management platforms.

Supported channels

Webhook payload example

{
  "alert_id": "a1b2c3",
  "rule": "bot_error_rate_high",
  "severity": "critical",
  "service": "payments-bot",
  "started_at": "2025-07-30T11:42:10Z",
  "metrics": { "error_rate": 12.3, "requests_per_min": 320 },
  "summary": "Error rate > 10% for 5m",
  "runbook_url": "https://your-org/runbooks/bot_error_rate_high",
  "links": { "dashboard": "https://dash.example/metrics/payments-bot" }
}
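
A receiver for this payload might validate and summarize it as in the sketch below. The function names and the choice of which fields to treat as required are assumptions made for illustration; the documentation does not state which fields are guaranteed to be present.

```python
import json
from datetime import datetime, timezone

# Fields this sketch insists on (an assumption, not a documented contract).
REQUIRED_FIELDS = ("alert_id", "rule", "severity", "service", "started_at", "summary")


def parse_alert_payload(body: str) -> dict:
    """Decode a webhook body and check the fields this sketch relies on."""
    payload = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"payload is missing fields: {', '.join(missing)}")
    # The example timestamp ends in "Z"; rewrite it as an explicit offset so
    # datetime.fromisoformat accepts it on older Python versions too.
    started = datetime.fromisoformat(payload["started_at"].replace("Z", "+00:00"))
    payload["started_at"] = started.astimezone(timezone.utc)
    return payload


def format_message(payload: dict) -> str:
    """Build a one-line notification, appending the runbook link if present."""
    line = f"[{payload['severity'].upper()}] {payload['service']}: {payload['summary']}"
    runbook = payload.get("runbook_url")
    return f"{line} (runbook: {runbook})" if runbook else line
```

Optional fields such as runbook_url and links are read with .get so that a payload without them still produces a message.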
      

Tuning alerts & best practices

Setup & examples

Quick start: create an alert rule, configure a Slack webhook, and attach a runbook URL.

Example: create a simple rate rule

# pseudo-DSL example
rule "bot_error_rate_high" {
  source = "metrics"
  target = "payments-bot"
  condition = avg(error_rate, "5m") > 10
  severity = "critical"
  notify = ["slack:#ops", "pagerduty:payments"]
  runbook = "https://your-org/runbooks/bot_error_rate_high"
}
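
To make the condition concrete, here is one way avg(error_rate, "5m") > 10 could be evaluated over timestamped samples. This is a sketch of the idea, not the monitor engine's implementation: the function names, the (timestamp, value) sample format, and the choice to treat an empty window as "not firing" are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def window_average(samples, now, window):
    """Mean of the sample values whose timestamp lies in (now - window, now].

    `samples` is an iterable of (timestamp, value) pairs. Returns None when
    no sample falls inside the window.
    """
    cutoff = now - window
    values = [value for ts, value in samples if cutoff < ts <= now]
    return sum(values) / len(values) if values else None


def rule_fires(samples, now, window=timedelta(minutes=5), threshold=10.0):
    """True when the windowed average is strictly above the threshold."""
    avg = window_average(samples, now, window)
    return avg is not None and avg > threshold
```

For example, with samples of 11.0, 13.0 and 12.9 percent inside the last five minutes, the average is above 10 and rule_fires returns True; a sample older than five minutes is ignored.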

Planned & upcoming

FAQ

Q: How do I silence alerts during deploys?
A: Create a maintenance window in the UI and add the affected services. Alerts that match the window's services and tags are automatically suppressed.
Q: Can I route different severities to different channels?
A: Yes. Alert routing supports per-rule channel lists and severity filters.
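
The two answers above can be sketched together: first check whether an alert falls inside a maintenance window, then pick channels by severity. Everything here is hypothetical, including the class names, the severity levels other than "critical", and the minimum-severity style of filter. Tag matching is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumed ordering; only "critical" appears in the examples in this document.
SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}


@dataclass(frozen=True)
class MaintenanceWindow:
    start: datetime
    end: datetime
    services: frozenset

    def covers(self, service: str, at: datetime) -> bool:
        # The window is half-open: start is included, end is excluded.
        return service in self.services and self.start <= at < self.end


@dataclass(frozen=True)
class Route:
    channel: str
    min_severity: str = "info"


def is_suppressed(service, at, windows):
    """True if any maintenance window covers this service at time `at`."""
    return any(w.covers(service, at) for w in windows)


def channels_for(severity, routes):
    """Channels whose severity filter admits an alert of this severity."""
    level = SEVERITY_ORDER[severity]
    return [r.channel for r in routes if level >= SEVERITY_ORDER[r.min_severity]]
```

With routes for "slack:#ops" (any severity) and "pagerduty:payments" (critical only), a warning would go to Slack alone, while a critical alert would reach both channels.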

Contact & support

If you need help with integration, migration, or enterprise features (on-prem, SSO), contact lmsmanager@outlook.com.
