Setting Up Alerts
Alert configuration is crucial for effective monitoring. This guide walks you through setting up notification channels and creating alert rules using the 9n9s web interface so you’re notified immediately when issues occur.
Overview
9n9s alerting works in two parts:
- Notification Channels: Where alerts are sent (email, Slack, PagerDuty, etc.)
- Alert Rules: When alerts are triggered and which channels to use
Step 1: Create Notification Channels
Email Notifications
1. Navigate to Notification Channels
- Log into app.9n9s.com
- Go to Organization Settings in the sidebar
- Click “Notification Channels”
2. Add Email Channel
- Click “Add Channel” button
- Select “Email” from the channel types
- Fill in the configuration:
- Channel Name: “Operations Team”
- Email Address: [email protected]
- Template: Choose “Detailed” for comprehensive alerts
- Click “Create Channel”
The email channel will now appear in your list and can be used in alert rules.
Slack Integration
Method 1: Slack App (Recommended)
1. Install Slack App
- In 9n9s, go to Organization Settings > Notification Channels
- Click “Add Channel” → “Slack”
- Click “Install Slack App”
- You’ll be redirected to Slack to authorize the app
- Select the workspace and channels you want 9n9s to access
- Return to 9n9s and your Slack channels will be available
2. Configure Channel Settings
- Select the Slack channel for notifications
- Choose message formatting options
- Set up any channel-specific preferences
- Click “Create Channel”
Method 2: Webhook
1. Create Slack Webhook
- In your Slack workspace, go to your app settings
- Create an “Incoming Webhook” integration
- Copy the webhook URL
2. Add Webhook in 9n9s
- In 9n9s, click “Add Channel” → “Slack Webhook”
- Enter:
- Channel Name: “Alerts Channel”
- Webhook URL: Paste your Slack webhook URL
- Channel: #alerts (optional, for display)
- Username: 9n9s Monitor (optional)
- Click “Create Channel”
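If you want to sanity-check the webhook URL itself before relying on it for alerts, a minimal command-line test looks like the sketch below. The URL is a placeholder; substitute the one you copied from Slack.

```bash
# Send a test message to a Slack Incoming Webhook.
# Replace the placeholder URL with the webhook URL copied from Slack.
curl -X POST \
  -H 'Content-type: application/json' \
  --data '{"text": "Test message from 9n9s alert setup"}' \
  https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
```

A successful request returns `ok` and the message appears in the channel the webhook is tied to.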
PagerDuty Integration
1. Get PagerDuty Integration Key
- In PagerDuty, go to Services → Select your service
- Go to Integrations tab
- Click “Add Integration” → “9n9s” (or “Generic API”)
- Copy the Integration Key
2. Add PagerDuty Channel in 9n9s
- Click “Add Channel” → “PagerDuty”
- Enter:
- Channel Name: “On-Call Team”
- Integration Key: Paste your PagerDuty key
- Routing Key: (if using Events API v2)
- Click “Create Channel”
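To confirm the key is valid independently of 9n9s, you can send a test event straight to the PagerDuty Events API v2 (assuming your service uses an Events API v2 integration). The routing key and text below are placeholders.

```bash
# Trigger a test incident via the PagerDuty Events API v2.
# Replace YOUR_ROUTING_KEY with the integration key copied from PagerDuty.
curl -X POST https://events.pagerduty.com/v2/enqueue \
  -H 'Content-Type: application/json' \
  -d '{
    "routing_key": "YOUR_ROUTING_KEY",
    "event_action": "trigger",
    "payload": {
      "summary": "Test alert from 9n9s setup",
      "source": "9n9s-alert-setup",
      "severity": "info"
    }
  }'
```

Remember to resolve the test incident in PagerDuty afterward so it doesn't page anyone.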
SMS Notifications
1. Add SMS Channel
- Click “Add Channel” → “SMS”
- Configure:
- Channel Name: “Critical Alerts SMS”
- Provider: Select Twilio
- Phone Numbers: Add recipient numbers (one per line)
- From Number: Your Twilio number
- Click “Create Channel”
2. Twilio Configuration
- You’ll need to provide your Twilio credentials in Organization Settings
- Go to Organization Settings > Integrations > Twilio
- Enter your Account SID and Auth Token
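You can verify the Account SID and Auth Token before saving them by fetching your account record from the Twilio REST API; a minimal check, using placeholder environment variables, is sketched below.

```bash
# Verify Twilio credentials by fetching the account resource.
# Set TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN to the values from the Twilio console.
curl -s -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN" \
  "https://api.twilio.com/2010-04-01/Accounts/$TWILIO_ACCOUNT_SID.json"
```

A valid pair returns your account details as JSON; invalid credentials return a 401 error.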
Step 2: Create Alert Rules
Basic Alert Rule for a Specific Monitor
1. Navigate to Alert Rules
- Go to your project dashboard
- Click “Alert Rules” in the navigation
- Click “Create Alert Rule”
2. Configure Basic Rule
- Rule Name: “Database Backup Failure”
- Description: “Alert when daily backup fails”
3. Set Conditions
- Monitor Selection: Choose “Specific Monitors”
- Select your “Daily Database Backup” monitor
- Status Changes: Check “Down” and “Failed”
- Duration: Leave blank for immediate alerts
4. Configure Actions
- Click “Add Action”
- Notification Channel: Select “Operations Team” (email)
- Delay: 0 minutes (immediate)
- Priority: High
5. Save Rule
- Review your configuration
- Click “Create Alert Rule”
Tag-Based Alert Rules
Create rules that apply to multiple monitors based on tags:
1. Create Tag-Based Rule
- Rule Name: “Production Service Alerts”
- Description: “Alert for all production service failures”
2. Set Tag Conditions
- Monitor Selection: Choose “Monitors with tags”
- Tag Filters:
- environment equals production
- criticality equals high
- Status Changes: Select “Down” and “Degraded”
3. Configure Multiple Actions
- Action 1: Slack alert (immediate)
- Channel: “Alerts Channel”
- Delay: 0 minutes
- Action 2: PagerDuty escalation
- Channel: “On-Call Team”
- Delay: 0 minutes
Escalation Rules
Set up progressive alerting with increasing urgency:
1. Create Escalation Rule
- Rule Name: “Critical Service Escalation”
- Monitor Selection: Tags where criticality equals critical
2. Configure Escalation Chain
- Action 1: Slack notification
- Delay: 0 minutes
- Channel: “#alerts”
- Action 2: Email notification
- Delay: 5 minutes
- Channel: “Operations Team”
- Action 3: PagerDuty incident
- Delay: 15 minutes
- Channel: “On-Call Team”
- Action 4: SMS alerts
- Delay: 30 minutes
- Channel: “Critical Alerts SMS”
3. Set Conditions
- Only escalate if monitor is still down
- Skip escalation if incident is acknowledged
Step 3: Advanced Alert Configuration
Alert Grouping
Prevent alert spam by grouping related alerts:
1. Enable Grouping in Rule Settings
- In your alert rule, go to Advanced Settings
- Enable “Group similar alerts”
- Grouping Window: 5 minutes
- Maximum Alerts: 5 alerts per group
- Group By: Tags (environment, service)
2. Deduplication Settings
- Enable “Deduplicate alerts”
- Deduplication Window: 30 minutes
- This prevents the same alert from being sent multiple times
Maintenance Windows
Suppress alerts during planned maintenance:
1. Schedule Maintenance
- Go to Organization Settings > Maintenance Windows
- Click “Schedule Maintenance”
- Title: “Database Upgrade”
- Start Time: Select date and time
- Duration: 4 hours
- Affected Monitors: Select specific monitors or use tags
2. Maintenance Settings
- Suppress All Alerts: Yes
- Notify Subscribers: Yes (for status page subscribers)
- Automatic Status: Set monitors to “Maintenance” status
Conditional Alerting
Set up time-based and condition-based alerting:
1. Business Hours Alerting
- In alert rule settings, enable “Time Windows”
- Active Days: Monday-Friday
- Active Hours: 9:00 AM - 5:00 PM
- Timezone: America/New_York
2. After-Hours Critical Only
- Create separate rule for after-hours
- Time Windows: Evenings and weekends
- Additional Conditions: Only criticality equals critical
- Different Channels: Direct to on-call team
Step 4: Testing Your Alerts
Test Notification Channels
1. Test Individual Channels
- Go to Organization Settings > Notification Channels
- Find your channel in the list
- Click the “Test” button next to it
- Enter a test message
- Choose severity level
- Click “Send Test”
2. Verify Delivery
- Check that the test message arrives
- Verify formatting and content
- Ensure links work correctly
Simulate Monitor Failures
For Heartbeat Monitors:
1. Temporary Failure Simulation
- Go to your monitor’s detail page
- Click “Simulate” → “Failure”
- This sends a test failure pulse
- Watch for alerts in your channels
2. Late Pulse Simulation
- Click “Simulate” → “Late”
- This simulates missing a scheduled pulse
- Alerts should trigger after the grace period
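As an alternative to the Simulate buttons, you can exercise the full path by sending a pulse from the command line. The URL shape below is only a placeholder, and the existence of a `/fail` variant is an assumption rather than confirmed 9n9s behavior; copy the actual pulse URL from your monitor's detail page.

```bash
# Placeholder: copy the real pulse URL from the monitor's detail page in 9n9s.
PULSE_URL="https://example.9n9s.com/pulse/<monitor-id>"

# Report an explicit failure (if the monitor exposes a failure endpoint),
# then stop sending pulses and wait past the grace period to test "late" alerts.
curl -fsS "$PULSE_URL/fail" || echo "failure pulse could not be delivered"
```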
For Uptime Monitors:
1. Edit Monitor Temporarily
- Go to monitor settings
- Change URL to a non-existent endpoint
- Save changes
- Wait for next check to fail
- Restore correct URL afterward
2. Modify Assertions
- Temporarily change expected status code
- Or modify content assertions to fail
- Monitor next check results
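Before (and after) temporarily breaking the monitor, it helps to check from the command line what the uptime check will actually see. A quick status-code probe is sketched below, with example.com standing in for your endpoint.

```bash
# Print only the HTTP status code for the endpoint the monitor checks.
curl -s -o /dev/null -w "%{http_code}\n" https://example.com/health

# Include response headers if you also assert on content or headers.
curl -sI https://example.com/health
```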
Verify Alert Delivery and Recovery
1. Check Alert History
- Go to Project Dashboard > Alerts
- View recent alert activity
- Verify alerts were sent to correct channels
2. Test Recovery Notifications
- After simulating failure, restore monitor to working state
- Verify that recovery notifications are sent
- Check that alert is marked as resolved
Common Alert Patterns
Production Environment Setup
1. Critical Services
- Monitors: All monitors tagged with criticality=critical
- Immediate Actions: Slack + PagerDuty
- Escalation: SMS after 15 minutes if unacknowledged
2. Important Services
- Monitors: All monitors tagged with criticality=high
- Actions: Slack immediate, email after 5 minutes
- Business Hours: Different channels for business hours vs. after hours
Development and Staging
1. Development Environment
- Monitors: All monitors tagged with environment=development
- Actions: Email only, 10-minute delay
- Quiet Hours: No alerts during off-hours
2. Staging Environment
- Monitors: All monitors tagged with environment=staging
- Actions: Slack alerts during business hours
- Pre-production: Higher priority during testing periods
Team-Based Routing
1. Backend Team Alerts
- Tag Filter: team=backend or service=api
- Channel: Backend team Slack channel
- Escalation: Team lead email after 10 minutes
2. Infrastructure Team Alerts
- Tag Filter: team=infrastructure or component=database
- Channel: Infrastructure Slack + email
- Escalation: On-call rotation via PagerDuty
Best Practices
Alert Rule Organization
- Use Descriptive Names: “Production API Failures”, not “Rule 1”
- Add Descriptions: Explain what the rule does and why
- Regular Review: Monthly review of alert rules and their effectiveness
- Test Regularly: Quarterly testing of alert delivery
Preventing Alert Fatigue
- Start Conservative: Begin with critical monitors only
- Gradual Expansion: Add more monitors as you tune thresholds
- Use Escalation: Not everything needs immediate PagerDuty alerts
- Monitor Alert Volume: Review alert frequency in dashboard
Effective Alert Content
- Clear Subject Lines: Include service name and status
- Actionable Information: Include links to dashboards and runbooks
- Context: Environment, duration, affected users
- Next Steps: What should the recipient do first?
Troubleshooting Alerts
Common Issues
Alerts Not Being Sent
- Check Rule Status: Ensure alert rule is enabled
- Verify Conditions: Check if conditions match actual monitor state
- Test Channels: Use test feature to verify channel connectivity
- Review Maintenance: Check for active maintenance windows
Too Many Alerts
- Adjust Thresholds: Increase grace periods for flappy monitors
- Enable Grouping: Group similar alerts together
- Review Sensitivity: Check if monitors are too sensitive
- Use Filtering: Add more specific tag filters
Missing Critical Alerts
- Check Coverage: Ensure all critical monitors have alert rules
- Verify Tagging: Confirm monitors have correct tags
- Test Rules: Use simulation to test rule execution
- Review Channels: Ensure notification channels are working
Getting Help
If you’re still having issues with alerts:
- Check Status Page: Visit status.9n9s.com for service status
- Contact Support: Use the chat widget or email [email protected]
- Community: Join our Discord community for peer support
- Documentation: Check the Advanced Alerting Reference
CLI Reference
For automation and advanced configurations, you can also manage alerts via the CLI.
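The sketch below is illustrative only: the `9n9s` binary name, subcommands, and flags are assumptions, not documented syntax. Check your CLI's built-in help (or the Advanced Alerting Reference) for the commands your version actually supports.

```bash
# Illustrative only — subcommands and flags below are assumptions;
# run `9n9s --help` for the real syntax.

# List existing notification channels
9n9s channels list

# Send a test message through a channel
9n9s channels test "Operations Team" --message "Test alert"

# Create an alert rule from a local definition file
9n9s alert-rules create --file production-alerts.yaml

# Review recent alert activity for a project
9n9s alerts list --project my-project --since 24h
```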
Next Steps
Once your alerts are configured:
- Set up your team: Add team members and configure permissions
- Create status pages: Provide transparency to your users
- Explore integrations: Connect with your existing tools
- Advanced alerting: Learn about complex alert configurations
Effective alerting is the foundation of reliable monitoring. Start with simple rules for your most critical services and gradually expand your alerting strategy as your team’s needs evolve.