Creating Your First Monitor

This guide walks you through creating your first monitor in 9n9s, from choosing the right monitor type to receiving your first successful pulse or check result.

Before You Begin

Prerequisites:

✅ 9n9s account created and verified
✅ Organization and first project set up
✅ Basic understanding of your monitoring needs

Choose Your Monitor Type:

Heartbeat Monitor: For processes that can actively signal (cron jobs, scripts, applications)
Uptime Monitor: For checking external endpoints (websites, APIs, services)

If you’re unsure which type to use, check our Monitor Types Guide for detailed comparisons.

Creating a Heartbeat Monitor

Heartbeat monitors are well suited to scheduled tasks, background jobs, and any other process that can send an HTTP request.

Step 1: Create the Monitor

Via Web Interface:

Navigate to Your Project
- Log into your 9n9s dashboard
- Select your organization and project
- Click “Add Monitor” or “Create Monitor”

Choose Monitor Type
- Select “Heartbeat Monitor”
- This will open the heartbeat configuration form

Basic Configuration

Monitor Name: "Daily Database Backup"
Description: "Monitors the daily PostgreSQL backup job"

Schedule Settings

Schedule Type: Cron Expression
Cron Expression: 0 2 * * *  (Daily at 2 AM)
Timezone: UTC (or your preferred timezone)
Grace Period: 30 minutes

Optional Advanced Settings

Expected Runtime:
- Minimum: 5 minutes
- Maximum: 60 minutes

Tags:
- environment: production
- service: database
- criticality: high

Create Monitor
- Click “Create Monitor”
- Note the unique Monitor ID (e.g., mon_abc123xyz)
- Copy the pulse endpoint URL
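
To make the schedule settings above concrete, here is a minimal Python sketch of how a daily 02:00 UTC schedule and a 30-minute grace period could combine into a deadline for the next pulse. It assumes the grace period is extra time allowed after the scheduled run before the monitor is treated as late; that interpretation and the helper names `next_daily_run` and `pulse_deadline` are illustrative and not taken from 9n9s. The sketch handles only a fixed daily time, not general cron expressions.

```python
from datetime import datetime, time, timedelta, timezone


def next_daily_run(now: datetime, run_at: time) -> datetime:
    """Return the next occurrence of a daily schedule such as `0 2 * * *`.

    `now` is expected to be a timezone-aware UTC datetime.
    """
    candidate = datetime.combine(now.date(), run_at, tzinfo=timezone.utc)
    if candidate <= now:
        # Today's run time has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


def pulse_deadline(scheduled: datetime, grace: timedelta) -> datetime:
    """Latest time a pulse can arrive before the run counts as late (assumed)."""
    return scheduled + grace


now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
scheduled = next_daily_run(now, time(2, 0))
deadline = pulse_deadline(scheduled, timedelta(minutes=30))
print(scheduled.isoformat())
print(deadline.isoformat())
```

A grace period shorter than the job's normal runtime would flag healthy runs as late under this interpretation, which is why the example pairs a 30-minute grace period with a job expected to finish in well under an hour.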

Via CLI:

# Create a basic heartbeat monitor
9n9s-cli heartbeat create \
  --name "Daily Database Backup" \
  --description "Monitors the daily PostgreSQL backup job" \
  --schedule "0 2 * * *" \
  --grace-period 1800 \
  --timezone "UTC" \
  --project-id "proj_your_project_id"

# Create with tags and expected runtime
9n9s-cli heartbeat create \
  --name "Daily Database Backup" \
  --schedule "0 2 * * *" \
  --grace-period 1800 \
  --expected-runtime-min 300 \
  --expected-runtime-max 3600 \
  --tags "environment:production,service:database,criticality:high"
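
If you script monitor creation, it can help to keep durations in minutes (as in the web form) and convert them to seconds in one place. The sketch below only builds the argument list; it uses the flags shown in the commands above and assumes, based on those examples, that `--grace-period` and the expected-runtime flags take seconds. The helper name `heartbeat_create_args` is hypothetical.

```python
import shlex


def heartbeat_create_args(name, schedule, grace_minutes,
                          runtime_min_minutes=None, runtime_max_minutes=None,
                          tags=None):
    """Build a `9n9s-cli heartbeat create` argument list from minute-based settings."""
    args = [
        "9n9s-cli", "heartbeat", "create",
        "--name", name,
        "--schedule", schedule,
        "--grace-period", str(grace_minutes * 60),
    ]
    if runtime_min_minutes is not None:
        args += ["--expected-runtime-min", str(runtime_min_minutes * 60)]
    if runtime_max_minutes is not None:
        args += ["--expected-runtime-max", str(runtime_max_minutes * 60)]
    if tags:
        # Tags are joined as key:value pairs separated by commas, as in the example above.
        args += ["--tags", ",".join(f"{key}:{value}" for key, value in tags.items())]
    return args


args = heartbeat_create_args(
    name="Daily Database Backup",
    schedule="0 2 * * *",
    grace_minutes=30,
    runtime_min_minutes=5,
    runtime_max_minutes=60,
    tags={"environment": "production", "service": "database", "criticality": "high"},
)
# Print a shell-safe command line for review; pass `args` to subprocess.run to execute it.
print(shlex.join(args))
```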

Step 2: Integrate with Your Process

Now you need to modify your process to send signals to 9n9s. Here are examples for common scenarios:

Shell Script Example:

#!/bin/bash
# backup.sh - PostgreSQL backup script

# Your monitor pulse endpoint
PULSE_URL="https://pulse.9n9s.com/mon_abc123xyz"

# Signal that backup is starting
curl -fsS "$PULSE_URL/start"

# Perform the backup
echo "Starting database backup..."
BACKUP_FILE="/backups/backup_$(date +%Y%m%d_%H%M%S).sql"
if pg_dump -h localhost -U backup_user -d production_db -f "$BACKUP_FILE"; then
    # Backup succeeded
    echo "Backup completed successfully"

    # Send success signal with details (size of the file just written)
    BACKUP_SIZE=$(du -h "$BACKUP_FILE" | cut -f1)
    curl -fsS -X POST \
      -H "Content-Type: application/json" \
      -d "{\"message\": \"Backup completed successfully\", \"backup_size\": \"$BACKUP_SIZE\"}" \
      "$PULSE_URL"
else
    # Capture pg_dump's exit code before another command overwrites $?
    EXIT_CODE=$?
    echo "Backup failed!"

    # Send failure signal
    curl -fsS -X POST \
      -H "Content-Type: application/json" \
      -d "{\"error\": \"pg_dump failed with exit code $EXIT_CODE\", \"timestamp\": \"$(date -Iseconds)\"}" \
      "$PULSE_URL/fail"

    exit 1
fi

Python Script Example:

#!/usr/bin/env python3
# backup.py - Database backup with monitoring

import datetime
import os
import subprocess
import sys
import time

import requests

# Your monitor pulse endpoint
PULSE_URL = "https://pulse.9n9s.com/mon_abc123xyz"

def send_pulse(action=None, data=None):
    """Send a pulse to 9n9s"""
    url = PULSE_URL
    if action:
        url += f"/{action}"

    try:
        if data:
            response = requests.post(url, json=data, timeout=10)
        else:
            response = requests.get(url, timeout=10)
        response.raise_for_status()
        print(f"✅ Pulse sent successfully: {action or 'success'}")
    except requests.RequestException as e:
        # A monitoring outage should not stop the backup itself
        print(f"⚠️  Failed to send pulse: {e}")

def main():
    # Signal start and record when the job began
    send_pulse("start")
    started = time.monotonic()

    try:
        # Perform backup
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        backup_file = f"/backups/backup_{timestamp}.sql"

        print("Starting database backup...")
        subprocess.run([
            "pg_dump",
            "-h", "localhost",
            "-U", "backup_user",
            "-d", "production_db",
            "-f", backup_file
        ], check=True, capture_output=True, text=True)

        # Calculate backup size and the measured duration
        backup_size_mb = os.path.getsize(backup_file) / (1024 * 1024)
        duration_seconds = round(time.monotonic() - started, 1)

        # Send success pulse with details
        send_pulse(data={
            "message": "Backup completed successfully",
            "backup_file": backup_file,
            "backup_size_mb": round(backup_size_mb, 2),
            "duration_seconds": duration_seconds
        })

        print(f"✅ Backup completed: {backup_file} ({backup_size_mb:.2f} MB)")

    except subprocess.CalledProcessError as e:
        # Send failure pulse
        send_pulse("fail", {
            "error": f"pg_dump failed with exit code {e.returncode}",
            "stderr": e.stderr,
            "timestamp": datetime.datetime.now().isoformat()
        })
        print(f"❌ Backup failed: {e}")
        sys.exit(1)
    except Exception as e:
        # Send failure pulse for unexpected errors
        send_pulse("fail", {
            "error": f"Unexpected error: {str(e)}",
            "error_type": type(e).__name__,
            "timestamp": datetime.datetime.now().isoformat()
        })
        print(f"❌ Unexpected error: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main()
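
The start, success, and fail calls in the script above follow a pattern that can be wrapped once and reused across jobs. The following is a minimal sketch of that idea as a context manager; `monitored` is a hypothetical helper, and it accepts any function with the same `(action, data)` parameters as `send_pulse`. The usage below passes a stand-in sender that records calls instead of making HTTP requests.

```python
import time
from contextlib import contextmanager


@contextmanager
def monitored(send_pulse):
    """Send a start pulse, then a success or fail pulse when the block exits."""
    send_pulse("start", None)
    started = time.monotonic()
    try:
        yield
    except Exception as exc:
        # Report the failure, then let the exception propagate to the caller.
        send_pulse("fail", {"error": str(exc), "error_type": type(exc).__name__})
        raise
    else:
        duration = round(time.monotonic() - started, 2)
        send_pulse(None, {"duration_seconds": duration})


# Stand-in sender that records pulses so the example runs without network access.
sent = []

def record(action=None, data=None):
    sent.append((action, data))


with monitored(record):
    pass  # run the real job here

try:
    with monitored(record):
        raise RuntimeError("disk full")
except RuntimeError:
    pass

print([action for action, _ in sent])
```

In a real job you would pass the script's own `send_pulse` function in place of `record`.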

Cron Job Setup:

# Add to your crontab (crontab -e)
# Run backup daily at 2 AM
0 2 * * * /path/to/backup.sh >> /var/log/backup.log 2>&1

Step 3: Test Your Monitor

Manual Test:

# Test the pulse endpoint manually
curl -fsS "https://pulse.9n9s.com/mon_abc123xyz"

# Test with payload data
curl -fsS -X POST \
  -H "Content-Type: application/json" \
  -d '{"test": true, "message": "Manual test pulse"}' \
  "https://pulse.9n9s.com/mon_abc123xyz"

# Test failure scenario
curl -fsS -X POST \
  -H "Content-Type: application/json" \
  -d '{"test": true, "error": "Simulated failure"}' \
  "https://pulse.9n9s.com/mon_abc123xyz/fail"

Check Monitor Status:

# View monitor status via CLI
9n9s-cli heartbeat get mon_abc123xyz

# View recent pulse logs
9n9s-cli heartbeat logs mon_abc123xyz --limit 10

Creating an Uptime Monitor

Uptime monitors check external endpoints at a fixed interval to confirm they’re responding correctly.

Step 1: Create the Monitor

Via Web Interface:

Navigate to Your Project
- Select “Add Monitor” → “Uptime Monitor”

Basic Configuration

Monitor Name: "API Health Check"
Description: "Monitors the main API endpoint"
URL: https://api.example.com/health

Check Configuration

HTTP Method: GET
Check Frequency: Every 1 minute
Timeout: 10 seconds

Assertions (Success Criteria)

Status Code: 200
Response Time: Less than 2000ms
Response Body Contains: "status":"healthy"

Optional Settings

Custom Headers:
- Authorization: Bearer your-api-token
- User-Agent: 9n9s-Monitor/1.0

Tags:
- service: api
- environment: production
- criticality: high

Via CLI:

# Create a basic uptime monitor
9n9s-cli uptime create \
  --name "API Health Check" \
  --url "https://api.example.com/health" \
  --frequency 60 \
  --timeout 10 \
  --project-id "proj_your_project_id"

# Create with assertions and headers
9n9s-cli uptime create \
  --name "API Health Check" \
  --url "https://api.example.com/health" \
  --frequency 60 \
  --headers "Authorization:Bearer token123" \
  --assert-status 200 \
  --assert-response-time 2000 \
  --assert-content '"status":"healthy"' \
  --tags "service:api,environment:production"

Step 2: Configure Advanced Assertions

For more complex validation, you can configure multiple assertions:

Status Code Validation:

assertions:
    - type: STATUS_CODE
      operator: EQUALS
      value: 200

Response Time Validation:

assertions:
    - type: RESPONSE_TIME
      operator: LESS_THAN
      value: 2000 # milliseconds

Content Validation:

assertions:
    - type: RESPONSE_BODY
      operator: CONTAINS
      value: '"status":"healthy"'

    - type: JSON_CONTENT
      path: "$.status"
      operator: EQUALS
      value: "healthy"

Header Validation:

assertions:
    - type: RESPONSE_HEADER
      header: "Content-Type"
      operator: CONTAINS
      value: "application/json"
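
To show how the assertion types above relate to an HTTP response, here is a small Python sketch that evaluates the same five assertions against a sample response. It is one plausible reading of the configuration, not 9n9s's implementation: the `response` dictionary layout, the `evaluate` function, and the simplified `$.key` path resolver are all assumptions made for illustration.

```python
import json
import operator

OPERATORS = {
    "EQUALS": operator.eq,
    "LESS_THAN": operator.lt,
    "CONTAINS": lambda actual, expected: expected in actual,
}


def json_path_get(document, path):
    """Resolve a simple dotted path such as `$.status` (no arrays or filters)."""
    keys = path[2:] if path.startswith("$.") else path
    value = document
    for key in keys.split("."):
        value = value[key]
    return value


def actual_value(assertion, response):
    """Pick the part of the response that an assertion type refers to."""
    kind = assertion["type"]
    if kind == "STATUS_CODE":
        return response["status"]
    if kind == "RESPONSE_TIME":
        return response["elapsed_ms"]
    if kind == "RESPONSE_BODY":
        return response["body"]
    if kind == "JSON_CONTENT":
        return json_path_get(json.loads(response["body"]), assertion["path"])
    if kind == "RESPONSE_HEADER":
        return response["headers"].get(assertion["header"], "")
    raise ValueError(f"unsupported assertion type: {kind}")


def evaluate(assertions, response):
    """Return one boolean per assertion, in order."""
    return [
        OPERATORS[a["operator"]](actual_value(a, response), a["value"])
        for a in assertions
    ]


response = {
    "status": 200,
    "elapsed_ms": 180,
    "headers": {"Content-Type": "application/json; charset=utf-8"},
    "body": '{"status":"healthy"}',
}
assertions = [
    {"type": "STATUS_CODE", "operator": "EQUALS", "value": 200},
    {"type": "RESPONSE_TIME", "operator": "LESS_THAN", "value": 2000},
    {"type": "RESPONSE_BODY", "operator": "CONTAINS", "value": '"status":"healthy"'},
    {"type": "JSON_CONTENT", "path": "$.status", "operator": "EQUALS", "value": "healthy"},
    {"type": "RESPONSE_HEADER", "header": "Content-Type", "operator": "CONTAINS", "value": "application/json"},
]
results = evaluate(assertions, response)
print(all(results))
```

Under this reading, a `CONTAINS` check on the raw body is sensitive to whitespace: a body of `{"status": "healthy"}` (with a space after the colon) would fail the `RESPONSE_BODY` assertion but still pass the `JSON_CONTENT` one, which is a reason to prefer the JSON path form for JSON APIs.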

Step 3: Test Your Monitor

Manual Check:

# Trigger a manual check
9n9s-cli uptime check mon_xyz456abc

# View check results
9n9s-cli uptime get mon_xyz456abc --include-recent-checks

# View check history
9n9s-cli uptime logs mon_xyz456abc --limit 5

Verify Endpoint Manually:

# Test the endpoint yourself
curl -v -H "Authorization: Bearer your-token" \
  https://api.example.com/health

# Time the response
time curl -s https://api.example.com/health > /dev/null

Setting Up Your First Alert

Once your monitor is created, you’ll want to be notified when issues occur.

Step 1: Configure Notification Channel

Email Notification:

# Add email notification channel
9n9s-cli notification-channel create \
  --type email \
  --name "Operations Team" \
  --config '{"email": "ops-team@example.com"}'  # placeholder; use your team's address

Slack Notification:

# Add Slack webhook notification
9n9s-cli notification-channel create \
  --type slack \
  --name "Alerts Channel" \
  --config '{"webhook_url": "https://hooks.slack.com/services/..."}'

Step 2: Create Alert Rule

Basic Alert Rule:

# Create alert rule for your monitor
9n9s-cli alert-rule create \
  --name "Database Backup Alerts" \
  --project-id "proj_your_project_id" \
  --conditions "monitor_ids:mon_abc123xyz,status_changes:DOWN" \
  --actions "channel_id:chan_email123"

# Alert rule with multiple conditions
9n9s-cli alert-rule create \
  --name "Critical Service Alerts" \
  --conditions "tags:criticality=high,status_changes:DOWN,DEGRADED" \
  --actions "channel_id:chan_slack456,delay:0"

Advanced Alert Rule Configuration:

alert_rules:
    - name: "Production Service Alerts"
      conditions:
          monitor_tags:
              environment: production
              criticality: [high, critical]
          status_changes: [UP_TO_DOWN, UP_TO_DEGRADED]
          duration_threshold: 300 # 5 minutes
      actions:
          - channel_id: "chan_pagerduty"
            immediate: true
          - channel_id: "chan_slack"
            delay: 0
          - channel_id: "chan_email"
            delay: 300 # 5 minutes
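
The timing fields in this configuration are easier to reason about with a worked example. The sketch below computes when each channel would be notified, under three assumptions that are not confirmed by this guide: `duration_threshold` is how long the failure must persist before the rule fires, each `delay` is counted from the moment the rule fires, and `immediate: true` is equivalent to a delay of zero. The function name `notification_times` is hypothetical.

```python
def notification_times(rule, failure_started_at):
    """Map each channel ID to the time (in seconds) at which it would be notified."""
    # Assumed semantics: the rule fires once the failure has lasted duration_threshold.
    fires_at = failure_started_at + rule["conditions"].get("duration_threshold", 0)
    schedule = {}
    for action in rule["actions"]:
        # Assumed semantics: "immediate" means no extra delay after the rule fires.
        delay = 0 if action.get("immediate") else action.get("delay", 0)
        schedule[action["channel_id"]] = fires_at + delay
    return schedule


rule = {
    "conditions": {"duration_threshold": 300},
    "actions": [
        {"channel_id": "chan_pagerduty", "immediate": True},
        {"channel_id": "chan_slack", "delay": 0},
        {"channel_id": "chan_email", "delay": 300},
    ],
}
schedule = notification_times(rule, failure_started_at=0)
for channel, seconds in schedule.items():
    print(f"{channel}: {seconds // 60} minutes after the failure began")
```

If these assumptions hold, PagerDuty and Slack are notified 5 minutes after the failure begins and email follows at 10 minutes, so a failure that recovers within the first 5 minutes notifies no one.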

Step 3: Test Alerting

Simulate Monitor Failure:

For heartbeat monitors:

# Either let the grace period lapse without sending a pulse, or send an explicit failure pulse
curl -fsS -X POST \
  -H "Content-Type: application/json" \
  -d '{"error": "Test failure", "test": true}' \
  "https://pulse.9n9s.com/mon_abc123xyz/fail"

For uptime monitors:

Temporarily change the URL to a non-existent endpoint
Modify assertions to ensure they fail
Use a test endpoint that returns different status codes

Verify Alert Delivery:

Check your email/Slack/other notification channels
Verify the alert contains relevant information
Confirm recovery notifications work when the monitor returns to “Up”

Monitor Dashboard and Insights

Viewing Monitor Status

Web Dashboard:

Navigate to your project dashboard
View real-time monitor status
Check recent pulse/check history
Review performance trends

CLI Commands:

# List all monitors in project
9n9s-cli monitors list --project-id "proj_your_project_id"

# Get detailed monitor information
9n9s-cli heartbeat get mon_abc123xyz
9n9s-cli uptime get mon_xyz456abc

# View recent activity
9n9s-cli heartbeat logs mon_abc123xyz --limit 20
9n9s-cli uptime logs mon_xyz456abc --limit 20

Understanding Monitor Metrics

Heartbeat Monitor Metrics:

Last Pulse: When the last pulse was received
Status: Current monitor state (Up, Down, Late, etc.)
Runtime: How long the last job took to complete
Success Rate: Percentage of successful pulses
Average Runtime: Typical job execution time
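
As a rough illustration of how Success Rate and Average Runtime relate to individual pulses, the sketch below derives both from a list of pulse records. The record fields (`kind`, `runtime_seconds`) are made up for this example, and the way 9n9s actually computes these metrics may differ.

```python
from statistics import mean


def heartbeat_metrics(pulses):
    """Return (success rate in percent, average runtime in seconds)."""
    finished = [p for p in pulses if p["kind"] in ("success", "fail")]
    successes = [p for p in finished if p["kind"] == "success"]
    success_rate = 100 * len(successes) / len(finished) if finished else None
    # Only successful runs with a known runtime contribute to the average.
    runtimes = [p["runtime_seconds"] for p in successes
                if p.get("runtime_seconds") is not None]
    average_runtime = mean(runtimes) if runtimes else None
    return success_rate, average_runtime


pulses = [
    {"kind": "success", "runtime_seconds": 300},
    {"kind": "success", "runtime_seconds": 420},
    {"kind": "fail", "runtime_seconds": None},
    {"kind": "success", "runtime_seconds": 360},
]
success_rate, average_runtime = heartbeat_metrics(pulses)
print(f"Success rate: {success_rate}%, average runtime: {average_runtime}s")
```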

Uptime Monitor Metrics:

Response Time: How quickly the endpoint responds
Uptime Percentage: Availability over time
Check Frequency: How often checks are performed
Assertion Results: Which validations pass/fail
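
Uptime Percentage can be read as the share of checks that passed. The sketch below assumes that simple definition (9n9s may weight by time rather than by check count) and uses a hypothetical `ok` field on each check record. At one check per minute, a day contains 1,440 checks.

```python
def uptime_percentage(checks):
    """Percentage of checks that passed, rounded to three decimals."""
    if not checks:
        return None
    passed = sum(1 for check in checks if check["ok"])
    return round(100 * passed / len(checks), 3)


# One day of checks at a 60-second frequency, with three failed checks.
checks = [{"ok": True}] * 1437 + [{"ok": False}] * 3
uptime = uptime_percentage(checks)
print(f"{uptime}%")
```

Three failed one-minute checks in a day already bring the figure below 99.8%, which is worth keeping in mind when choosing check frequency and alert thresholds.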

Troubleshooting Common Issues

Heartbeat Monitor Issues

“Monitor shows as Down but job is running”:

Check if pulse URL is correct
Verify network connectivity from job server
Ensure job is actually sending pulses
Check grace period settings

“Pulses not being received”:

# Test pulse endpoint manually
curl -v "https://pulse.9n9s.com/mon_abc123xyz"

# Check for network issues
nslookup pulse.9n9s.com
curl -v https://pulse.9n9s.com/health

“Job running too long/short warnings”:

Review expected runtime settings
Analyze actual job performance
Adjust expected runtime bounds if needed
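
One way to carry out the review above is to compare recent runtimes against the configured bounds and derive candidate bounds from the observed range. This is a heuristic sketch, not a 9n9s feature: `classify_runtime`, `suggest_bounds`, and the 50% margin are illustrative choices, and the default bounds match the 5-minute and 60-minute values used earlier in this guide.

```python
def classify_runtime(runtime_seconds, minimum=300, maximum=3600):
    """Classify a runtime against the expected bounds (in seconds)."""
    if runtime_seconds < minimum:
        return "too_short"
    if runtime_seconds > maximum:
        return "too_long"
    return "ok"


def suggest_bounds(runtimes, margin=0.5):
    """Suggest (minimum, maximum) by widening the observed range by a margin."""
    return min(runtimes) * (1 - margin), max(runtimes) * (1 + margin)


recent_runtimes = [310, 420, 380, 900]  # seconds, from recent successful runs
labels = [classify_runtime(r) for r in recent_runtimes]
suggested = suggest_bounds(recent_runtimes)
print(labels)
print(suggested)
```

A runtime far below the minimum often means the job exited early without doing its work, so it is usually worth investigating before simply lowering the bound.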

Uptime Monitor Issues

“False positive failures”:

Review assertion configuration
Check if endpoint behavior changed
Verify timeout settings are appropriate
Test endpoint manually with same parameters

“SSL/TLS certificate errors”:

# Check certificate validity
openssl s_client -connect api.example.com:443 -servername api.example.com

# Check certificate expiration
echo | openssl s_client -connect api.example.com:443 -servername api.example.com 2>/dev/null | openssl x509 -noout -dates
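
The same expiration check can be scripted with Python's standard library, which is handy if you want to warn before a certificate lapses. In this sketch, `fetch_not_after` opens a TLS connection and reads the certificate's `notAfter` field, and `days_until_expiry` converts it to a day count. Both function names are illustrative, and the example at the bottom uses a fixed date string and clock so it runs without network access.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host, port=443, timeout=10):
    """Return the server certificate's notAfter string, e.g. 'Jun  1 12:00:00 2030 GMT'."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Whole days between `now` (default: current UTC time) and the expiry date."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


# Offline example with a fixed expiry string and a fixed "current" time.
remaining = days_until_expiry(
    "Jun  1 12:00:00 2030 GMT",
    now=datetime(2030, 5, 2, 12, 0, tzinfo=timezone.utc),
)
print(remaining)

# Against a live host (requires network access):
# print(days_until_expiry(fetch_not_after("api.example.com")))
```

Because `fetch_not_after` uses the default verification context, a certificate that is already expired or untrusted raises `ssl.SSLCertVerificationError` instead of returning a date, which is itself a useful signal.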

General Issues

“No alerts received”:

Verify notification channels are configured correctly
Check alert rule conditions match your monitor
Test notification channels independently
Review alert rule logs

“Too many alerts”:

Adjust grace periods and thresholds
Use alert grouping and deduplication
Consider rate limiting for notifications
Review monitor sensitivity settings
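
To illustrate the deduplication and rate-limiting idea, here is a small sketch of a per-monitor cooldown, such as you might place in your own webhook receiver before forwarding alerts. It is not a description of how 9n9s groups or limits alerts; the `AlertThrottle` class and the 10-minute cooldown are assumptions for the example.

```python
class AlertThrottle:
    """Suppress repeat alerts for the same monitor within a cooldown window."""

    def __init__(self, cooldown_seconds):
        self.cooldown_seconds = cooldown_seconds
        self._last_sent = {}  # monitor ID -> time the last alert was forwarded

    def should_send(self, monitor_id, now):
        """Return True if an alert for this monitor should be forwarded at `now`."""
        last = self._last_sent.get(monitor_id)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_sent[monitor_id] = now
        return True


throttle = AlertThrottle(cooldown_seconds=600)
decisions = [
    throttle.should_send("mon_abc123xyz", now=0),    # first alert is forwarded
    throttle.should_send("mon_abc123xyz", now=120),  # repeat inside the cooldown
    throttle.should_send("mon_xyz456abc", now=130),  # different monitor
    throttle.should_send("mon_abc123xyz", now=700),  # cooldown has elapsed
]
print(decisions)
```

The cooldown is tracked per monitor, so a noisy monitor does not hide the first alert from a different one.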

Next Steps

Congratulations! You’ve successfully created your first monitor. Here’s what to do next:

Expand Your Monitoring

Add More Monitors: Create monitors for other critical services
Organize with Tags: Use consistent tagging for easy management
Create Projects: Separate monitors by environment or service
Monitor Dependencies: Add monitors for services your application depends on

Improve Your Setup

Customize Alerts: Fine-tune alert rules and notification channels
Add Team Members: Invite your team and configure appropriate permissions
Set Up Status Pages: Create public status pages for transparency
Use Infrastructure as Code: Manage monitors with Terraform or YAML

Learn Advanced Features

Payload Analysis: Learn to extract metrics from pulse payloads
Webhook Integration: Set up custom webhooks for automated responses
API Usage: Automate monitor management with the REST API
Advanced Scheduling: Use complex cron expressions and timezones

Resources

Monitor Types Guide: Detailed comparison of monitor types
Alerting Guide: Advanced alerting configurations
Team Setup: Adding team members and permissions
API Documentation: Complete API reference

You’re now ready to start monitoring your critical services with confidence. Good monitoring is iterative: start simple, then add more sophisticated checks as you learn what works best for your systems.