
Creating Your First Monitor

Creating your first monitor in 9n9s is designed to be quick and straightforward. This guide will walk you through the entire process, from choosing the right monitor type to receiving your first successful pulse or check result.

Prerequisites:

  • ✅ 9n9s account created and verified
  • ✅ Organization and first project set up
  • ✅ Basic understanding of your monitoring needs

Choose Your Monitor Type:

  • Heartbeat Monitor: For processes that can actively signal (cron jobs, scripts, applications)
  • Uptime Monitor: For checking external endpoints (websites, APIs, services)

If you’re unsure which type to use, check our Monitor Types Guide for detailed comparisons.

Heartbeat monitors are perfect for monitoring scheduled tasks, background jobs, and any process that can send an HTTP request.
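For example, the simplest possible integration for an existing cron job is to ping the monitor's pulse URL (shown when you create the monitor and used throughout the examples below) only when the wrapped command succeeds; the script path here is just a placeholder:

# Minimal integration: ping the pulse endpoint only if the job exits successfully
0 2 * * * /usr/local/bin/cleanup.sh && curl -fsS https://pulse.9n9s.com/mon_abc123xyz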

Via Web Interface:

  1. Navigate to Your Project

    • Log into your 9n9s dashboard
    • Select your organization and project
    • Click “Add Monitor” or “Create Monitor”
  2. Choose Monitor Type

    • Select “Heartbeat Monitor”
    • This will open the heartbeat configuration form
  3. Basic Configuration

    Monitor Name: "Daily Database Backup"
    Description: "Monitors the daily PostgreSQL backup job"
  4. Schedule Settings

    Schedule Type: Cron Expression
    Cron Expression: 0 2 * * * (Daily at 2 AM)
    Timezone: UTC (or your preferred timezone)
    Grace Period: 30 minutes
  5. Optional Advanced Settings

    Expected Runtime:
    - Minimum: 5 minutes
    - Maximum: 60 minutes
    Tags:
    - environment: production
    - service: database
    - criticality: high
  6. Create Monitor

    • Click “Create Monitor”
    • Note the unique Monitor ID (e.g., mon_abc123xyz)
    • Copy the pulse endpoint URL

Via CLI:

# Create a basic heartbeat monitor
9n9s-cli heartbeat create \
  --name "Daily Database Backup" \
  --description "Monitors the daily PostgreSQL backup job" \
  --schedule "0 2 * * *" \
  --grace-period 1800 \
  --timezone "UTC" \
  --project-id "proj_your_project_id"

# Create with tags and expected runtime
9n9s-cli heartbeat create \
  --name "Daily Database Backup" \
  --schedule "0 2 * * *" \
  --grace-period 1800 \
  --expected-runtime-min 300 \
  --expected-runtime-max 3600 \
  --tags "environment:production,service:database,criticality:high"

Now you need to modify your process to send signals to 9n9s. Here are examples for common scenarios:

Shell Script Example:

#!/bin/bash
# backup.sh - PostgreSQL backup script

# Your monitor pulse endpoint
PULSE_URL="https://pulse.9n9s.com/mon_abc123xyz"

# Signal that the backup is starting
curl -fsS "$PULSE_URL/start"

# Perform the backup
echo "Starting database backup..."
BACKUP_FILE="/backups/backup_$(date +%Y%m%d_%H%M%S).sql"

if pg_dump -h localhost -U backup_user -d production_db -f "$BACKUP_FILE"; then
  # Backup succeeded
  echo "Backup completed successfully"

  # Send success signal with details
  BACKUP_SIZE=$(du -h "$BACKUP_FILE" | cut -f1)
  curl -fsS -X POST \
    -H "Content-Type: application/json" \
    -d "{\"message\": \"Backup completed successfully\", \"backup_size\": \"$BACKUP_SIZE\"}" \
    "$PULSE_URL"
else
  # Capture pg_dump's exit code before any other command overwrites $?
  PG_EXIT=$?
  echo "Backup failed!"

  # Send failure signal
  curl -fsS -X POST \
    -H "Content-Type: application/json" \
    -d "{\"error\": \"pg_dump failed with exit code $PG_EXIT\", \"timestamp\": \"$(date -Iseconds)\"}" \
    "$PULSE_URL/fail"
  exit 1
fi
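Make the script executable before wiring it into cron:

chmod +x /path/to/backup.sh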

Python Script Example:

#!/usr/bin/env python3
# backup.py - Database backup with monitoring
import datetime
import os
import subprocess
import sys
import time

import requests

# Your monitor pulse endpoint
PULSE_URL = "https://pulse.9n9s.com/mon_abc123xyz"


def send_pulse(action=None, data=None):
    """Send a pulse to 9n9s."""
    url = PULSE_URL
    if action:
        url += f"/{action}"
    try:
        if data:
            response = requests.post(url, json=data, timeout=10)
        else:
            response = requests.get(url, timeout=10)
        response.raise_for_status()
        print(f"✅ Pulse sent successfully: {action or 'success'}")
    except requests.RequestException as e:
        print(f"⚠️ Failed to send pulse: {e}")


def main():
    # Signal start and record the start time so the actual duration can be reported
    send_pulse("start")
    start_time = time.monotonic()

    try:
        # Perform backup
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        backup_file = f"/backups/backup_{timestamp}.sql"

        print("Starting database backup...")
        subprocess.run([
            "pg_dump",
            "-h", "localhost",
            "-U", "backup_user",
            "-d", "production_db",
            "-f", backup_file,
        ], check=True, capture_output=True, text=True)

        # Calculate backup size
        backup_size_mb = os.path.getsize(backup_file) / (1024 * 1024)

        # Send success pulse with details
        send_pulse(data={
            "message": "Backup completed successfully",
            "backup_file": backup_file,
            "backup_size_mb": round(backup_size_mb, 2),
            "duration_seconds": round(time.monotonic() - start_time, 1),
        })
        print(f"✅ Backup completed: {backup_file} ({backup_size_mb:.2f} MB)")

    except subprocess.CalledProcessError as e:
        # Send failure pulse
        send_pulse("fail", {
            "error": f"pg_dump failed with exit code {e.returncode}",
            "stderr": e.stderr,
            "timestamp": datetime.datetime.now().isoformat(),
        })
        print(f"❌ Backup failed: {e}")
        sys.exit(1)
    except Exception as e:
        # Send failure pulse for unexpected errors
        send_pulse("fail", {
            "error": f"Unexpected error: {e}",
            "error_type": type(e).__name__,
            "timestamp": datetime.datetime.now().isoformat(),
        })
        print(f"❌ Unexpected error: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
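Before scheduling the script, run it once by hand (adjust the path to wherever you saved it) and confirm the pulses show up:

python3 /path/to/backup.py
9n9s-cli heartbeat logs mon_abc123xyz --limit 5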

Cron Job Setup:

# Add to your crontab (crontab -e)
# Run backup daily at 2 AM
0 2 * * * /path/to/backup.sh >> /var/log/backup.log 2>&1

Manual Test:

# Test the pulse endpoint manually
curl -fsS "https://pulse.9n9s.com/mon_abc123xyz"

# Test with payload data
curl -fsS -X POST \
  -H "Content-Type: application/json" \
  -d '{"test": true, "message": "Manual test pulse"}' \
  "https://pulse.9n9s.com/mon_abc123xyz"

# Test failure scenario
curl -fsS -X POST \
  -H "Content-Type: application/json" \
  -d '{"test": true, "error": "Simulated failure"}' \
  "https://pulse.9n9s.com/mon_abc123xyz/fail"

Check Monitor Status:

# View monitor status via CLI
9n9s-cli heartbeat get mon_abc123xyz
# View recent pulse logs
9n9s-cli heartbeat logs mon_abc123xyz --limit 10

Uptime monitors continuously check external endpoints to ensure they’re responding correctly.

Via Web Interface:

  1. Navigate to Your Project

    • Select “Add Monitor”, then choose “Uptime Monitor”
  2. Basic Configuration

    Monitor Name: "API Health Check"
    Description: "Monitors the main API endpoint"
    URL: https://api.example.com/health
  3. Check Configuration

    HTTP Method: GET
    Check Frequency: Every 1 minute
    Timeout: 10 seconds
  4. Assertions (Success Criteria)

    Status Code: 200
    Response Time: Less than 2000ms
    Response Body Contains: "status": "healthy"
  5. Optional Settings

    Custom Headers:
    - Authorization: Bearer your-api-token
    - User-Agent: 9n9s-Monitor/1.0
    Tags:
    - service: api
    - environment: production
    - criticality: high

Via CLI:

# Create a basic uptime monitor
9n9s-cli uptime create \
  --name "API Health Check" \
  --url "https://api.example.com/health" \
  --frequency 60 \
  --timeout 10 \
  --project-id "proj_your_project_id"

# Create with assertions and headers
9n9s-cli uptime create \
  --name "API Health Check" \
  --url "https://api.example.com/health" \
  --frequency 60 \
  --headers "Authorization:Bearer token123" \
  --assert-status 200 \
  --assert-response-time 2000 \
  --assert-content '"status":"healthy"' \
  --tags "service:api,environment:production"

For more complex validation, you can configure multiple assertions:

Status Code Validation:

assertions:
  - type: STATUS_CODE
    operator: EQUALS
    value: 200

Response Time Validation:

assertions:
  - type: RESPONSE_TIME
    operator: LESS_THAN
    value: 2000  # milliseconds

Content Validation:

assertions:
  - type: RESPONSE_BODY
    operator: CONTAINS
    value: '"status":"healthy"'
  - type: JSON_CONTENT
    path: "$.status"
    operator: EQUALS
    value: "healthy"

Header Validation:

assertions:
  - type: RESPONSE_HEADER
    header: "Content-Type"
    operator: CONTAINS
    value: "application/json"

Manual Check:

# Trigger a manual check
9n9s-cli uptime check mon_xyz456abc
# View check results
9n9s-cli uptime get mon_xyz456abc --include-recent-checks
# View check history
9n9s-cli uptime logs mon_xyz456abc --limit 5

Verify Endpoint Manually:

# Test the endpoint yourself
curl -v -H "Authorization: Bearer your-token" \
  https://api.example.com/health

# Time the response
time curl -s https://api.example.com/health > /dev/null

Once your monitor is created, you’ll want to be notified when issues occur.

Email Notification:

# Add email notification channel
9n9s-cli notification-channel create \
  --type email \
  --name "Operations Team" \
  --config '{"email": "ops@example.com"}'

Slack Notification:

# Add Slack webhook notification
9n9s-cli notification-channel create \
  --type slack \
  --name "Alerts Channel" \
  --config '{"webhook_url": "https://hooks.slack.com/services/..."}'

Basic Alert Rule:

# Create alert rule for your monitor
9n9s-cli alert-rule create \
  --name "Database Backup Alerts" \
  --project-id "proj_your_project_id" \
  --conditions "monitor_ids:mon_abc123xyz,status_changes:DOWN" \
  --actions "channel_id:chan_email123"

# Alert rule with multiple conditions
9n9s-cli alert-rule create \
  --name "Critical Service Alerts" \
  --conditions "tags:criticality=high,status_changes:DOWN,DEGRADED" \
  --actions "channel_id:chan_slack456,delay:0"

Advanced Alert Rule Configuration:

alert-config.yml
alert_rules:
  - name: "Production Service Alerts"
    conditions:
      monitor_tags:
        environment: production
        criticality: [high, critical]
      status_changes: [UP_TO_DOWN, UP_TO_DEGRADED]
      duration_threshold: 300  # 5 minutes
    actions:
      - channel_id: "chan_pagerduty"
        immediate: true
      - channel_id: "chan_slack"
        delay: 0
      - channel_id: "chan_email"
        delay: 300  # 5 minutes

Simulate Monitor Failure:

For heartbeat monitors:

# Don't send a pulse within the grace period, or send a failure pulse
curl -fsS -X POST \
  -d '{"error": "Test failure", "test": true}' \
  "https://pulse.9n9s.com/mon_abc123xyz/fail"

For uptime monitors:

  • Temporarily change the URL to a non-existent endpoint
  • Modify assertions to ensure they fail
  • Use a test endpoint that returns different status codes (see the example below)
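For example, a temporary monitor pointed at an endpoint that always returns an error status will fail its assertion on every check; this sketch reuses the CLI flags shown earlier and uses httpbin.org purely as a placeholder test endpoint:

# Temporary monitor that will always fail its status assertion
9n9s-cli uptime create \
  --name "Alert Delivery Test" \
  --url "https://httpbin.org/status/503" \
  --frequency 60 \
  --assert-status 200 \
  --project-id "proj_your_project_id"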

Verify Alert Delivery:

  1. Check your email/Slack/other notification channels
  2. Verify the alert contains relevant information
  3. Confirm recovery notifications work when the monitor returns to “Up” (see the example below)
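For a heartbeat monitor, sending an ordinary success pulse after the simulated failure should move it back to “Up” and trigger the recovery notification:

# Send a normal success pulse to trigger recovery
curl -fsS "https://pulse.9n9s.com/mon_abc123xyz"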

Web Dashboard:

  • Navigate to your project dashboard
  • View real-time monitor status
  • Check recent pulse/check history
  • Review performance trends

CLI Commands:

# List all monitors in project
9n9s-cli monitors list --project-id "proj_your_project_id"
# Get detailed monitor information
9n9s-cli heartbeat get mon_abc123xyz
9n9s-cli uptime get mon_xyz456abc
# View recent activity
9n9s-cli heartbeat logs mon_abc123xyz --limit 20
9n9s-cli uptime logs mon_xyz456abc --limit 20

Heartbeat Monitor Metrics:

  • Last Pulse: When the last pulse was received
  • Status: Current monitor state (Up, Down, Late, etc.)
  • Runtime: How long the last job took to complete
  • Success Rate: Percentage of successful pulses
  • Average Runtime: Typical job execution time

Uptime Monitor Metrics:

  • Response Time: How quickly the endpoint responds
  • Uptime Percentage: Availability over time
  • Check Frequency: How often checks are performed
  • Assertion Results: Which validations pass/fail

“Monitor shows as Down but job is running”:

  • Check if pulse URL is correct
  • Verify network connectivity from job server
  • Ensure job is actually sending pulses
  • Check grace period settings

“Pulses not being received”:

# Test pulse endpoint manually
curl -v "https://pulse.9n9s.com/mon_abc123xyz"
# Check for network issues
nslookup pulse.9n9s.com
curl -v https://pulse.9n9s.com/health

“Job running too long/short warnings”:

  • Review expected runtime settings
  • Analyze actual job performance
  • Adjust expected runtime bounds if needed

“False positive failures”:

  • Review assertion configuration
  • Check if endpoint behavior changed
  • Verify timeout settings are appropriate
  • Test the endpoint manually with the same parameters (see the example below)
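For example, this replays the check with the same headers and 10-second timeout configured for the monitor earlier in this guide:

# Reproduce the monitor's request with identical headers and timeout
curl -sS -o /dev/null -w "status=%{http_code} time=%{time_total}s\n" \
  --max-time 10 \
  -H "Authorization: Bearer your-api-token" \
  -H "User-Agent: 9n9s-Monitor/1.0" \
  https://api.example.com/health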

“SSL/TLS certificate errors”:

# Check certificate validity
openssl s_client -connect api.example.com:443 -servername api.example.com
# Check certificate expiration
echo | openssl s_client -connect api.example.com:443 2>/dev/null | openssl x509 -noout -dates

“No alerts received”:

  • Verify notification channels are configured correctly
  • Check alert rule conditions match your monitor
  • Test notification channels independently (for example, by calling a Slack webhook directly, as shown below)
  • Review alert rule logs
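One way to test a Slack channel outside 9n9s is to call the incoming webhook directly (substitute your real webhook URL):

# Post a test message straight to the Slack webhook to rule out channel problems
curl -X POST -H 'Content-Type: application/json' \
  -d '{"text": "Test alert from 9n9s monitoring setup"}' \
  "https://hooks.slack.com/services/..."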

“Too many alerts”:

  • Adjust grace periods and thresholds
  • Use alert grouping and deduplication
  • Consider rate limiting for notifications
  • Review monitor sensitivity settings

Congratulations! You’ve successfully created your first monitor. Here’s what to do next:

  1. Add More Monitors: Create monitors for other critical services
  2. Organize with Tags: Use consistent tagging for easy management
  3. Create Projects: Separate monitors by environment or service
  4. Monitor Dependencies: Add monitors for services your application depends on
  5. Customize Alerts: Fine-tune alert rules and notification channels
  6. Add Team Members: Invite your team and configure appropriate permissions
  7. Set Up Status Pages: Create public status pages for transparency
  8. Use Infrastructure as Code: Manage monitors with Terraform or YAML
  9. Payload Analysis: Learn to extract metrics from pulse payloads
  10. Webhook Integration: Set up custom webhooks for automated responses
  11. API Usage: Automate monitor management with the REST API
  12. Advanced Scheduling: Use complex cron expressions and timezones

You’re now ready to start monitoring your critical services with confidence. Remember that good monitoring is an iterative process: start simple and gradually add more sophisticated monitoring as you learn what works best for your systems.