Core Concepts
9n9s is built around several core concepts that work together to provide comprehensive monitoring for your systems and processes. Understanding these concepts will help you make the most of the platform.
Organizations
An Organization is the top-level container for all your resources in 9n9s. When you sign up, you create your first organization. Organizations serve several purposes:
- Billing & Subscriptions: Each organization has its own subscription plan and billing
- Team Management: Invite team members and manage their access
- Global Settings: Configure organization-wide notification channels and security policies
- Resource Isolation: Keep different teams or environments completely separate
Organization Features
- Custom Branding: Upload logos and customize the appearance of status pages
- SSO Integration: Connect with SAML/OIDC providers for enterprise authentication
- Audit Logging: Track all changes and access across the organization
- API Keys: Generate organization-scoped API keys with configurable permissions
Projects
Projects are containers within an organization that group related monitors together. Think of projects as “folders” for organizing your monitoring based on:
- Applications: Separate projects for different apps or services
- Environments: Different projects for production, staging, development
- Teams: Project per team or department
- Infrastructure Components: Projects for databases, APIs, background services
Project Benefits
- Organized Monitoring: Keep related monitors together for easier management
- Scoped Alerting: Configure different alert rules and channels per project
- Team Permissions: Grant team members specific access to individual projects
- Logical Grouping: Filter and search monitors by project across the platform
Monitors
Monitors are the core entities that track the health and availability of your systems. 9n9s supports two primary types of monitors:
Heartbeat Monitors (Push-Based)
Heartbeat monitors work like a “dead man’s switch”: your systems actively signal to 9n9s that they’re running correctly. If 9n9s doesn’t receive a signal within the expected timeframe, it triggers an alert.
Perfect for:
- Cron jobs and scheduled tasks
- Background workers and queue consumers
- ETL and data processing pipelines
- Deployment scripts and automation
- Serverless functions
- Batch processes
Uptime Monitors (Pull-Based)
Uptime monitors proactively check your services from 9n9s infrastructure by making requests to your endpoints and validating the responses.
Perfect for:
- Website availability monitoring
- API health checks
- Service endpoint monitoring
- SSL certificate monitoring
- Response time tracking
Monitor States
Monitors can be in one of several states that indicate their current health:
Heartbeat Monitor States
- New: Monitor created but hasn’t received any pulses yet
- Up: Recently received a successful pulse within the expected schedule
- Down: Failed to receive a pulse within schedule + grace period
- Late: Pulse is overdue but still within grace period
- Started: Job has signaled it started but hasn’t completed yet
- Degraded: Received pulse but runtime was outside expected bounds
- Paused: Monitoring temporarily disabled by user
Uptime Monitor States
- New: Monitor created but hasn’t run any checks yet
- Up: Latest check passed all configured assertions
- Down: Latest check failed one or more assertions
- Paused: Checking temporarily disabled by user
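As a rough illustration of how "passed all configured assertions" could map to Up or Down, here is a minimal sketch. The `CheckResult` fields and the three assertion kinds (status code, response time, body text) are assumptions made for this example, not the platform's documented assertion set.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CheckResult:
    """Hypothetical record of one uptime check (field names are assumed)."""
    status_code: int
    response_time_ms: float
    body: str


def evaluate_check(
    result: CheckResult,
    expected_status: int = 200,
    max_response_time_ms: Optional[float] = None,
    body_contains: Optional[str] = None,
) -> Tuple[str, List[str]]:
    """Return ("Up", []) if every assertion passes, else ("Down", failures)."""
    failures = []
    if result.status_code != expected_status:
        failures.append(f"status {result.status_code} != {expected_status}")
    if max_response_time_ms is not None and result.response_time_ms > max_response_time_ms:
        failures.append(f"response took {result.response_time_ms} ms")
    if body_contains is not None and body_contains not in result.body:
        failures.append(f"body does not contain {body_contains!r}")
    return ("Up" if not failures else "Down", failures)
```

A single failed assertion is enough to report Down, which matches the state descriptions above.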
Schedules and Timing
Heartbeat Schedules
Heartbeat monitors use schedules to define when pulses are expected:
Interval Schedules: Simple time-based intervals
- every 5 minutes (300 seconds)
- hourly (3600 seconds)
- daily (86400 seconds)
Cron Expressions: Precise scheduling using standard cron syntax
- */15 * * * * (every 15 minutes)
- 0 9 * * 1-5 (weekdays at 9 AM)
- 0 2 * * 0 (Sundays at 2 AM)
Grace Periods
Grace periods provide buffer time after the expected pulse time before marking a monitor as down. This accounts for:
- Natural variance in job execution times
- System load and resource contention
- Network delays and temporary issues
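The interaction between an interval schedule and a grace period can be sketched as follows. This is an assumed model, not 9n9s's actual evaluation logic: it supposes the next pulse is due one interval after the last one, and the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def heartbeat_state(
    last_pulse: Optional[datetime],
    now: datetime,
    interval: timedelta,
    grace: timedelta,
    paused: bool = False,
) -> str:
    """Classify a heartbeat monitor on an interval schedule (illustrative only)."""
    if paused:
        return "Paused"
    if last_pulse is None:
        return "New"
    deadline = last_pulse + interval  # assumed: next pulse due one interval later
    if now <= deadline:
        return "Up"
    if now <= deadline + grace:
        return "Late"  # overdue, but still inside the grace period
    return "Down"
```

With an hourly interval and a 10-minute grace period, a monitor whose last pulse arrived 65 minutes ago would be Late, and Down after 70 minutes.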
Expected Runtime Tracking
For processes where execution time matters, you can define expected minimum and maximum runtimes. If a job completes outside these bounds, the monitor enters a “Degraded” state, indicating potential issues even if the job succeeded.
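A minimal sketch of the runtime check described above, assuming runtimes are measured in seconds between the start signal and the completion signal. The function name and the choice to treat either bound as optional are assumptions for illustration.

```python
from typing import Optional


def classify_runtime(
    runtime_seconds: float,
    min_seconds: Optional[float] = None,
    max_seconds: Optional[float] = None,
) -> str:
    """Return "Degraded" when a successful job ran outside its expected bounds."""
    if min_seconds is not None and runtime_seconds < min_seconds:
        return "Degraded"  # suspiciously fast, e.g. the job found no input
    if max_seconds is not None and runtime_seconds > max_seconds:
        return "Degraded"  # slower than expected
    return "Up"
```

For example, a backup expected to take between 60 and 600 seconds that finishes in 5 seconds would be flagged even though it exited successfully.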
Pulse Endpoints and Signaling
Heartbeat monitors use unique, secret URLs for receiving signals:
Basic Signaling
- GET/POST /{uuid} - Signals successful completion
- GET/POST /{uuid}/start - Signals job start (enables runtime tracking)
- GET/POST /{uuid}/fail - Signals job failure
- GET/POST /{uuid}/{exit_code} - Signals with specific exit code
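The endpoint paths above could be wrapped around a job roughly like this. The base URL is a placeholder (the real pulse host is not given here), and `pulse_url` and `run_with_pulses` are hypothetical helpers, not part of an official SDK. The `send` argument is any callable that delivers the request, which keeps the sketch independent of a particular HTTP client.

```python
from typing import Callable, Optional, TypeVar, Union

T = TypeVar("T")

# Placeholder host: substitute the pulse URL shown for your monitor.
PULSE_BASE_URL = "https://pulse.example.invalid"


def pulse_url(monitor_uuid: str, signal: Optional[Union[str, int]] = None,
              base_url: str = PULSE_BASE_URL) -> str:
    """Build a pulse URL; signal is None (success), "start", "fail", or an exit code."""
    if signal is None:
        return f"{base_url}/{monitor_uuid}"
    is_exit_code = isinstance(signal, int) and not isinstance(signal, bool) and signal >= 0
    if is_exit_code or signal in ("start", "fail"):
        return f"{base_url}/{monitor_uuid}/{signal}"
    raise ValueError(f"unsupported signal: {signal!r}")


def run_with_pulses(job: Callable[[], T], monitor_uuid: str,
                    send: Callable[[str], object]) -> T:
    """Signal start, run the job, then signal success or failure."""
    send(pulse_url(monitor_uuid, "start"))
    try:
        result = job()
    except Exception:
        send(pulse_url(monitor_uuid, "fail"))
        raise
    send(pulse_url(monitor_uuid))
    return result
```

In a real script, `send` might be something like `lambda url: urllib.request.urlopen(url, timeout=10)`; passing it in also makes the wrapper easy to test without network access.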
Payload and Context
POST requests can include payloads up to 1MB containing:
- Log Output: Capture stdout/stderr for debugging
- Error Messages: Detailed failure information
- Metrics Data: Custom metrics and performance data
- Contextual Information: Environment details, parameters, etc.
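Because payloads are capped, a client may want to trim long output before sending it. The sketch below keeps the tail of the output, on the assumption that the most recent lines are the most useful for debugging. It also assumes "1MB" means 1,000,000 bytes; the exact limit (and whether the server truncates or rejects oversized bodies) is not specified here.

```python
# Assumed interpretation of the "1MB" limit; it could equally be 2**20 bytes.
MAX_PAYLOAD_BYTES = 1_000_000


def prepare_payload(output: str, limit: int = MAX_PAYLOAD_BYTES) -> bytes:
    """Encode output as UTF-8, keeping only the last `limit` bytes if too long."""
    data = output.encode("utf-8")
    if len(data) <= limit:
        return data
    tail = data[-limit:]
    # The cut may land inside a multi-byte character; drop any broken fragment.
    return tail.decode("utf-8", errors="ignore").encode("utf-8")
```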
Alerting and Notifications
Notification Channels
Configure how and where you receive alerts:
- Email - Direct email notifications
- SMS - Text message alerts via Twilio
- Slack - Messages to channels or direct messages
- Discord - Notifications to Discord channels
- PagerDuty - Incident creation and escalation
- Webhooks - Custom HTTP callbacks
- Many More - Teams, Telegram, Pushover, etc.
Alert Rules
Define when and how alerts are triggered:
- Trigger Conditions: Which monitor states trigger alerts
- Filtering: Route alerts based on tags, projects, or other criteria
- Timing: Immediate alerts vs. waiting for sustained issues
- Recovery Notifications: Get notified when issues resolve
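To make the rule elements above concrete, here is a small sketch of matching a state change against a set of rules. The `AlertRule` shape, its field names, and the recovery logic are illustrative assumptions rather than the platform's actual rule schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AlertRule:
    """Hypothetical alert rule: which states trigger it and which tags it applies to."""
    channel: str
    trigger_states: Set[str]
    required_tags: Dict[str, str] = field(default_factory=dict)
    notify_on_recovery: bool = False


def channels_for(rules: List[AlertRule], previous_state: str, new_state: str,
                 tags: Dict[str, str]) -> List[str]:
    """Return the channels that should be notified for this state change."""
    matched = []
    for rule in rules:
        # Filtering: every required tag must be present with the same value.
        if any(tags.get(k) != v for k, v in rule.required_tags.items()):
            continue
        triggered = new_state in rule.trigger_states
        recovered = (rule.notify_on_recovery and new_state == "Up"
                     and previous_state in rule.trigger_states)
        if triggered or recovered:
            matched.append(rule.channel)
    return matched
```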
Smart Alert Management
- Deduplication: Prevent spam from flapping monitors
- Grouping: Consolidate related failures into single notifications
- Escalation: Route to different channels based on severity or duration
- Maintenance Windows: Temporarily suppress alerts during planned work
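One simple way to think about deduplication for a flapping monitor is a per-monitor cooldown. The sketch below is an assumed approach for illustration only; the class name, the cooldown parameter, and keying on (monitor, state) are not taken from the platform.

```python
from typing import Dict, Tuple


class AlertDeduplicator:
    """Suppress repeat alerts for the same monitor and state within a cooldown."""

    def __init__(self, cooldown_seconds: float) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def should_send(self, monitor_id: str, state: str, now: float) -> bool:
        """Return True if an alert should go out; `now` is a timestamp in seconds."""
        key = (monitor_id, state)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # same alert was sent recently
        self._last_sent[key] = now
        return True
```

With a 300-second cooldown, a monitor that flips to Down three times in two minutes would produce one notification rather than three.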
Tags and Organization
Tags are key-value pairs that help organize and filter monitors:
Common Tagging Strategies
By Environment:
- environment: production
- environment: staging
By Criticality:
- criticality: high
- criticality: medium
- criticality: low
By Component:
- component: database
- component: api
- component: frontend
By Team:
- team: backend
- team: infrastructure
- team: data
Tag Benefits
- Filtering: Quickly find monitors by tags in the UI
- Alert Routing: Send alerts for specific tagged monitors to different channels
- Bulk Operations: Apply changes to all monitors with certain tags
- Reporting: Generate reports and analytics grouped by tags
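Since tags are plain key-value pairs, filtering reduces to an exact-match test on each requested key. A minimal sketch, assuming monitors are represented as dictionaries with a `tags` mapping (a representation chosen for this example):

```python
from typing import Dict, List


def filter_by_tags(monitors: List[dict], wanted: Dict[str, str]) -> List[dict]:
    """Return monitors whose tags include every key-value pair in `wanted`."""
    return [
        m for m in monitors
        if all(m.get("tags", {}).get(key) == value for key, value in wanted.items())
    ]
```

The same predicate could drive alert routing or bulk operations, for example selecting every monitor tagged `environment: production` and `criticality: high`.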
Context and Payload Analysis
Log Capture and Analysis
9n9s automatically captures and indexes payload data from your heartbeat pulses:
- Full Text Search: Search through logs and output across all monitors
- Structured Data: JSON payloads are parsed and made searchable
- Historical Retention: Logs retained based on your subscription plan
- Export Capabilities: Download logs for external analysis
Payload Content Scanning
Configure 9n9s to scan payload content for specific patterns:
- Success Keywords: Automatically mark pulses as successful based on content
- Failure Keywords: Detect failures from error messages in logs
- Custom Patterns: Use regex patterns for complex matching rules
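The scanning rules above can be pictured as a small classifier over the payload text. In this sketch, keyword matching is case-insensitive and failure indicators take precedence over success keywords; both choices are assumptions, as are the function and parameter names.

```python
import re
from typing import Iterable, Optional


def scan_payload(
    text: str,
    success_keywords: Iterable[str] = (),
    failure_keywords: Iterable[str] = (),
    failure_patterns: Iterable[str] = (),
) -> Optional[str]:
    """Return "fail", "success", or None when nothing matches."""
    lowered = text.lower()
    if any(keyword.lower() in lowered for keyword in failure_keywords):
        return "fail"
    if any(re.search(pattern, text) for pattern in failure_patterns):
        return "fail"
    if any(keyword.lower() in lowered for keyword in success_keywords):
        return "success"
    return None
```

For instance, a pattern such as `r"exit code [1-9]\d*"` would flag any non-zero exit code mentioned in the captured output.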
Metrics Extraction
Extract time-series metrics from JSON payloads:
    {
      "files_processed": 1247,
      "processing_time_ms": 5432,
      "memory_usage_mb": 256,
      "error_count": 0
    }
Define JSONPath expressions to extract values and track them as metrics over time.
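To show the idea of pulling a numeric value out of a payload, the sketch below handles only a tiny subset of JSONPath: dotted paths of object keys such as `$.files_processed`. It is a simplified stand-in for illustration and does not reflect which JSONPath features the platform supports.

```python
import json
from typing import Union


def extract_metric(payload: str, path: str) -> Union[int, float]:
    """Extract a numeric value using a dotted path like "$.a.b" (subset of JSONPath)."""
    if not path.startswith("$."):
        raise ValueError("path must start with '$.'")
    value = json.loads(payload)
    for key in path[2:].split("."):
        value = value[key]  # raises KeyError if the key is missing
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"{path} is not numeric")
    return value
```

Applied to the payload above, `extract_metric(payload, "$.processing_time_ms")` would yield the processing time as a data point for that pulse.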
Team Collaboration
Role-Based Access Control (RBAC)
Organization Roles:
- Owner: Full control including billing and organization deletion
- Admin: All permissions except billing and deletion
- Member: Standard access to assigned projects
- Viewer: Read-only access across the organization
Project Roles:
- Admin: Full control over project and its monitors
- Editor: Create, modify, and delete monitors and alert rules
- Viewer: Read-only access to project monitors and logs
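The project roles above form a simple hierarchy that can be modeled as a role-to-permissions table. The action names below (`view`, `edit_monitors`, `manage_project`) are invented for this sketch and are not the platform's actual permission identifiers.

```python
from typing import Dict, Set

# Hypothetical permission names, derived loosely from the role descriptions.
PROJECT_ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "admin": {"view", "edit_monitors", "manage_project"},
    "editor": {"view", "edit_monitors"},
    "viewer": {"view"},
}


def can(role: str, action: str) -> bool:
    """Return True if the project role grants the action; unknown roles get nothing."""
    return action in PROJECT_ROLE_PERMISSIONS.get(role.lower(), set())
```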
Team Features
- Member Invitations: Invite team members via email
- Permission Management: Fine-grained control over access
- Activity Tracking: See who made changes and when
- Collaborative Workflows: Share monitors and alert configurations
API and Automation
REST API
Complete programmatic access to all platform features:
- Monitor Management: Create, update, delete monitors
- Organization Control: Manage projects, teams, settings
- Data Access: Query monitor status, logs, and metrics
- Alert Configuration: Automate notification setup
SDK Libraries
Official libraries for popular languages:
- Python: pip install 9n9s
- Node.js: npm install @9n9s/sdk
- Go: go get github.com/9n9s-io/9n9s-go
- More Coming: Java, .NET, PHP, Ruby
CLI Tool
Command-line interface for automation and CI/CD:
- Monitor Management: Create and configure monitors
- Pulse Sending: Signal from shell scripts and automation
- Configuration Sync: Apply Infrastructure as Code definitions
- Status Querying: Check monitor status in scripts
Configuration as Code
Infrastructure as Code Integration
Terraform Provider:
    resource "nines_heartbeat_monitor" "backup_job" {
      name         = "Daily Backup"
      project_id   = nines_project.infrastructure.id
      schedule     = "0 2 * * *"
      grace_period = "1h"
    }
Pulumi Support:
    const backupMonitor = new nines.HeartbeatMonitor("backup", {
      name: "Daily Backup",
      projectId: infraProject.id,
      schedule: "0 2 * * *",
      gracePeriod: "1h",
    });
YAML Definitions:
    projects:
      infrastructure:
        heartbeats:
          - name: "Daily Backup"
            schedule: "0 2 * * *"
            grace_period: "1h"
            tags:
              environment: production
              criticality: high
GitOps Workflows
- Version Control: Store monitor configurations in Git
- Code Review: Review monitoring changes like code changes
- Automated Deployment: Apply changes via CI/CD pipelines
- Rollback Capability: Easily revert configuration changes
Understanding these core concepts provides the foundation for effectively using 9n9s to monitor your critical systems and processes. Each concept builds on the others to create a comprehensive, flexible monitoring platform that scales with your needs.