Tags & Organization
Tags are key-value pairs that help you organize, filter, and manage your monitors. A well-designed tagging strategy is essential for maintaining clarity and control as your monitoring setup grows.
Understanding Tags
Tags are metadata labels attached to monitors that enable:
- Organization: Group related monitors together
- Filtering: Quickly find specific monitors in the dashboard
- Alert Routing: Send alerts to different channels based on tags
- Bulk Operations: Apply changes to multiple monitors at once
- Reporting: Generate reports for specific services or teams
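As a rough illustration of the filtering and grouping uses listed above, the sketch below treats each monitor as a dictionary carrying a `tags` mapping. The monitor names, the `MONITORS` list, and both helper functions are hypothetical and are not part of any 9n9s API.

```python
from typing import Dict, List

# Hypothetical in-memory monitors; a real setup would load these from your monitoring service.
MONITORS: List[dict] = [
    {"name": "api-health", "tags": {"environment": "production", "team": "backend"}},
    {"name": "landing-page", "tags": {"environment": "production", "team": "frontend"}},
    {"name": "api-staging", "tags": {"environment": "staging", "team": "backend"}},
]


def filter_by_tags(monitors: List[dict], wanted: Dict[str, str]) -> List[dict]:
    """Return the monitors whose tags contain every key-value pair in `wanted`."""
    return [
        m for m in monitors
        if all(m["tags"].get(key) == value for key, value in wanted.items())
    ]


def group_by_tag(monitors: List[dict], key: str) -> Dict[str, List[str]]:
    """Group monitor names by the value of one tag key; monitors without it go to 'untagged'."""
    groups: Dict[str, List[str]] = {}
    for m in monitors:
        groups.setdefault(m["tags"].get(key, "untagged"), []).append(m["name"])
    return groups
```

Calling `filter_by_tags(MONITORS, {"environment": "production", "team": "backend"})` narrows the list to monitors matching both tags, while `group_by_tag(MONITORS, "team")` gives a per-team view that could feed a report.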
Tag Format
Tags use a simple key:value format:

    tags:
      environment: production
      team: backend
      criticality: high
      service: api
      component: authentication

Essential Tag Categories
Environment Tags
Distinguish between different deployment environments:

    environment: production
    environment: staging
    environment: development
    environment: testing
    environment: qa

Usage Examples:
- Route production alerts to PagerDuty
- Send staging alerts only to Slack
- Exclude development monitors from SLA calculations
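The usage examples above could be modelled with a small lookup table. This is only a sketch: the channel names, the `ROUTES` and `SLA_EXCLUDED` tables, and the choice to send development alerts nowhere are assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical routing table keyed by the value of the `environment` tag.
ROUTES: Dict[str, List[str]] = {
    "production": ["pagerduty"],
    "staging": ["slack"],
}

# Environments left out of SLA calculations (assumption based on the list above).
SLA_EXCLUDED = {"development"}


def channels_for(tags: Dict[str, str]) -> List[str]:
    """Return the alert channels for a monitor; unknown environments get no channels."""
    return ROUTES.get(tags.get("environment", ""), [])


def counts_toward_sla(tags: Dict[str, str]) -> bool:
    """Return True unless the monitor's environment is excluded from SLA reporting."""
    return tags.get("environment") not in SLA_EXCLUDED
```

Keeping the routing in one table means a new environment needs a single new entry rather than another branch in the alerting code.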
Team Ownership Tags
Identify which team is responsible for each monitor:

    team: backend
    team: frontend
    team: infrastructure
    team: data
    team: security
    team: qa

Benefits:
- Route alerts to appropriate team channels
- Filter monitors by team responsibility
- Generate team-specific reports
- Assign ownership for incident response
Criticality Tags
Indicate the business impact of monitor failures:

    criticality: critical  # Business-critical, immediate response required
    criticality: high      # Important services, response within hours
    criticality: medium    # Standard services, response during business hours
    criticality: low       # Non-critical, monitoring for trends

Alert Routing Example:

    # Critical alerts
    criticality=critical → PagerDuty + SMS + Slack
    # High priority alerts
    criticality=high → Slack + Email
    # Medium priority alerts
    criticality=medium → Email only
    # Low priority alerts
    criticality=low → Dashboard only

Service Tags
Group monitors by the service or application they monitor:

    service: api
    service: website
    service: database
    service: cache
    service: queue
    service: cdn

Component Tags
Identify specific components within services:

    component: authentication
    component: payment
    component: user-management
    component: analytics
    component: reporting
    component: notification

Advanced Tagging Strategies
Version and Deployment Tags
Track which version or deployment is being monitored:

    version: v2.1.0
    deployment: 2024-01-15
    release: stable
    branch: feature-auth-v2

Use Cases:
- Compare performance across versions
- Quickly identify monitors for specific deployments
- Rollback monitoring configurations
- Track deployment success rates
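To make the first use case concrete, here is a minimal sketch that averages response times per `version` tag. The sample measurements and the function name are made up for illustration; real numbers would come from your monitoring data.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical (tags, response time in ms) samples.
SAMPLES: List[Tuple[Dict[str, str], float]] = [
    ({"version": "v2.0.0"}, 120.0),
    ({"version": "v2.0.0"}, 140.0),
    ({"version": "v2.1.0"}, 95.0),
    ({"version": "v2.1.0"}, 105.0),
]


def mean_response_by_version(samples: List[Tuple[Dict[str, str], float]]) -> Dict[str, float]:
    """Average the response times of all samples sharing the same `version` tag."""
    by_version: Dict[str, List[float]] = defaultdict(list)
    for tags, response_ms in samples:
        by_version[tags.get("version", "untagged")].append(response_ms)
    return {version: mean(values) for version, values in by_version.items()}
```

With the sample data, `mean_response_by_version(SAMPLES)` reports one average per version, which is enough to spot a regression between two releases.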
Geographic and Regional Tags
For distributed systems and multi-region deployments:

    region: us-east-1
    region: eu-west-1
    region: ap-southeast-1
    datacenter: aws
    provider: azure
    zone: availability-zone-1a

Customer and Tenant Tags
For multi-tenant applications:

    tenant: customer-a
    tenant: customer-b
    customer-tier: enterprise
    customer-tier: professional
    instance: shared
    instance: dedicated

Technology Stack Tags
Identify the technology or framework:

    technology: nodejs
    technology: python
    technology: go
    framework: express
    framework: django
    framework: gin
    database: postgresql
    database: redis

Tag Best Practices
Consistency
Use Standardized Values:

    # Good: Consistent values
    environment: production | staging | development

    # Bad: Inconsistent values
    environment: prod | staging | dev | test | development

Establish Conventions:
- Use lowercase for all tag values
- Use hyphens for multi-word values
- Define allowed values for each tag key
- Create a tag glossary for your organization
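One way to enforce these conventions is to normalize values before they are saved. The sketch below is an assumption-laden example: the `ALLOWED_VALUES` glossary, the `ALIASES` table, and both function names are hypothetical and would be replaced by your organization's own tag glossary.

```python
import re
from typing import Dict, Set

# Hypothetical glossary: allowed values per tag key (keys not listed accept any value).
ALLOWED_VALUES: Dict[str, Set[str]] = {
    "environment": {"production", "staging", "development"},
}

# Hypothetical shorthand that should be rewritten to the standard value.
ALIASES: Dict[str, Dict[str, str]] = {
    "environment": {"prod": "production", "dev": "development"},
}


def normalize_value(value: str) -> str:
    """Lowercase the value and join words with hyphens."""
    return re.sub(r"[\s_]+", "-", value.strip().lower())


def normalize_tag(key: str, value: str) -> str:
    """Normalize a tag value, expand known aliases, and reject values outside the glossary."""
    normalized = normalize_value(value)
    normalized = ALIASES.get(key, {}).get(normalized, normalized)
    allowed = ALLOWED_VALUES.get(key)
    if allowed is not None and normalized not in allowed:
        raise ValueError(f"Value {normalized!r} is not allowed for tag {key!r}")
    return normalized
```

Rejecting unknown values with an exception, rather than silently accepting them, keeps typos from creating new tag values that later have to be cleaned up.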
Hierarchical Organization
Create logical tag hierarchies:

    # Service hierarchy
    service: ecommerce
    component: checkout
    subcomponent: payment-processing

    # Team hierarchy
    team: engineering
    squad: backend-team
    owner: john.doe

Required vs Optional Tags
Define which tags are required for all monitors:
Required Tags:

    environment: production  # Always required
    team: backend            # Always required
    criticality: high        # Always required

Optional Tags:

    version: v2.1.0      # Optional, but useful
    customer: acme-corp  # Only for customer-specific monitors
    region: us-east-1    # Only for region-specific monitors

Using Tags for Organization
Dashboard Filtering
Create saved filter views using tags:
Production Critical Monitors:

    environment:production AND criticality:critical

Backend Team Monitors:

    team:backend

Payment System Health:

    component:payment OR service:payment

Project Organization
Organize monitors into projects using tags:

    # Production Services Project
    project_filters:
      - environment: production
      - criticality: [critical, high]

    # Backend Team Project
    project_filters:
      - team: backend
      - environment: [production, staging]

    # Customer-Specific Project
    project_filters:
      - customer: enterprise-client
      - environment: production

Alert Rule Targeting
Use tags to route alerts appropriately:

    # Critical production alerts
    alert_rule:
      name: "Critical Production Alerts"
      conditions:
        tags:
          environment: production
          criticality: critical
      actions:
        - channel: pagerduty-critical
        - channel: slack-ops

    # Team-specific alerts
    alert_rule:
      name: "Backend Team Alerts"
      conditions:
        tags:
          team: backend
          environment: production
      actions:
        - channel: slack-backend-team

Bulk Operations with Tags
Mass Updates
Update multiple monitors using tag filters:
CLI Examples:

    # Add SLA tag to all production monitors
    9n9s-cli monitors update \
      --filter "tags.environment=production" \
      --add-tags "sla=99.9"

    # Update criticality for all API monitors
    9n9s-cli monitors update \
      --filter "tags.service=api" \
      --set-tags "criticality=high"

    # Remove deprecated tags
    9n9s-cli monitors update \
      --filter "tags.deprecated=true" \
      --remove-tags "deprecated"

API Examples:

    import requests

    API_URL = "https://api.9n9s.com/v1/monitors"
    HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

    # Get monitors with specific tags
    response = requests.get(
        API_URL,
        params={"tags": "environment:staging,team:backend"},
        headers=HEADERS,
        timeout=10,
    )
    response.raise_for_status()  # Stop early on an HTTP error
    monitors = response.json()["data"]

    # Bulk update tags
    for monitor in monitors:
        new_tags = monitor["tags"].copy()
        new_tags["reviewed"] = "2024-01-15"

        update = requests.patch(
            f"{API_URL}/{monitor['id']}",
            json={"tags": new_tags},
            headers=HEADERS,
            timeout=10,
        )
        update.raise_for_status()

Configuration as Code
Manage tags systematically using configuration files:

    tag_defaults:
      production:
        environment: production
        criticality: high
        sla: 99.9

      staging:
        environment: staging
        criticality: medium
        sla: 95.0

    monitors:
      - name: "User API Health Check"
        type: uptime
        url: "https://api.example.com/health"
        inherit_tags: production
        additional_tags:
          team: backend
          service: api
          component: user-management

Tag Analytics and Reporting
Monitor Distribution
Analyze your monitoring coverage:

    # Count monitors by environment
    9n9s-cli monitors count --group-by environment

    # Count monitors by team
    9n9s-cli monitors count --group-by team

    # Count by criticality level
    9n9s-cli monitors count --group-by criticality

SLA Reporting
Generate reports based on tags:

    # Generate team-specific SLA report.
    # get_monitors_by_tag() and calculate_uptime() are helpers you provide.
    def generate_team_sla_report(team, period="30d"):
        monitors = get_monitors_by_tag(f"team:{team}")

        report = {
            "team": team,
            "period": period,
            "monitors": []
        }

        for monitor in monitors:
            uptime = calculate_uptime(monitor["id"], period)
            report["monitors"].append({
                "name": monitor["name"],
                "uptime": uptime,
                "sla_target": monitor["tags"].get("sla", "99.0"),
                "environment": monitor["tags"].get("environment")
            })

        return report

Cost Analysis
Track monitoring costs by service or team:

    # Calculate monitoring costs by tag
    def calculate_monitoring_costs_by_tag(tag_key):
        monitors = get_all_monitors()
        costs_by_tag = {}

        for monitor in monitors:
            tag_value = monitor["tags"].get(tag_key, "untagged")

            # Calculate cost based on check frequency
            monthly_cost = calculate_monitor_cost(monitor)

            if tag_value not in costs_by_tag:
                costs_by_tag[tag_value] = 0
            costs_by_tag[tag_value] += monthly_cost

        return costs_by_tag

    # Usage
    team_costs = calculate_monitoring_costs_by_tag("team")
    service_costs = calculate_monitoring_costs_by_tag("service")

Tag Migration and Evolution
Updating Tag Schemas
When evolving your tagging strategy:
1. Plan Migration:

    # Old schema
    old_tags:
      env: prod
      owner: john
      critical: true

    # New schema
    new_tags:
      environment: production
      team: backend
      owner: john.doe
      criticality: critical

2. Gradual Migration:

    # Step 1: Add new tags alongside old ones
    9n9s-cli monitors update \
      --filter "tags.env=prod" \
      --add-tags "environment=production"

    # Step 2: Verify new tags are working
    9n9s-cli monitors list --tags environment=production

    # Step 3: Remove old tags
    9n9s-cli monitors update \
      --filter "tags.environment=production" \
      --remove-tags "env"

Tag Cleanup
Regular maintenance of tag schemas:
Find Rarely Used Tags:

    def find_rare_tags():
        """Return key:value pairs used by exactly one monitor."""
        all_monitors = get_all_monitors()
        tag_usage = {}

        for monitor in all_monitors:
            for tag_key, tag_value in monitor["tags"].items():
                tag_pair = f"{tag_key}:{tag_value}"
                tag_usage[tag_pair] = tag_usage.get(tag_pair, 0) + 1

        # Find tags used by only 1 monitor (potential typos)
        rare_tags = {tag: count for tag, count in tag_usage.items() if count == 1}
        return rare_tags

Standardize Tag Values:

    # Find monitors with non-standard environment values.
    # The pattern is anchored so that "production" and "development" do not match.
    9n9s-cli monitors list --tags environment --output json | \
      jq '.[] | select(.tags.environment | test("^(prod|dev|test)$")) | .id'

    # Standardize these values
    9n9s-cli monitors update \
      --filter "tags.environment=prod" \
      --set-tags "environment=production"

Advanced Tag Features
Dynamic Tag Generation
Automatically generate tags based on monitor configuration:

    from urllib.parse import urlparse

    def generate_automatic_tags(monitor):
        """Generate tags based on monitor properties"""
        auto_tags = {}

        # Generate tags from URL for uptime monitors
        if monitor["type"] == "uptime":
            url = monitor["url"]
            parsed = urlparse(url)

            # Extract domain (empty string if the URL has no host)
            domain = parsed.hostname or ""
            auto_tags["domain"] = domain

            # Detect protocol
            auto_tags["protocol"] = "https" if parsed.scheme == "https" else "http"

            # Detect common services
            if "api." in domain or "/api/" in url:
                auto_tags["service_type"] = "api"
            elif "www." in domain:
                auto_tags["service_type"] = "website"

        # Generate tags from schedule for heartbeat monitors
        if monitor["type"] == "heartbeat":
            schedule = monitor["schedule"]

            if "* * * * *" in schedule:
                auto_tags["frequency"] = "high"
            elif "0 * * * *" in schedule:
                auto_tags["frequency"] = "hourly"
            elif "0 0 * * *" in schedule:
                auto_tags["frequency"] = "daily"

        return auto_tags

Tag Validation
Enforce tag schema compliance:

    def validate_monitor_tags(monitor):
        """Validate monitor tags against schema"""
        required_tags = ["environment", "team", "criticality"]
        errors = []

        # Check required tags
        for tag in required_tags:
            if tag not in monitor["tags"]:
                errors.append(f"Missing required tag: {tag}")

        # Validate tag values
        valid_environments = ["production", "staging", "development"]
        if "environment" in monitor["tags"]:
            if monitor["tags"]["environment"] not in valid_environments:
                errors.append(f"Invalid environment: {monitor['tags']['environment']}")

        valid_criticalities = ["critical", "high", "medium", "low"]
        if "criticality" in monitor["tags"]:
            if monitor["tags"]["criticality"] not in valid_criticalities:
                errors.append(f"Invalid criticality: {monitor['tags']['criticality']}")

        return errors
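
The same checks can also be written in a table-driven style, which may be easier to extend as the schema grows. The following is a sketch rather than a reference implementation: `REQUIRED_TAGS`, `ALLOWED_VALUES`, `validate_tags`, and `validate_all` are illustrative names, and the allowed values simply mirror the lists used above.

```python
from typing import Dict, List

# Schema expressed as data, mirroring the required tags and allowed values above.
REQUIRED_TAGS = ("environment", "team", "criticality")
ALLOWED_VALUES = {
    "environment": {"production", "staging", "development"},
    "criticality": {"critical", "high", "medium", "low"},
}


def validate_tags(tags: Dict[str, str]) -> List[str]:
    """Return a list of schema violations for one monitor's tags (empty if valid)."""
    errors = [f"Missing required tag: {key}" for key in REQUIRED_TAGS if key not in tags]
    for key, allowed in ALLOWED_VALUES.items():
        if key in tags and tags[key] not in allowed:
            errors.append(f"Invalid {key}: {tags[key]}")
    return errors


def validate_all(monitors: List[dict]) -> Dict[str, List[str]]:
    """Map monitor name to its violations, listing only monitors that fail validation."""
    report: Dict[str, List[str]] = {}
    for monitor in monitors:
        errors = validate_tags(monitor.get("tags", {}))
        if errors:
            report[monitor["name"]] = errors
    return report
```

Running `validate_all` over every monitor, for example in a CI job, would surface missing or misspelled tags before they affect alert routing. Adding a new constrained tag key then only requires a new `ALLOWED_VALUES` entry.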