Tags & Organization

Tags are key-value pairs that help you organize, filter, and manage your monitors. A well-designed tagging strategy is essential for maintaining clarity and control as your monitoring setup grows.

Tags are metadata labels attached to monitors that enable:

  • Organization: Group related monitors together
  • Filtering: Quickly find specific monitors in the dashboard
  • Alert Routing: Send alerts to different channels based on tags
  • Bulk Operations: Apply changes to multiple monitors at once
  • Reporting: Generate reports for specific services or teams

Tags use a simple key:value format:

tags:
  environment: production
  team: backend
  criticality: high
  service: api
  component: authentication

Distinguish between different deployment environments:

environment: production
environment: staging
environment: development
environment: testing
environment: qa

Usage Examples:

  • Route production alerts to PagerDuty
  • Send staging alerts only to Slack
  • Exclude development monitors from SLA calculations

Identify which team is responsible for each monitor:

team: backend
team: frontend
team: infrastructure
team: data
team: security
team: qa

Benefits:

  • Route alerts to appropriate team channels
  • Filter monitors by team responsibility
  • Generate team-specific reports
  • Assign ownership for incident response

Indicate the business impact of monitor failures:

criticality: critical # Business-critical, immediate response required
criticality: high # Important services, response within hours
criticality: medium # Standard services, response during business hours
criticality: low # Non-critical, monitoring for trends

Alert Routing Example:

# Critical alerts
criticality=critical → PagerDuty + SMS + Slack
# High priority alerts
criticality=high → Slack + Email
# Medium priority alerts
criticality=medium → Email only
# Low priority alerts
criticality=low → Dashboard only
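
The routing table above can be read as a simple lookup from the `criticality` value to a list of channels. The following is a minimal sketch of that idea, not the product's routing engine: the `ROUTING` table, the channel names, and the `channels_for` helper are hypothetical, and treating monitors without a recognized criticality as dashboard-only is an assumption.

```python
# Hypothetical mapping from criticality tag value to notification channels,
# mirroring the routing table above. Channel names are illustrative only.
ROUTING = {
    "critical": ["pagerduty", "sms", "slack"],
    "high": ["slack", "email"],
    "medium": ["email"],
    "low": [],  # dashboard only: no outbound notification
}


def channels_for(tags: dict) -> list:
    """Return the channels to notify for a monitor with the given tags."""
    # Assumption: a missing or unknown criticality is treated like "low".
    return ROUTING.get(tags.get("criticality"), [])


print(channels_for({"criticality": "critical"}))
print(channels_for({"environment": "staging"}))
```

Keeping the table in one place makes it easy to review who gets paged for what when the criticality levels change.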

Group monitors by the service or application they monitor:

service: api
service: website
service: database
service: cache
service: queue
service: cdn

Identify specific components within services:

component: authentication
component: payment
component: user-management
component: analytics
component: reporting
component: notification

Track which version or deployment is being monitored:

version: v2.1.0
deployment: 2024-01-15
release: stable
branch: feature-auth-v2

Use Cases:

  • Compare performance across versions
  • Quickly identify monitors for specific deployments
  • Roll back monitoring configurations
  • Track deployment success rates

For distributed systems and multi-region deployments:

region: us-east-1
region: eu-west-1
region: ap-southeast-1
datacenter: aws
provider: azure
zone: availability-zone-1a

For multi-tenant applications:

tenant: customer-a
tenant: customer-b
customer-tier: enterprise
customer-tier: professional
instance: shared
instance: dedicated

Identify the technology or framework:

technology: nodejs
technology: python
technology: go
framework: express
framework: django
framework: gin
database: postgresql
database: redis

Use Standardized Values:

# Good: Consistent values
environment: production | staging | development
# Bad: Inconsistent values
environment: prod | staging | dev | test | development

Establish Conventions:

  • Use lowercase for all tag values
  • Use hyphens for multi-word values
  • Define allowed values for each tag key
  • Create a tag glossary for your organization
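
These conventions can be checked mechanically before a tag is saved. The sketch below is one possible approach under stated assumptions: `ALLOWED_VALUES`, `normalize_value`, and `check_tag` are hypothetical names, and the allowed values shown are examples rather than a list the product enforces.

```python
import re

# Hypothetical allowed values per tag key; adapt to your own tag glossary.
ALLOWED_VALUES = {
    "environment": {"production", "staging", "development"},
    "criticality": {"critical", "high", "medium", "low"},
}


def normalize_value(value: str) -> str:
    """Lowercase a tag value and join multi-word values with hyphens."""
    value = value.strip().lower()
    # Collapse runs of whitespace or underscores into a single hyphen.
    return re.sub(r"[\s_]+", "-", value)


def check_tag(key: str, value: str):
    """Return (normalized_value, is_allowed) for one tag."""
    normalized = normalize_value(value)
    allowed = ALLOWED_VALUES.get(key)
    # Keys without a defined value list are accepted as-is in this sketch.
    return normalized, allowed is None or normalized in allowed


print(check_tag("environment", "Production"))
print(check_tag("environment", "prod"))
```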

Create logical tag hierarchies:

# Service hierarchy
service: ecommerce
component: checkout
subcomponent: payment-processing
# Team hierarchy
team: engineering
squad: backend-team
owner: john.doe

Define which tags are required for all monitors:

Required Tags:

environment: production # Always required
team: backend # Always required
criticality: high # Always required

Optional Tags:

version: v2.1.0 # Optional, but useful
customer: acme-corp # Only for customer-specific monitors
region: us-east-1 # Only for region-specific monitors

Create saved filter views using tags:

Production Critical Monitors:

environment:production AND criticality:critical

Backend Team Monitors:

team:backend

Payment System Health:

component:payment OR service:payment
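
To make the semantics of these filters concrete, here is a small sketch that evaluates a flat `key:value` expression against a monitor's tags. It is an illustration only: `matches_filter` is a hypothetical helper, parentheses are not supported, and the assumption that `AND` binds more tightly than `OR` may not match how the dashboard parses filters.

```python
def _term_matches(term: str, tags: dict) -> bool:
    """Check a single `key:value` term against a tag dictionary."""
    key, _, value = term.partition(":")
    return tags.get(key.strip()) == value.strip()


def matches_filter(expression: str, tags: dict) -> bool:
    """Evaluate a flat `key:value AND/OR key:value` filter against tags."""
    # Split on OR first so that AND binds more tightly (an assumption).
    for or_group in expression.split(" OR "):
        terms = [term.strip() for term in or_group.split(" AND ")]
        if all(_term_matches(term, tags) for term in terms):
            return True
    return False


monitor_tags = {"environment": "production", "criticality": "critical"}
print(matches_filter("environment:production AND criticality:critical", monitor_tags))
```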

Organize monitors into projects using tags:

# Production Services Project
project_filters:
  - environment: production
  - criticality: [critical, high]

# Backend Team Project
project_filters:
  - team: backend
  - environment: [production, staging]

# Customer-Specific Project
project_filters:
  - customer: enterprise-client
  - environment: production
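
One way to read these project filters is that every entry must match, and a list value means "any of these values". That reading is an assumption, and `matches_project` below is a hypothetical helper written only to illustrate it.

```python
def matches_project(project_filters: list, tags: dict) -> bool:
    """Return True if tags satisfy every entry in project_filters."""
    for entry in project_filters:
        for key, expected in entry.items():
            # A list is read as "any of these values"; a scalar as an exact match.
            accepted = expected if isinstance(expected, list) else [expected]
            if tags.get(key) not in accepted:
                return False
    return True


# The "Production Services Project" filters from above, as Python data.
production_services = [
    {"environment": "production"},
    {"criticality": ["critical", "high"]},
]

print(matches_project(production_services,
                      {"environment": "production", "criticality": "high"}))
```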

Use tags to route alerts appropriately:

# Critical production alerts
alert_rule:
  name: "Critical Production Alerts"
  conditions:
    tags:
      environment: production
      criticality: critical
  actions:
    - channel: pagerduty-critical
    - channel: slack-ops

# Team-specific alerts
alert_rule:
  name: "Backend Team Alerts"
  conditions:
    tags:
      team: backend
      environment: production
  actions:
    - channel: slack-backend-team
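
The sketch below shows how rules like these could be matched against a monitor's tags. It assumes that a monitor may match several rules and that all matching rules fire, which the rules above do not state; `ALERT_RULES` and `channels_for_monitor` are hypothetical names used only for illustration.

```python
# The two alert rules from above, expressed as Python data (illustrative).
ALERT_RULES = [
    {
        "name": "Critical Production Alerts",
        "tags": {"environment": "production", "criticality": "critical"},
        "channels": ["pagerduty-critical", "slack-ops"],
    },
    {
        "name": "Backend Team Alerts",
        "tags": {"team": "backend", "environment": "production"},
        "channels": ["slack-backend-team"],
    },
]


def channels_for_monitor(tags: dict, rules: list = ALERT_RULES) -> list:
    """Collect channels from every rule whose tag conditions all match."""
    channels = []
    for rule in rules:
        if all(tags.get(key) == value for key, value in rule["tags"].items()):
            for channel in rule["channels"]:
                if channel not in channels:  # avoid notifying a channel twice
                    channels.append(channel)
    return channels


print(channels_for_monitor(
    {"environment": "production", "criticality": "critical", "team": "backend"}
))
```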

Update multiple monitors using tag filters:

CLI Examples:

# Add SLA tag to all production monitors
9n9s-cli monitors update \
  --filter "tags.environment=production" \
  --add-tags "sla=99.9"

# Update criticality for all API monitors
9n9s-cli monitors update \
  --filter "tags.service=api" \
  --set-tags "criticality=high"

# Remove deprecated tags
9n9s-cli monitors update \
  --filter "tags.deprecated=true" \
  --remove-tags "deprecated"

API Examples:

import requests

# Get monitors with specific tags
response = requests.get(
    "https://api.9n9s.com/v1/monitors",
    params={"tags": "environment:staging,team:backend"},
    headers={"Authorization": "Bearer YOUR_API_KEY"}
)
monitors = response.json()["data"]

# Bulk update tags
for monitor in monitors:
    new_tags = monitor["tags"].copy()
    new_tags["reviewed"] = "2024-01-15"
    requests.patch(
        f"https://api.9n9s.com/v1/monitors/{monitor['id']}",
        json={"tags": new_tags},
        headers={"Authorization": "Bearer YOUR_API_KEY"}
    )

Manage tags systematically using configuration files:

# monitors.yml
tag_defaults:
  production:
    environment: production
    criticality: high
    sla: 99.9
  staging:
    environment: staging
    criticality: medium
    sla: 95.0

monitors:
  - name: "User API Health Check"
    type: uptime
    url: "https://api.example.com/health"
    inherit_tags: production
    additional_tags:
      team: backend
      service: api
      component: user-management
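
To illustrate how `inherit_tags` and `additional_tags` might combine into a monitor's effective tags, here is a minimal sketch. The merge order (additional tags override inherited defaults) is an assumption, and `TAG_DEFAULTS` and `resolve_tags` are hypothetical names rather than part of the product.

```python
# The tag_defaults section from monitors.yml, expressed as Python data.
TAG_DEFAULTS = {
    "production": {"environment": "production", "criticality": "high", "sla": 99.9},
    "staging": {"environment": "staging", "criticality": "medium", "sla": 95.0},
}


def resolve_tags(monitor: dict, tag_defaults: dict = TAG_DEFAULTS) -> dict:
    """Merge inherited defaults with a monitor's additional tags."""
    profile = monitor.get("inherit_tags")
    tags = dict(tag_defaults.get(profile, {}))  # copy so defaults stay untouched
    # Assumption: additional_tags win when a key appears in both places.
    tags.update(monitor.get("additional_tags", {}))
    return tags


user_api = {
    "name": "User API Health Check",
    "inherit_tags": "production",
    "additional_tags": {"team": "backend", "service": "api"},
}
print(resolve_tags(user_api))
```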

Analyze your monitoring coverage:

# Count monitors by environment
9n9s-cli monitors count --group-by environment
# Count monitors by team
9n9s-cli monitors count --group-by team
# Count by criticality level
9n9s-cli monitors count --group-by criticality
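
The same grouping can be done in a script once monitors have been fetched. The sketch below assumes each monitor is a dictionary with a `tags` mapping, as in the API examples; `count_by_tag` is a hypothetical helper and says nothing about how the CLI implements `--group-by`.

```python
from collections import Counter


def count_by_tag(monitors: list, tag_key: str) -> Counter:
    """Count monitors per value of tag_key; missing tags count as 'untagged'."""
    return Counter(m.get("tags", {}).get(tag_key, "untagged") for m in monitors)


sample_monitors = [
    {"name": "api-health", "tags": {"environment": "production", "team": "backend"}},
    {"name": "web-home", "tags": {"environment": "production", "team": "frontend"}},
    {"name": "nightly-job", "tags": {"environment": "staging"}},
]
print(count_by_tag(sample_monitors, "environment"))
print(count_by_tag(sample_monitors, "team"))
```

Counting an explicit `untagged` bucket makes coverage gaps visible, since monitors missing a required tag would otherwise drop out of the totals.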

Generate reports based on tags:

# Generate team-specific SLA report
def generate_team_sla_report(team, period="30d"):
monitors = get_monitors_by_tag(f"team:{team}")
report = {
"team": team,
"period": period,
"monitors": []
}
for monitor in monitors:
uptime = calculate_uptime(monitor["id"], period)
report["monitors"].append({
"name": monitor["name"],
"uptime": uptime,
"sla_target": monitor["tags"].get("sla", "99.0"),
"environment": monitor["tags"].get("environment")
})
return report

Track monitoring costs by service or team:

# Calculate monitoring costs by tag
# get_all_monitors() and calculate_monitor_cost() are helpers you supply
def calculate_monitoring_costs_by_tag(tag_key):
    monitors = get_all_monitors()
    costs_by_tag = {}
    for monitor in monitors:
        tag_value = monitor["tags"].get(tag_key, "untagged")
        # Calculate cost based on check frequency
        monthly_cost = calculate_monitor_cost(monitor)
        if tag_value not in costs_by_tag:
            costs_by_tag[tag_value] = 0
        costs_by_tag[tag_value] += monthly_cost
    return costs_by_tag

# Usage
team_costs = calculate_monitoring_costs_by_tag("team")
service_costs = calculate_monitoring_costs_by_tag("service")

When evolving your tagging strategy:

1. Plan Migration:

# Old schema
old_tags:
  env: prod
  owner: john
  critical: true

# New schema
new_tags:
  environment: production
  team: backend
  owner: john.doe
  criticality: critical
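
A migration plan like this can be captured as a translation function and dry-run against existing tags before any monitor is changed. The following is a sketch under stated assumptions: `ENV_VALUES` and `migrate_tags` are hypothetical, only the `prod` mapping comes from the schema above, and defaulting non-critical monitors to `medium` is an arbitrary choice.

```python
# Hypothetical value translations. Only "prod" appears in the schema above;
# the other entries are illustrative guesses.
ENV_VALUES = {"prod": "production", "stage": "staging", "dev": "development"}


def migrate_tags(old_tags: dict) -> dict:
    """Translate old-schema tags to the new schema, keeping other tags as-is."""
    new_tags = {k: v for k, v in old_tags.items() if k not in ("env", "critical")}
    if "env" in old_tags:
        env = old_tags["env"]
        # Unknown values pass through unchanged so they can be reviewed by hand.
        new_tags["environment"] = ENV_VALUES.get(env, env)
    if "critical" in old_tags:
        is_critical = str(old_tags["critical"]).lower() == "true"
        # Assumption: monitors not flagged critical default to "medium".
        new_tags["criticality"] = "critical" if is_critical else "medium"
    return new_tags


print(migrate_tags({"env": "prod", "owner": "john", "critical": "true"}))
```

Values that cannot be derived from the old tags, such as `team` or the full `owner` name, still need a separate lookup or a manual pass.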

2. Gradual Migration:

# Step 1: Add new tags alongside old ones
9n9s-cli monitors update \
  --filter "tags.env=prod" \
  --add-tags "environment=production"

# Step 2: Verify new tags are working
9n9s-cli monitors list --tags environment=production

# Step 3: Remove old tags
9n9s-cli monitors update \
  --filter "tags.environment=production" \
  --remove-tags "env"

Regular maintenance of tag schemas:

Find Unused Tags:

def find_unused_tags():
    """Return key:value pairs used by only one monitor.

    Rarely used pairs are often typos or leftovers worth reviewing.
    get_all_monitors() is a helper you supply.
    """
    all_monitors = get_all_monitors()
    tag_usage = {}
    for monitor in all_monitors:
        for tag_key, tag_value in monitor["tags"].items():
            tag_pair = f"{tag_key}:{tag_value}"
            tag_usage[tag_pair] = tag_usage.get(tag_pair, 0) + 1
    # Find tags used by only 1 monitor (potential typos)
    rare_tags = {tag: count for tag, count in tag_usage.items() if count == 1}
    return rare_tags
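
Once rare values have been found, a fuzzy match against the standard values can suggest what a typo was probably meant to be. This is an optional sketch using Python's standard `difflib` module; `suggest_standard_value` is a hypothetical helper, and the default similarity cutoff of 0.6 is `difflib`'s own, not a tuned threshold.

```python
import difflib


def suggest_standard_value(value: str, standard_values: list):
    """Return the closest standard value for a likely typo, or None."""
    matches = difflib.get_close_matches(value, standard_values, n=1)
    return matches[0] if matches else None


environments = ["production", "staging", "development"]
print(suggest_standard_value("producton", environments))
```

Short abbreviations such as `prod` may fall below the cutoff, so an explicit alias table is usually a better fit for those than fuzzy matching.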

Standardize Tag Values:

# Find monitors with non-standard environment values
# (the pattern is anchored so "production" and "development" do not match)
9n9s-cli monitors list --tags environment --output json | \
  jq '.[] | select(.tags.environment | test("^(prod|dev|test)$")) | .id'

# Standardize these values
9n9s-cli monitors update \
  --filter "tags.environment=prod" \
  --set-tags "environment=production"

Automatically generate tags based on monitor configuration:

def generate_automatic_tags(monitor):
    """Generate tags based on monitor properties"""
    auto_tags = {}

    # Generate tags from URL for uptime monitors
    if monitor["type"] == "uptime":
        url = monitor["url"]
        # Extract domain (extract_domain() is a helper you supply)
        domain = extract_domain(url)
        auto_tags["domain"] = domain
        # Detect protocol
        auto_tags["protocol"] = "https" if url.startswith("https") else "http"
        # Detect common services
        if "api." in domain or "/api/" in url:
            auto_tags["service_type"] = "api"
        elif "www." in domain:
            auto_tags["service_type"] = "website"

    # Generate tags from schedule for heartbeat monitors
    if monitor["type"] == "heartbeat":
        schedule = monitor["schedule"]
        if "* * * * *" in schedule:
            auto_tags["frequency"] = "high"
        elif "0 * * * *" in schedule:
            auto_tags["frequency"] = "hourly"
        elif "0 0 * * *" in schedule:
            auto_tags["frequency"] = "daily"

    return auto_tags

Enforce tag schema compliance:

def validate_monitor_tags(monitor):
    """Validate monitor tags against schema"""
    required_tags = ["environment", "team", "criticality"]
    errors = []

    # Check required tags
    for tag in required_tags:
        if tag not in monitor["tags"]:
            errors.append(f"Missing required tag: {tag}")

    # Validate tag values
    valid_environments = ["production", "staging", "development"]
    if "environment" in monitor["tags"]:
        if monitor["tags"]["environment"] not in valid_environments:
            errors.append(f"Invalid environment: {monitor['tags']['environment']}")

    valid_criticalities = ["critical", "high", "medium", "low"]
    if "criticality" in monitor["tags"]:
        if monitor["tags"]["criticality"] not in valid_criticalities:
            errors.append(f"Invalid criticality: {monitor['tags']['criticality']}")

    return errors