Uptime Monitoring for Tor Hidden Services in 2026

Hidden service availability is harder to monitor than clearnet availability because standard hosted monitoring tools (UptimeRobot, Pingdom) cannot reach .onion addresses. Reliable monitoring for .onion services requires agents that communicate over Tor, aggregation of results across multiple circuits to distinguish Tor network degradation from actual service downtime, and alert delivery that does not reveal the monitoring operator's identity. This guide covers self-hosted uptime monitoring for hidden services: synthetic checks, alerting channels, and integration of monitoring data with your incident response workflow. For hosting providers offering .onion services, monitoring every hosted .onion from multiple vantage points is essential for maintaining SLAs and detecting silent failures before clients report them.

Monitoring Architecture for .onion Services

A complete monitoring setup for hidden services involves: (1) monitoring agents distributed across multiple geographic locations and Tor circuits, (2) a central aggregation server that collects check results and computes availability, (3) an alerting engine that fires only when enough agents report failure (to filter out Tor network blips), and (4) a dashboard for historical availability visualization. Each agent must route its traffic through its own Tor instance rather than a shared one, so that every check uses independent circuits. Give each instance a unique DataDirectory (and, when several instances share a host, a unique SocksPort) to avoid circuit sharing. Use Python with the requests library and PySocks (the maintained successor to SocksiPy) for Tor-routed HTTP requests, and use the socks5h:// proxy scheme so that Tor, not the local resolver, resolves the .onion hostname. Each check is an HTTP GET request to the .onion address that records the response code, the response time, and a content verification result (whether an expected string appears in the response body).
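The agent-side check described above might look roughly like the following. This is a minimal sketch, not a definitive implementation: the names CheckResult, tor_proxies, and check_onion are hypothetical, and it assumes the agent's Tor instance listens on a local SOCKS port and that requests is installed with SOCKS support.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CheckResult:
    """Outcome of one synthetic check (hypothetical record layout)."""
    agent_id: str
    url: str
    ok: bool
    status_code: Optional[int]
    elapsed_s: float
    error: Optional[str] = None


def tor_proxies(socks_port: int) -> Dict[str, str]:
    """Proxy map for requests; socks5h makes Tor resolve the .onion name."""
    addr = f"socks5h://127.0.0.1:{socks_port}"
    return {"http": addr, "https": addr}


def check_onion(agent_id: str, url: str, socks_port: int,
                expected: str, timeout_s: float = 60.0) -> CheckResult:
    """GET the URL through this agent's Tor instance and record the outcome."""
    # Imported here so the helpers above work without requests installed.
    import requests  # needs the SOCKS extra: pip install "requests[socks]"

    start = time.monotonic()
    try:
        resp = requests.get(url, proxies=tor_proxies(socks_port),
                            timeout=timeout_s)
    except requests.RequestException as exc:
        # Circuit failures, timeouts, and proxy errors all land here.
        return CheckResult(agent_id, url, False, None,
                           time.monotonic() - start, str(exc))
    elapsed = time.monotonic() - start
    # Content verification: status 200 and the expected string in the body.
    ok = resp.status_code == 200 and expected in resp.text
    return CheckResult(agent_id, url, ok, resp.status_code, elapsed)
```

The 60-second default timeout is an assumption chosen to leave room for slow circuit establishment; tune it to the latency you actually observe.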

Multi-Circuit Check Consensus Algorithm

A single Tor circuit failure does not mean the hidden service is down: Tor circuit establishment fails frequently under normal conditions. Implement a consensus algorithm that declares the service 'down' only if 3 or more agents on different circuits report failure within the same 5-minute window. One or two agent failures in a window are classified as 'Tor network degradation' and do not trigger an outage alert. This requires coordinating check timing across agents, so schedule checks to run within the same 5-minute window and compare the results. Shorten the check interval when failures appear: check every 5 minutes normally and every 1 minute during a suspected outage. Calculate availability as (total minutes - minutes confirmed down) / total minutes, expressed as a percentage. A 5-minute check interval produces 288 windows per day (105,120 per year), so the smallest outage the system can measure is one 5-minute window: a single confirmed-down window lowers that day's availability to about 99.65% and a 30-day month's to about 99.988%.
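The consensus rule and the availability formula above can be sketched as follows. The names WindowState, classify_window, next_check_interval_s, and availability_pct are illustrative, and treating any window with at least one failure as a 'suspected outage' for scheduling purposes is an assumption, not something the rule itself fixes.

```python
from enum import Enum
from typing import Iterable, Mapping


class WindowState(Enum):
    UP = "up"
    DEGRADED = "tor_network_degradation"
    DOWN = "down"


def classify_window(agent_ok: Mapping[str, bool],
                    down_threshold: int = 3) -> WindowState:
    """Classify one 5-minute window from per-agent results (True = check passed)."""
    failures = sum(1 for ok in agent_ok.values() if not ok)
    if failures >= down_threshold:
        return WindowState.DOWN
    if failures > 0:
        # One or two failures: blame the Tor network, not the service.
        return WindowState.DEGRADED
    return WindowState.UP


def next_check_interval_s(state: WindowState) -> int:
    """5 minutes when healthy, 1 minute while any agent is failing (assumption)."""
    return 300 if state is WindowState.UP else 60


def availability_pct(states: Iterable[WindowState],
                     window_min: int = 5) -> float:
    """(total minutes - minutes confirmed down) / total minutes, as a percentage."""
    states = list(states)
    if not states:
        raise ValueError("no windows to evaluate")
    total_min = len(states) * window_min
    # Only confirmed-down windows count against availability.
    down_min = sum(window_min for s in states if s is WindowState.DOWN)
    return 100.0 * (total_min - down_min) / total_min
```

For example, a day with 287 healthy windows and one confirmed-down window gives 287/288, or roughly 99.65% availability, while degraded windows leave the figure untouched.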

Alert Delivery Channels That Preserve Privacy

Alerts must reach the operator over channels that do not reveal the operator's identity or location. Options: (1) Tor-routed email via SMTP over Tor, sent either through a .onion SMTP server or through a Tor exit relay to a clearnet SMTP service, (2) Matrix or XMPP over Tor, sending alert messages to a monitored room or chat, (3) a webhook to a .onion receiver, posting JSON alert payloads to a .onion webhook endpoint, (4) Tor-routed PagerDuty API calls (PagerDuty's API is reachable over Tor via exit relays). For on-call rotation where multiple people share monitoring responsibility, a Matrix room over Tor provides group alerting with message history. Escalation logic: alert after 2 confirmed-down checks, then escalate to additional contacts after 10 minutes without acknowledgment.
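The escalation logic above can be expressed as a small decision function. This is a sketch under assumptions: the function names, the contact lists, and the JSON field names are all hypothetical, and it assumes that an acknowledged incident stops further notifications.

```python
import json
from datetime import datetime, timedelta
from typing import List, Optional

ALERT_AFTER_DOWN_CHECKS = 2               # alert after 2 confirmed-down checks
ESCALATE_AFTER = timedelta(minutes=10)    # escalate if still unacknowledged


def alert_recipients(confirmed_down_checks: int,
                     first_alert_at: Optional[datetime],
                     acknowledged: bool,
                     now: datetime,
                     primary: List[str],
                     escalation: List[str]) -> List[str]:
    """Return who should be notified right now (empty list = nobody)."""
    if confirmed_down_checks < ALERT_AFTER_DOWN_CHECKS or acknowledged:
        return []
    if first_alert_at is not None and now - first_alert_at >= ESCALATE_AFTER:
        # Nobody acknowledged within 10 minutes: widen the audience.
        return primary + escalation
    return primary


def build_alert_payload(service: str, confirmed_down_checks: int,
                        now: datetime) -> str:
    """JSON body for a webhook receiver; the field names are illustrative."""
    return json.dumps({
        "service": service,
        "state": "down",
        "confirmed_down_checks": confirmed_down_checks,
        "sent_at": now.isoformat(),
    }, sort_keys=True)
```

The payload can then be delivered over whichever Tor-routed channel is in use, for example an HTTP POST through the agent's SOCKS proxy to a .onion webhook.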

Content Verification and Functional Monitoring

An HTTP 200 response is necessary but not sufficient to conclude that the service is working, because a broken backend can return 200 with an error page. Implement content verification: after receiving a 200 response, check the response body for expected strings (page title, specific content markers, API response fields). For dynamic services, check for the absence of error indicators ('500 Internal Server Error', 'Database connection failed', 'maintenance mode') rather than for the presence of specific content. For API hidden services, make test API calls with known inputs and verify the expected outputs. For login-protected hidden services, implement a synthetic user login (store test credentials securely, execute the login flow, check for authenticated state). This catches authentication failures, database disconnections, and business logic errors that HTTP status codes miss.
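A body check along these lines could be written as below. It is a minimal sketch: verify_response and DEFAULT_ERROR_MARKERS are hypothetical names, and the case-insensitive matching is an assumption rather than a requirement of the approach.

```python
from typing import Iterable, List, Tuple

# Error indicators whose presence means the backend is broken.
DEFAULT_ERROR_MARKERS = (
    "500 Internal Server Error",
    "Database connection failed",
    "maintenance mode",
)


def verify_response(status_code: int, body: str,
                    required: Iterable[str] = (),
                    forbidden: Iterable[str] = DEFAULT_ERROR_MARKERS,
                    ) -> Tuple[bool, List[str]]:
    """Return (ok, problems) for one HTTP response."""
    problems: List[str] = []
    if status_code != 200:
        problems.append(f"unexpected status {status_code}")
    lowered = body.lower()  # case-insensitive matching (assumption)
    for marker in required:
        if marker.lower() not in lowered:
            problems.append(f"missing expected text: {marker!r}")
    for marker in forbidden:
        if marker.lower() in lowered:
            problems.append(f"found error marker: {marker!r}")
    return (not problems, problems)
```

Static pages would pass a required list such as the page title, while dynamic services can leave required empty and rely on the forbidden markers alone.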

SLA Reporting and Historical Availability

Generate monthly SLA reports from monitoring data: calculate uptime percentage, MTTD (mean time to detect), MTTR (mean time to resolve), and incident count. Store monitoring results in a time-series database (InfluxDB, TimescaleDB, or Prometheus with long retention) for historical analysis. Grafana connected to an InfluxDB datasource over Tor provides visualization that can be reached from Tor Browser. For customer-facing SLA commitments, define the methodology clearly: how many agents must agree (minimum 3), the measurement resolution (5 minutes), and which endpoints are checked. Publish a public status page as a separate .onion site: a simple Nginx static site updated by a cron script that writes availability percentages from the monitoring database. The status page's .onion address should be different from that of the monitored service.
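The report metrics above can be computed from a list of incident records, as in this sketch. The Incident layout and the sla_report name are hypothetical, and it assumes that incidents do not overlap, that they fall inside the reporting period, and that MTTR is measured from the estimated outage start rather than from detection.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class Incident:
    started_at: datetime    # estimated outage start, e.g. last passing check
    detected_at: datetime   # when the consensus rule confirmed 'down'
    resolved_at: datetime   # first window confirmed 'up' again


def _minutes(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 60.0


def sla_report(incidents: List[Incident], period_start: datetime,
               period_end: datetime) -> Dict[str, float]:
    """Uptime percentage, incident count, MTTD and MTTR (minutes) for a period."""
    period_min = _minutes(period_start, period_end)
    if period_min <= 0:
        raise ValueError("period_end must be after period_start")
    count = len(incidents)
    down_min = sum(_minutes(i.started_at, i.resolved_at) for i in incidents)
    detect_min = sum(_minutes(i.started_at, i.detected_at) for i in incidents)
    return {
        "uptime_pct": 100.0 * (period_min - down_min) / period_min,
        "incident_count": count,
        # With no incidents there is nothing to average; report 0.0.
        "mttd_min": detect_min / count if count else 0.0,
        "mttr_min": down_min / count if count else 0.0,
    }
```

A cron script for the status page could call a function like this once per run and write the resulting percentages into the static site's HTML.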
