Monitoring and Alerting for .onion Hidden Services
Production .onion services need monitoring to detect downtime, performance degradation, and security incidents before they significantly affect users. Standard uptime monitoring tools built for clearnet services do not route through Tor, so they cannot verify .onion availability. Accurate uptime tracking requires purpose-built monitoring that connects through a Tor SOCKS proxy. This guide covers practical monitoring approaches for Tor hidden services.
Uptime Monitoring for .onion Services
Hosted HTTP monitoring tools (UptimeRobot, Pingdom, StatusCake) cannot reach .onion services because their probes do not route through Tor. Self-hosted uptime monitoring therefore needs a monitoring server running its own Tor daemon. Prometheus Blackbox Exporter can probe .onion HTTP endpoints through a SOCKS5 proxy and record HTTP status and response time: set proxy_url: socks5://127.0.0.1:9050 in the http module so that probes go through the local Tor SOCKS port. Set scrape_interval to 5 minutes to avoid overwhelming small hidden services with monitoring traffic. Alert only when probe failures persist for 15 minutes or more; circuit establishment delays cause transient failures that should not trigger pages.
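The probe-and-debounce behaviour described above can also be sketched in plain Python, which may help when testing reachability before wiring up the Blackbox Exporter. This is a minimal sketch, not a replacement for the exporter: it assumes the requests library with SOCKS support is installed and that Tor listens on 127.0.0.1:9050. The names probe and FailureTracker are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional, Tuple

# socks5h (not socks5) makes the proxy resolve the hostname, which .onion needs.
TOR_SOCKS = "socks5h://127.0.0.1:9050"
PROBE_INTERVAL_S = 300   # 5 minutes between probes
ALERT_AFTER_S = 900      # page only after 15 minutes of continuous failure


def probe(url: str, timeout: float = 60.0) -> Tuple[bool, float]:
    """Fetch url through Tor; return (success, elapsed_seconds)."""
    import requests  # lazy import; assumes `pip install requests[socks]`

    start = time.monotonic()
    try:
        resp = requests.get(
            url,
            proxies={"http": TOR_SOCKS, "https": TOR_SOCKS},
            timeout=timeout,
        )
        ok = resp.status_code < 400
    except requests.RequestException:
        # Covers timeouts, refused connections, and missing SOCKS support.
        ok = False
    return ok, time.monotonic() - start


@dataclass
class FailureTracker:
    """Suppresses alerts until failures have persisted long enough."""
    alert_after_s: float = ALERT_AFTER_S
    failing_since: Optional[float] = None

    def update(self, ok: bool, now: float) -> bool:
        """Record one probe result; return True when an alert should fire."""
        if ok:
            self.failing_since = None
            return False
        if self.failing_since is None:
            self.failing_since = now
        return now - self.failing_since >= self.alert_after_s
```

A monitoring loop would call probe() every PROBE_INTERVAL_S seconds and pass the result to FailureTracker.update() with time.monotonic() as the timestamp. A single successful probe resets the failure window.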
Application-Level Metrics with Prometheus
Export application metrics from your hidden service on a local Prometheus metrics endpoint bound to loopback (for example 127.0.0.1:9090) to prevent external exposure. Note that 9090 is also the default port of the Prometheus server itself, so pick a different port for one of them if both run on the same host. For Node.js applications, use the prom-client library. For Python (FastAPI/Flask), use prometheus_client. Expose request rate, request latency percentiles (p50, p95, p99), error rate, database query time, and active session count. Scrape this endpoint from a Prometheus instance on the same server; scraping from another host on a local network requires binding to a private interface or tunneling rather than loopback. Configure Grafana dashboards that show these application metrics alongside system metrics from node_exporter. For particularly sensitive services where even the existence of a metrics endpoint should be private, access the Prometheus UI only over Tor through a local Tor client instead of exposing it on any network interface.
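For a Python application, the metrics endpoint might look like the sketch below. It uses prometheus_client's Counter, Histogram, Gauge, and start_http_server. The metric names, the bucket boundaries, and port 9464 are assumptions chosen for illustration, so substitute whichever loopback port you use.

```python
METRICS_ADDR = "127.0.0.1"  # loopback only, never a public interface
METRICS_PORT = 9464         # hypothetical port; any free loopback port works


def create_metrics() -> dict:
    """Create the metric objects (assumes `pip install prometheus_client`)."""
    from prometheus_client import Counter, Gauge, Histogram  # lazy import

    return {
        "requests": Counter(
            "app_requests_total", "HTTP requests handled", ["status_class"]
        ),
        # Percentiles (p50/p95/p99) are derived from these buckets in PromQL.
        "latency": Histogram(
            "app_request_seconds",
            "Request latency in seconds",
            buckets=(0.1, 0.25, 0.5, 1, 2, 3, 5, 10),
        ),
        "sessions": Gauge("app_active_sessions", "Currently active sessions"),
    }


def status_class(code: int) -> str:
    """Collapse a status code to 2xx/4xx/5xx to keep label cardinality low."""
    return f"{code // 100}xx"


def record_request(metrics: dict, status_code: int, seconds: float) -> None:
    """Call once per handled request, e.g. from framework middleware."""
    metrics["requests"].labels(status_class=status_class(status_code)).inc()
    metrics["latency"].observe(seconds)


def serve_metrics() -> dict:
    """Start the /metrics HTTP endpoint on loopback and return the metrics."""
    from prometheus_client import start_http_server

    metrics = create_metrics()
    start_http_server(METRICS_PORT, addr=METRICS_ADDR)
    return metrics
```

Calling serve_metrics() once at application startup and record_request() after each response is enough for Prometheus to compute request rate, error rate, and latency percentiles. Database query time would follow the same Histogram pattern.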
Tor Circuit Quality Monitoring
Hidden service availability depends on Tor circuit quality as well as application health. Monitor the Tor daemon through its control port: number of established circuits, circuit build success rate, and introduction point availability. The Python stem library can query these values programmatically so they can be exported to Prometheus. Alert when the introduction point count drops below 2 (the minimum for reliable reachability) or when the circuit build failure rate exceeds 10% over 5 minutes. Monitor the Tor daemon's memory usage as well: MaxMemInQueues caps the memory Tor uses for queued data, and if that limit is reached frequently, the service is receiving more traffic than memory allows and Tor starts closing circuits. Size MaxMemInQueues to 60-70% of available RAM.
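A rough sketch of the control-port checks using stem follows. It assumes the control port is enabled on 9051 with cookie authentication, and it treats the number of built HS_SERVICE_INTRO circuits as a proxy for introduction point count, which is an approximation rather than an official metric. Build failure counts are taken as inputs here; collecting them would require listening to CIRC events, which this sketch omits. CircuitInfo and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_INTRO_CIRCUITS = 2     # alert threshold for introduction circuits
MAX_BUILD_FAILURE = 0.10   # 10% failure rate over the sampling window


@dataclass(frozen=True)
class CircuitInfo:
    status: str    # e.g. "BUILT", "EXTENDED", "FAILED"
    purpose: str   # e.g. "GENERAL", "HS_SERVICE_INTRO"


def read_circuits(control_port: int = 9051) -> List[CircuitInfo]:
    """Snapshot Tor's current circuits (assumes `pip install stem`)."""
    from stem.control import Controller  # lazy import

    with Controller.from_port(port=control_port) as controller:
        # Cookie auth by default; pass password=... if torrc uses a password.
        controller.authenticate()
        return [
            CircuitInfo(str(c.status), str(c.purpose))
            for c in controller.get_circuits()
        ]


def summarize(circuits: Iterable[CircuitInfo]) -> dict:
    """Count built circuits overall and those serving as intro circuits."""
    built = [c for c in circuits if c.status == "BUILT"]
    intro = [c for c in built if c.purpose == "HS_SERVICE_INTRO"]
    return {"built": len(built), "intro_built": len(intro)}


def build_failure_rate(built: int, failed: int) -> float:
    """Fraction of circuit build attempts in the window that failed."""
    total = built + failed
    return failed / total if total else 0.0


def alerts(summary: dict, failure_rate: float) -> List[str]:
    """Return the names of any thresholds that are currently breached."""
    out = []
    if summary["intro_built"] < MIN_INTRO_CIRCUITS:
        out.append("intro_points_low")
    if failure_rate > MAX_BUILD_FAILURE:
        out.append("circuit_build_failures_high")
    return out
```

The values from summarize() can be published as Prometheus gauges so the thresholds live in alerting rules instead of in this script. If one Tor daemon hosts several onion services, the intro circuit count is an aggregate across all of them.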
Log-Based Alerting and Anomaly Detection
Nginx access logs for .onion services record every HTTP request path, status code, and response size. Because Tor hands each request to Nginx over a local connection, every client IP appears as 127.0.0.1, so log analysis focuses on request patterns rather than IP attribution. Ship logs to Loki (the log aggregation backend for Grafana) or the ELK stack (Elasticsearch, Logstash, Kibana) for structured querying. Create alerts for: HTTP 5xx responses exceeding 1% of requests over 10 minutes (application errors), p95 response time exceeding 3 seconds over 5 minutes (performance degradation), unusual request patterns to administrative paths (potential scanning), and Tor daemon errors in /var/log/tor/notices.log. The response time alert requires $request_time in the Nginx log format, which the default combined format does not include.
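The error-rate and latency rules can be prototyped offline before committing them to Loki or ELK queries. The sketch below assumes a log format equal to Nginx's combined format with $request_time appended as the last field, and it assumes the caller passes only the lines that fall inside the alert window. The admin path prefixes and the hit threshold are hypothetical placeholders.

```python
import math
import re
from typing import Iterable, List, NamedTuple, Optional

# Assumed format: combined log format followed by $request_time in seconds.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ .* (?P<request_time>\d+\.\d+)$'
)
ADMIN_PREFIXES = ("/admin", "/wp-login")  # hypothetical sensitive paths
ADMIN_HITS_THRESHOLD = 20                 # hypothetical per-window limit


class Hit(NamedTuple):
    path: str
    status: int
    request_time: float


def parse_line(line: str) -> Optional[Hit]:
    """Parse one access log line; return None if it does not match."""
    m = LINE_RE.search(line.strip())
    if m is None:
        return None
    return Hit(m["path"], int(m["status"]), float(m["request_time"]))


def error_rate(hits: Iterable[Hit]) -> float:
    """Fraction of requests answered with a 5xx status."""
    hits = list(hits)
    if not hits:
        return 0.0
    return sum(h.status >= 500 for h in hits) / len(hits)


def p95(values: Iterable[float]) -> float:
    """Nearest-rank 95th percentile; 0.0 for an empty window."""
    ordered = sorted(values)
    if not ordered:
        return 0.0
    return ordered[math.ceil(0.95 * len(ordered)) - 1]


def evaluate(hits: Iterable[Hit]) -> List[str]:
    """Apply the alert thresholds to one window of parsed log lines."""
    hits = list(hits)
    fired = []
    if error_rate(hits) > 0.01:
        fired.append("5xx_rate_high")
    if p95(h.request_time for h in hits) > 3.0:
        fired.append("p95_latency_high")
    admin_hits = sum(h.path.startswith(ADMIN_PREFIXES) for h in hits)
    if admin_hits > ADMIN_HITS_THRESHOLD:
        fired.append("admin_path_scanning")
    return fired
```

Lines that parse_line() cannot match return None and should be filtered out before calling evaluate(). A rising share of unparsed lines is itself worth watching, since it usually means the log format changed.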
Incident Response for .onion Services
Define incident severity levels and response procedures before incidents occur. Severity 1 (total outage): immediate investigation, root cause resolved within 4 hours. Severity 2 (partial degradation): investigation within 1 hour, resolution within 8 hours. Severity 3 (performance degradation): investigation within 4 hours, resolution within 24 hours. Sending on-call notifications through Signal or Telegram (both can be routed over Tor) avoids email, which might be monitored. When a Tor hidden service goes down, work through the stack in order: check Tor daemon status (systemctl status tor), check introduction points (stem query), check application health (local HTTP request to the application port), check database connectivity, and check system resources (disk, memory, CPU). Document incident timelines and root causes for post-incident review.
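Part of the outage checklist can be scripted so the first responder gets the same ordered checks every time. The sketch below covers only the Tor daemon, application, and disk checks; the introduction point and database checks are left out because they depend on your stack. It assumes a systemd unit named tor and an application listening on 127.0.0.1:8080, and both names are placeholders.

```python
import shutil
import subprocess
import urllib.request
from typing import Callable, List, Optional, Tuple

APP_URL = "http://127.0.0.1:8080/"  # hypothetical local application port


def tor_active() -> bool:
    """True if systemd reports the tor unit as active."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", "tor"])
    return result.returncode == 0


def app_healthy() -> bool:
    """True if the application answers locally without an error status."""
    try:
        with urllib.request.urlopen(APP_URL, timeout=5) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # urllib raises HTTPError/URLError (both OSError) on 4xx/5xx or refusal.
        return False


def disk_ok(path: str = "/", min_free_fraction: float = 0.10) -> bool:
    """True if at least min_free_fraction of the filesystem is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction


def run_triage(
    checks: List[Tuple[str, Callable[[], bool]]]
) -> List[Tuple[str, bool]]:
    """Run checks in order; a check that raises counts as failed."""
    results = []
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False
        results.append((name, ok))
    return results


def first_failure(results: List[Tuple[str, bool]]) -> Optional[str]:
    """Name of the first failed check, or None if everything passed."""
    return next((name for name, ok in results if not ok), None)


DEFAULT_CHECKS = [
    ("tor daemon", tor_active),
    ("application", app_healthy),
    ("disk space", disk_ok),
]
```

Running first_failure(run_triage(DEFAULT_CHECKS)) points at the lowest layer that is broken, which is usually the right place to start. All checks run even after a failure so the full result list can go straight into the incident timeline.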