Can I use UptimeRobot to monitor my .onion service?

No. UptimeRobot and similar hosted services probe from their own servers, which do not route through Tor and therefore cannot reach .onion addresses. The usual alternative is self-hosted monitoring: Prometheus Blackbox Exporter configured to send its probes through a local Tor daemon's SOCKS5 proxy.

Why do I see transient connection failures in monitoring even when my service is healthy?

Tor circuit establishment has inherent latency and occasionally fails. A probe can fail if it arrives while a new circuit is still being built (circuit establishment typically takes 1-5 seconds). Configure your monitoring tool to retry a failed probe 2-3 times before counting it as a failure, and require 3 consecutive failed probes before alerting. At a 5-minute probe interval, that works out to roughly 10 to 15 minutes of sustained failure before anyone is paged.
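The retry-then-count logic can be sketched in a few lines of Python. This is a minimal illustration, not the behavior of any particular monitoring tool: `probe_with_retries` and `DownDetector` are hypothetical names, and `probe` stands for whatever function performs a single check through Tor.

```python
import time
from dataclasses import dataclass
from typing import Callable


def probe_with_retries(probe: Callable[[], bool], attempts: int = 3,
                       delay_seconds: float = 0.0) -> bool:
    """Return True as soon as one attempt succeeds, False if all attempts fail."""
    for attempt in range(attempts):
        if probe():
            return True
        # Pause between attempts (but not after the last one) so Tor has
        # time to finish building a circuit.
        if delay_seconds and attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


@dataclass
class DownDetector:
    """Fire an alert only after `threshold` consecutive failed probe rounds."""
    threshold: int = 3
    consecutive_failures: int = 0

    def record(self, success: bool) -> bool:
        """Record one probe round; return True when an alert should fire."""
        if success:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

A scheduler would call `detector.record(probe_with_retries(check))` once per probe interval; a single successful round resets the failure count, so isolated circuit hiccups never page anyone.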

How do I alert myself when my .onion service goes down without revealing my server's IP?

Route alert notifications through Tor. Configure Alertmanager (the Prometheus alerting component) to send alerts through a Tor-routed webhook to a self-hosted notification endpoint that is itself reachable as a .onion service. Alternatively, deliver alerts over Signal, with Signal Desktop configured to use Tor's SOCKS proxy.

How much monitoring overhead does Prometheus scraping add to a busy .onion service?

Scraping the application metrics endpoint every 15 seconds adds minimal overhead, typically less than 1% of CPU and network capacity. A Blackbox Exporter probe every 5 minutes adds roughly one Tor circuit's worth of traffic per probe. Both are negligible compared to serving user requests.

Should I monitor from inside the .onion service or from a separate external machine?

Both have value. Internal monitoring (same server) detects application and Tor daemon issues quickly. External monitoring (separate machine probing the .onion service) validates full end-to-end availability including Tor circuit availability and introduction point functionality. Run both when resources allow.

Monitoring and Alerting for .onion Hidden Services

Production .onion services require monitoring to detect downtime, performance degradation, and security incidents before they impact users significantly. Unlike clearnet services, standard uptime monitoring tools cannot connect through Tor to verify .onion availability. Purpose-built monitoring infrastructure that uses Tor SOCKS proxy connections is necessary for accurate uptime tracking. This guide covers practical monitoring approaches for Tor hidden services.

Uptime Monitoring for .onion Services

Standard HTTP monitoring tools (UptimeRobot, Pingdom, StatusCake) cannot connect to .onion services because they do not route through Tor. Self-hosted uptime monitoring therefore requires a monitoring server with a running Tor daemon. Prometheus Blackbox Exporter can probe .onion HTTP endpoints through a SOCKS5 proxy and record the HTTP status and response time. Configure the Blackbox Exporter http module with proxy_url: socks5://127.0.0.1:9050 to route probes through Tor. The .onion hostname must be resolved by Tor rather than by local DNS, so confirm that your Blackbox Exporter version passes the hostname to the proxy instead of resolving it first (recent versions appear to provide a skip_resolve_phase_with_proxy option for this; check the documentation for your version). Set scrape_interval to 5 minutes to avoid overwhelming small hidden services with monitoring traffic. Alert only on probe failures that persist for 15 minutes or more, because circuit establishment delays cause transient failures that should not trigger pages.
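To see what "routing a probe through Tor" involves at the protocol level, here is a minimal standard-library sketch of a SOCKS5 reachability check. It is not how Blackbox Exporter is implemented, and it only confirms that Tor could open a TCP stream to the service (it sends no HTTP request). The function names are hypothetical, and the proxy address 127.0.0.1:9050 is assumed to be a local Tor SOCKS port.

```python
import socket
import struct
import time


def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request using the domain-name address type.

    Address type 0x03 sends the hostname itself to the proxy, so Tor
    (not local DNS) resolves the .onion name.
    """
    name = host.encode("ascii")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1-255 bytes")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain), length, name, port
    return b"\x05\x01\x00\x03" + bytes([len(name)]) + name + struct.pack(">H", port)


def probe_onion(host, port=80, proxy=("127.0.0.1", 9050), timeout=60.0):
    """Return (reachable, seconds_elapsed) for a TCP-level probe through Tor."""
    start = time.monotonic()
    try:
        with socket.create_connection(proxy, timeout=timeout) as sock:
            sock.sendall(b"\x05\x01\x00")  # greeting: offer "no authentication"
            if sock.recv(2) != b"\x05\x00":
                return False, time.monotonic() - start
            sock.sendall(socks5_connect_request(host, port))
            reply = sock.recv(10)
            # The second byte of the reply is the status; 0x00 means success.
            reachable = len(reply) >= 2 and reply[1] == 0x00
            return reachable, time.monotonic() - start
    except OSError:
        # Covers refused connections and timeouts alike.
        return False, time.monotonic() - start
```

The generous default timeout reflects that connecting to an onion service involves building several circuits; a probe that times out too eagerly would report false outages.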

Application-Level Metrics with Prometheus

Export application metrics from your hidden service on a local HTTP endpoint, and bind it to the loopback interface (for example 127.0.0.1:9090) to prevent external exposure. If Prometheus itself runs on the same host with its default listen port of 9090, choose a different port for the application. For Node.js applications, use the prom-client library. For Python (FastAPI/Flask), use prometheus_client. Useful metrics to expose include request rate, request latency percentiles (p50, p95, p99), error rate, database query time, and active session count. Scrape this endpoint from a Prometheus instance running on the same server or on a trusted local network. Build Grafana dashboards that show these application metrics alongside system metrics from node_exporter. For particularly sensitive services, where even the existence of a metrics endpoint should stay private, never expose the Prometheus UI on a clearnet address; reach it only over Tor, using a local Tor client.
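The loopback-only binding is the important part. The sketch below uses only the Python standard library as a stand-in for prometheus_client, so it shows the idea rather than that library's API: the metric names and port 9101 are arbitrary assumptions, and a real application would update the values from its request handlers.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric names and values, for illustration only.
METRICS = {
    "app_requests_total": 0.0,
    "app_errors_total": 0.0,
    "app_active_sessions": 0.0,
}


def render_metrics(metrics: dict) -> str:
    """Render metrics as 'name value' lines in the Prometheus text format."""
    return "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve_metrics(port: int = 9101) -> None:
    """Serve /metrics on the loopback interface only (blocks forever)."""
    # Binding to 127.0.0.1 rather than 0.0.0.0 keeps the endpoint off
    # every external network interface.
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve_metrics()` in a background thread would expose the endpoint at http://127.0.0.1:9101/metrics for a local Prometheus to scrape. In production, a maintained client library is the better choice because it also handles counters, histograms, and label escaping.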

Tor Circuit Quality Monitoring

Hidden service availability depends not only on application health but also on Tor circuit quality. Monitor the Tor daemon's health through its control port: the number of established circuits, the circuit build success rate, and introduction point availability. Use the Python stem library to query these values programmatically and export them to Prometheus. Alert when the introduction point count drops below 2 (the minimum for reliable reachability) or when the circuit build failure rate exceeds 10% over 5 minutes. Monitor the Tor daemon's memory usage as well: if Tor frequently reaches its MaxMemInQueues limit, the service is receiving more traffic than memory allows and Tor starts dropping circuits to reclaim memory. Size MaxMemInQueues to 60-70% of available RAM.
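The two alert conditions above can be expressed as a small pure function. This sketch assumes you have already collected circuit records over the last 5 minutes, for example from CIRC events received through the control port; `CircuitSample`, `summarize_circuits`, and `alerts_for` are hypothetical names, and counting built HS_SERVICE_INTRO circuits is used here as an approximation of the introduction point count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CircuitSample:
    status: str   # control-protocol circuit status, e.g. "BUILT" or "FAILED"
    purpose: str  # circuit purpose, e.g. "GENERAL" or "HS_SERVICE_INTRO"


def summarize_circuits(samples) -> dict:
    """Compute intro circuit count and build failure rate from circuit records."""
    built = sum(1 for s in samples if s.status == "BUILT")
    failed = sum(1 for s in samples if s.status == "FAILED")
    intro = sum(1 for s in samples
                if s.status == "BUILT" and s.purpose == "HS_SERVICE_INTRO")
    attempts = built + failed
    return {
        "intro_circuits": intro,
        "build_failure_rate": failed / attempts if attempts else 0.0,
    }


def alerts_for(summary: dict, min_intro: int = 2,
               max_failure_rate: float = 0.10) -> list:
    """Return the names of the alert conditions that currently hold."""
    alerts = []
    if summary["intro_circuits"] < min_intro:
        alerts.append("introduction_points_low")
    if summary["build_failure_rate"] > max_failure_rate:
        alerts.append("circuit_build_failures_high")
    return alerts
```

Keeping the thresholds as parameters makes it easy to tune them per service; the resulting numbers can also be exported as gauges so that Prometheus alert rules, rather than this code, decide when to page.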

Log-Based Alerting and Anomaly Detection

Nginx access logs for .onion services record every HTTP request path, status code, and response size. Because requests arrive from the local Tor daemon, every client IP appears as 127.0.0.1, so log analysis focuses on request patterns rather than IP attribution. Parse logs with Loki (the log aggregation backend for Grafana) or the ELK stack (Elasticsearch, Logstash, Kibana) for structured querying. Create alerts for: an HTTP 5xx error rate exceeding 1% of requests over 10 minutes (application errors), p95 response time exceeding 3 seconds over 5 minutes (performance degradation), unusual request patterns against administrative paths (potential scanning), and Tor daemon errors in /var/log/tor/notices.log.
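Before wiring up Loki or ELK, the 5xx-rate and admin-path checks can be prototyped directly against the log file. The sketch below assumes nginx's default "combined" log format; `summarize_access_log` is a hypothetical helper and the `/admin` prefix is only an example of an administrative path.

```python
import re

# Matches the request line and status code of nginx's "combined" log format.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def summarize_access_log(lines, admin_prefixes=("/admin",)) -> dict:
    """Count requests, 5xx responses, and hits on administrative paths."""
    total = errors = admin_hits = 0
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        total += 1
        if match["status"].startswith("5"):
            errors += 1
        if match["path"].startswith(admin_prefixes):
            admin_hits += 1
    return {
        "requests": total,
        "error_rate": errors / total if total else 0.0,
        "admin_hits": admin_hits,
    }
```

Feeding it the lines from the last 10 minutes and comparing `error_rate` against 0.01 reproduces the first alert rule; `admin_hits` gives a rough signal for scanning activity.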

Incident Response for .onion Services

Define incident severity levels and response procedures before incidents occur. Severity 1 (total outage): immediate investigation, root cause resolution within 4 hours. Severity 2 (partial degradation): investigation within 1 hour, resolution within 8 hours. Severity 3 (performance degradation): investigation within 4 hours, resolution within 24 hours.

Send on-call notifications through Signal or Telegram (both can be routed over Tor) rather than relying on email, which may be monitored.

When a Tor hidden service goes down, work through the stack in order: check Tor daemon status (systemctl status tor), check introduction points (stem query), check application health (a local HTTP request to the application port), check database connectivity, and check system resources (disk, memory, CPU). Document incident timelines and root causes for post-incident review.
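The ordered checklist lends itself to a small triage script that runs every check and reports the first layer that fails. The following is a sketch under stated assumptions: the helper names are hypothetical, and the ports in `EXAMPLE_CHECKS` (9050 for Tor's SOCKS listener, 8080 for the application, 5432 for the database) are placeholders to adapt to your own deployment.

```python
import shutil
import socket
from typing import Callable


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def disk_has_space(path: str = "/", min_free_fraction: float = 0.10) -> bool:
    """Return True if at least `min_free_fraction` of the disk is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction


def run_triage(checks: list) -> list:
    """Run (name, check) pairs in order; a check that raises counts as failed."""
    results = []
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False
        results.append((name, ok))
    return results


def first_failure(results: list):
    """Return the name of the first failed check, or None if all passed."""
    return next((name for name, ok in results if not ok), None)


# Placeholder checks, ordered from the Tor layer down to system resources.
EXAMPLE_CHECKS: list = [
    ("tor socks port", lambda: port_open("127.0.0.1", 9050)),
    ("application port", lambda: port_open("127.0.0.1", 8080)),
    ("database port", lambda: port_open("127.0.0.1", 5432)),
    ("disk space", lambda: disk_has_space("/")),
]
```

Running `first_failure(run_triage(EXAMPLE_CHECKS))` during an incident points at the layer to investigate first, and saving the full result list gives a timestamped starting point for the incident timeline.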
