Tor Relay Monitoring with Prometheus and Grafana: Complete Setup Guide

Operating a Tor relay without monitoring is operating blind: you may miss bandwidth throttling by your VPS provider, a relay shutdown caused by resource exhaustion, or a declining consensus weight that signals a problem. Prometheus paired with Grafana provides production-grade observability for Tor relays: time-series system metrics from node_exporter, Tor-specific metrics collected through the Tor ControlPort, and dashboards that give immediate visibility into relay performance and health. This guide covers installing Prometheus and Grafana on a relay server or a central monitoring host, configuring Tor-specific metrics collection, building useful dashboards, and setting up alerts for relay health issues.

Prometheus and node_exporter Installation

Install Prometheus for metrics collection and node_exporter for system metrics. On Debian/Ubuntu: apt install prometheus prometheus-node-exporter. The Prometheus server scrapes each metrics endpoint on a configured interval; the built-in default for scrape_interval is 1m, and many example configurations, including the one shipped with Prometheus, set it to 15s. node_exporter exposes system metrics at localhost:9100/metrics, including CPU usage, memory, disk I/O, network interface statistics (bytes in and out per interface), file descriptor counts, and load averages. For relay monitoring the network metrics matter most: node_network_receive_bytes_total and node_network_transmit_bytes_total are per-interface counters, so taking rate() over them gives actual bandwidth usage independent of Tor's internal measurement. To configure the scrape target, add a job named 'node' under scrape_configs in prometheus.yml with static_configs targets ['localhost:9100'] (the Debian package may already ship such a job, so check the existing file first). To verify, prometheus --version should print the installed version and curl localhost:9100/metrics should return metric output.
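As a minimal sketch of the scrape configuration described above, assuming the Debian package's config path and an interval of 15s set explicitly rather than left at the default:

```yaml
# /etc/prometheus/prometheus.yml (path assumed from the Debian package layout)
global:
  scrape_interval: 15s      # set explicitly; the built-in default is 1m
  evaluation_interval: 15s  # how often alerting rules are evaluated

scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']   # node_exporter on the same host
```

Running promtool check config against the file before restarting Prometheus catches YAML and schema mistakes early.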

Tor ControlPort Metrics Collection

Tor exposes operational data through its ControlPort, conventionally TCP port 9051. The ControlPort is not enabled by default, so set it in torrc: ControlPort 9051, plus a HashedControlPassword line containing the hash printed by tor --hash-password yourpassword. Keep the port bound to localhost, because anyone who can authenticate to it can reconfigure the relay. A custom exporter or a tool such as tor-exporter (a Python package) queries the ControlPort and exposes the results as Prometheus metrics. Install it with pip3 install tor-exporter (confirm the exact package name for the exporter you choose), run it as a service, and add its listen address as a second scrape target in prometheus.yml. Key metrics exposed: tor_bandwidth_written_bytes and tor_bandwidth_read_bytes (relay-reported bandwidth), tor_consensus_weight (the relay's weight in the Tor directory consensus), tor_flags (current flags such as Guard, HSDir, Stable, Running), tor_circuit_count, and tor_or_conn_count. Exact metric names differ between exporters, so check the exporter's /metrics output before building queries. These Tor-native metrics complement node_exporter's system-level view.
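Once the exporter is running, Prometheus needs a scrape job for it. The fragment below is a sketch only: the job name 'tor' and the listen port 9099 are assumptions, so substitute whatever address your exporter actually binds to.

```yaml
# Entry to add under the existing scrape_configs list in prometheus.yml
scrape_configs:
  - job_name: 'tor'                    # job name is an arbitrary choice
    static_configs:
      - targets: ['localhost:9099']    # assumed exporter listen port; adjust to your exporter
```

After a reload, the query up{job="tor"} should return 1 if the exporter is reachable.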

Grafana Dashboard Configuration

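Install Grafana: apt install grafana. Grafana is not in the default Debian/Ubuntu repositories, so add Grafana's APT repository first. Enable and start it: systemctl enable grafana-server && systemctl start grafana-server. Access it at localhost:3000 (default credentials admin/admin; Grafana prompts for a new password on first login). Add Prometheus as a data source: Configuration > Data Sources > Add data source > Prometheus (recent Grafana versions place this under Connections > Data sources), and set the URL to http://localhost:9090. Create a relay performance dashboard with these key panels: (1) Bandwidth: rate(node_network_receive_bytes_total[5m]) * 8 gives receive throughput in bits per second; add the matching node_network_transmit_bytes_total query and a device label filter so the loopback interface is excluded. (2) CPU: rate(node_cpu_seconds_total{mode='user'}[5m]) shows user-space CPU consumption per core, as a fraction of one core; for overall utilization use 100 * (1 - avg(rate(node_cpu_seconds_total{mode='idle'}[5m]))). (3) Memory: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 for the percentage of memory available. (4) Consensus weight: tor_consensus_weight over time; a sustained decline indicates relay health issues. (5) Tor flags: tor_flags, showing whether the Guard, HSDir, and Stable flags are currently set.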
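The data source can also be provisioned from a file instead of the UI, which helps when rebuilding a monitoring host. A minimal sketch, assuming Grafana's standard provisioning directory and an arbitrary file name:

```yaml
# /etc/grafana/provisioning/datasources/prometheus.yml (file name is an arbitrary choice)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend proxies queries to Prometheus
    url: http://localhost:9090    # same URL as in the UI steps above
    isDefault: true
```

Grafana reads provisioning files at startup, so restart grafana-server after adding the file.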

Alert Configuration for Relay Health

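Configure Prometheus alerting rules to notify on relay health issues. Create /etc/prometheus/relay_alerts.yml and list it under rule_files in prometheus.yml. Prometheus 2.x and later read rules as YAML groups in which each rule has alert, expr, and for fields; the older ALERT ... IF ... FOR syntax from Prometheus 1.x is no longer accepted. Four useful rules: RelayBandwidthLow with expr rate(node_network_receive_bytes_total[1h]) < 50000 and for 30m (fires when received traffic stays below 50,000 bytes/s, which is 400 kbit/s, for 30 minutes); RelayDown with expr up{job='node'} == 0 and for 5m (node_exporter is unreachable, which indicates a server or service problem); HighCPU with expr rate(node_cpu_seconds_total{mode='user'}[5m]) > 0.9 and for 15m (evaluated per core, so it fires when any single core spends more than 90% of its time in user mode); LowMemory with expr node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.1 and for 5m. Validate the file with promtool check rules /etc/prometheus/relay_alerts.yml. Configure Alertmanager with a webhook or email notification channel. For relay operators, Telegram is practical: Alertmanager 0.24 and later include a native telegram_configs receiver, while older versions need a webhook_configs receiver pointing at a small bridge service, because the Telegram Bot API does not accept Alertmanager's webhook payload directly.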
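In the YAML rule format used by Prometheus 2.x and later, the four alerts could be sketched as follows. The group name, the severity labels, the annotation text, and the eth0 device filter are illustrative assumptions. The HighCPU expression averages across cores, which differs from a plain per-core rate() so that one busy core does not fire the alert on its own.

```yaml
# /etc/prometheus/relay_alerts.yml
groups:
  - name: relay_health            # group name is an arbitrary choice
    rules:
      - alert: RelayBandwidthLow
        # 50000 bytes/s = 400 kbit/s; eth0 is an assumed interface name
        expr: rate(node_network_receive_bytes_total{device="eth0"}[1h]) < 50000
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Relay {{ $labels.instance }} receive rate below 400 kbit/s"

      - alert: RelayDown
        expr: up{job="node"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "node_exporter on {{ $labels.instance }} is unreachable"

      - alert: HighCPU
        # user-mode CPU averaged over all cores of each instance
        expr: avg by (instance) (rate(node_cpu_seconds_total{mode="user"}[5m])) > 0.9
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "User CPU above 90% on {{ $labels.instance }}"

      - alert: LowMemory
        expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Less than 10% memory available on {{ $labels.instance }}"
```

The file takes effect once it is listed under rule_files in prometheus.yml and Prometheus is reloaded.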

Multi-Relay Monitoring Architecture

For relay families with multiple servers, centralize Prometheus on a dedicated monitoring host or on one of the relay servers. Each relay runs node_exporter and tor-exporter, and the central Prometheus needs a way to reach their metrics. There are three options: (1) Expose node_exporter and tor-exporter on the relay's public IP, with a firewall rule that allows only the Prometheus server's IP, and add a static_configs entry for each relay's address. (2) Use Prometheus federation: each relay runs a local Prometheus that scrapes its exporters on localhost, and the central server scrapes each relay's /federate endpoint. (3) Use remote_write: each relay runs a local Prometheus that pushes its samples to the central Prometheus, which must be started with --web.enable-remote-write-receiver to accept them. Options 1 and 2 require opening an inbound port on every relay; option 1 is the simplest to set up. Option 3 requires only outbound connectivity from each relay to the monitoring server. Build a fleet-level dashboard comparing bandwidth, consensus weight, and health across all relays side by side.
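For option 1, the central server's configuration might look like the sketch below. The addresses are placeholders from the 192.0.2.0/24 documentation range, the relay label values are hypothetical, and port 9099 for the Tor exporter is an assumption.

```yaml
# prometheus.yml on the central monitoring host
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['192.0.2.10:9100']   # placeholder address for the first relay
        labels:
          relay: 'relay1'              # hypothetical label used to group panels
      - targets: ['192.0.2.11:9100']   # placeholder address for the second relay
        labels:
          relay: 'relay2'

  - job_name: 'tor'
    static_configs:
      - targets: ['192.0.2.10:9099']   # assumed Tor exporter port
        labels:
          relay: 'relay1'
      - targets: ['192.0.2.11:9099']
        labels:
          relay: 'relay2'
```

Attaching the same relay label to both jobs lets a fleet dashboard join system and Tor metrics per relay, for example by grouping queries with by (relay).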
