Privacy-Preserving Log Analysis for Tor Hidden Service Operators
Logs are essential for debugging, performance analysis, and security incident investigation. For Tor hidden services, default web server logging creates a privacy risk with no operational benefit: Nginx access logs record the source IP of each request, but every request to a hidden service arrives from 127.0.0.1 (the local Tor daemon on the loopback interface), so IP-based analysis reveals nothing about client distribution while the log files themselves can still be seized or subpoenaed. The appropriate logging strategy for .onion services collects the data needed for operations while minimizing the creation of user-tracking data. This guide covers designing privacy-preserving log formats, configuring Nginx to log useful operational data for .onion services, log retention policies, and tools for analyzing logs without correlating data to individual users.
For .onion services, design logs around operational needs rather than default web server behavior. Log what is operationally necessary: timestamp (for correlation with incidents), HTTP status code (to detect errors and attacks), request path (for traffic analysis, without query parameters that may contain sensitive data), response size (for bandwidth analysis), response time (for performance analysis), and optionally the User-Agent string (useful for detecting non-Tor-Browser clients). Do not log: the source IP (always 127.0.0.1, useless), query parameters (may contain session tokens, search terms, or other sensitive data), request bodies (contain form data, sensitive user input), or Referer headers (can reveal browsing history). Configure Nginx with a custom log format that omits $remote_addr and $http_referer, then reference it from access_log: log_format onion_format '$time_iso8601 $status $request_method $uri $body_bytes_sent $request_time "$http_user_agent"'; access_log /var/log/nginx/access.log onion_format; Two details matter here. $uri is the normalized request path without the query string (unlike $request_uri, which includes it). $request_time is reported in seconds with millisecond resolution, so it should not be suffixed with "ms". The User-Agent is quoted because it contains spaces.
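As a rough illustration of what analysis on such a log can look like, the sketch below parses lines in the onion_format layout and counts requests per status code and per path. It is a minimal sketch under several assumptions: request paths contain no spaces, the request time may or may not carry an "ms" suffix, the User-Agent may or may not be quoted, and the sample lines are invented for the example. The names LINE_RE, parse_line, and summarize are hypothetical.

```python
import re
from collections import Counter

# Assumed field order: time, status, method, path, bytes, request time, user agent.
# The optional "ms" suffix and optional quotes are tolerated so that both
# variants of the format parse.
LINE_RE = re.compile(
    r'^(?P<time>\S+) (?P<status>\d{3}) (?P<method>\S+) (?P<path>\S+) '
    r'(?P<bytes>\d+) (?P<rtime>\d+(?:\.\d+)?)(?:ms)? (?P<ua>.*)$'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match the layout."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    return {
        "time": m["time"],
        "status": int(m["status"]),
        "method": m["method"],
        "path": m["path"],
        "bytes": int(m["bytes"]),
        "request_time": float(m["rtime"]),
        "user_agent": m["ua"].strip('"'),
    }


def summarize(lines):
    """Count requests per status and per path; no per-client data is involved."""
    status_counts = Counter()
    path_counts = Counter()
    skipped = 0
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            skipped += 1  # malformed or unexpected line
            continue
        status_counts[rec["status"]] += 1
        path_counts[rec["path"]] += 1
    return status_counts, path_counts, skipped


# Invented sample lines, for illustration only.
SAMPLE = [
    '2024-01-01T12:00:00+00:00 200 GET /index.html 5120 0.012 "ExampleBrowser/1.0"',
    '2024-01-01T12:00:01+00:00 404 GET /missing 153 0.001 "curl/8.0"',
    '2024-01-01T12:00:02+00:00 200 GET /index.html 5120 0.010 "-"',
]

if __name__ == "__main__":
    statuses, paths, skipped = summarize(SAMPLE)
    print(dict(statuses), paths.most_common(1), skipped)
```

Because the log carries no client identifier, every aggregate this produces is per-service rather than per-user, which is the point of the format.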
Tor Daemon Logging Configuration
The Tor daemon has its own logging configuration, separate from Nginx. In torrc, set the log level for production with Log notice file /var/log/tor/notices.log, which records important events without verbose debug output. The notice-level log records: hidden service descriptor publications (success/failure), introduction point establishment, circuit creation events, and Tor network connectivity issues. Avoid Log debug in production, as debug logs contain circuit-level detail that could be used to correlate activity timing. For incident investigation, temporarily raise the log level with Log info file /var/log/tor/info.log, reload Tor so the change takes effect, and revert to notice once troubleshooting is finished. Keep SafeLogging 1 (the default) so that sensitive values in logs are replaced with [scrubbed] placeholders. Retain Tor daemon logs for 30 days at most, because they contain timing information for hidden service descriptor events.
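When reviewing a notice-level log, it is often enough to know how many lines were logged at each severity and which ones were warnings or errors. The sketch below does that. It assumes each line looks like "Jan 01 12:00:00.000 [notice] message", which is how Tor's file logs commonly appear; the sample messages are placeholders, not real Tor output, and the names TOR_LINE_RE, severity_counts, and problems are hypothetical.

```python
import re
from collections import Counter

# Assumed line shape: "<Mon> <DD> <HH:MM:SS.mmm> [<level>] <message>"
TOR_LINE_RE = re.compile(
    r'^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}) '
    r'\[(?P<level>\w+)\] (?P<msg>.*)$'
)


def severity_counts(lines):
    """Count log lines per severity level; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        m = TOR_LINE_RE.match(line.rstrip("\n"))
        if m:
            counts[m["level"]] += 1
    return counts


def problems(lines):
    """Return (level, message) pairs for lines logged at warn or err."""
    found = []
    for line in lines:
        m = TOR_LINE_RE.match(line.rstrip("\n"))
        if m and m["level"] in ("warn", "err"):
            found.append((m["level"], m["msg"]))
    return found


# Placeholder lines for illustration; the message text is invented.
SAMPLE_TOR_LOG = [
    "Jan 01 12:00:00.000 [notice] example startup message",
    "Jan 01 12:00:05.000 [notice] example descriptor message",
    "Jan 01 12:01:00.000 [warn] example warning message",
    "Jan 01 12:02:00.000 [err] example error message",
]

if __name__ == "__main__":
    print(dict(severity_counts(SAMPLE_TOR_LOG)))
    for level, msg in problems(SAMPLE_TOR_LOG):
        print(level, msg)
```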
Application-Level Logging for Debugging
Application logs (Django, Node.js, PHP, etc.) should follow the same privacy principles: log exceptions and errors with stack traces (for debugging), log slow queries above a threshold (for performance tuning), log authentication failures with timestamps (for security monitoring), and log request processing time without user-identifying data. Remove user-identifying data from application logs by hashing the username or session ID before logging. Use HMAC-SHA256 keyed with a per-server secret, so the hash is useful for correlation within the same server but not linkable across servers; to make it unlinkable across time periods as well, rotate the secret on a schedule and destroy the old one. Format: log.info('Request processed', { user_hash: hmac_sha256(server_secret, user_id), duration_ms: 45, endpoint: '/api/search' }). This allows correlating events within a session for debugging without storing raw user identifiers.
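The keyed-hash approach can be sketched with Python's standard library. This is a minimal sketch, not a prescribed implementation: the function name pseudonymize, the 16-character truncation, and the in-memory secret are assumptions made for the example, and a real deployment would need to decide where the secret is stored and how often it is rotated.

```python
import hashlib
import hmac
import secrets


def pseudonymize(user_id: str, server_secret: bytes, length: int = 16) -> str:
    """Return a truncated HMAC-SHA256 of user_id keyed with server_secret.

    The same (user_id, secret) pair always yields the same value, so events
    can be correlated; without the secret the value cannot be recomputed.
    """
    digest = hmac.new(server_secret, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


if __name__ == "__main__":
    # Assumption: the secret is generated once per server and kept out of the logs.
    current_secret = secrets.token_bytes(32)

    a = pseudonymize("alice", current_secret)
    b = pseudonymize("alice", current_secret)
    print(a == b)  # same user, same secret: correlatable

    # Rotating the secret (and discarding the old one) breaks linkability
    # between log entries written before and after the rotation.
    rotated_secret = secrets.token_bytes(32)
    print(pseudonymize("alice", rotated_secret) == a)
```

Truncating the digest keeps log lines short while leaving enough bits to tell users apart within one retention window; the full 64 hex characters can be kept if collisions are a concern.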
Log Analysis Tools and Dashboards
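GoAccess is a terminal-based log analyzer that processes Nginx access logs and generates reports. Because onion_format is not one of its predefined formats, it must be told the field layout explicitly. For a layout of ISO 8601 timestamp, status, method, path, bytes sent, request time in seconds, and quoted User-Agent: goaccess /var/log/nginx/access.log --date-format=%Y-%m-%d --time-format=%H:%M:%S --log-format='%dT%t+%^ %s %m %U %b %T "%u"' (adjust the field order to match your format; %T expects seconds, so use %L instead if your log records milliseconds, and note that the +%^ part assumes a UTC or positive timezone offset). GoAccess provides real-time dashboards in the terminal or as HTML output without sending data to external services. For time-series metrics, pair Prometheus with a log-parsing exporter (for example prometheus-nginxlog-exporter): the exporter parses Nginx logs and exposes Prometheus-format metrics (request rate by status code, response time percentiles) on a local port. Prometheus scrapes the exporter on the loopback interface. Visualize in Grafana running as another hidden service. This provides production-quality dashboards without exposing metrics endpoints publicly.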
Log Retention Policy and Secure Deletion
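Define and implement a log retention policy: application error logs: 90 days (needed for debugging recurring issues), Nginx access logs: 30 days (needed for traffic analysis and incident investigation), Tor daemon logs: 30 days (needed for hidden service operational issues), authentication/security logs: 180 days (needed for security incident investigation). Automate retention with logrotate: configure /etc/logrotate.d/nginx with the daily and rotate 30 directives to rotate daily and keep 30 days of rotated logs. After the retention period, delete with secure overwrite: shred -u /var/log/old-logs/*.log.gz (logrotate's shred directive applies the same overwrite to the files it removes). Overwriting is only dependable on storage that rewrites data in place; on SSDs, copy-on-write filesystems, and filesystems that journal data, old blocks can survive, so full-disk encryption is the more reliable safeguard. For very sensitive deployments, log to a tmpfs (RAM-based filesystem) mount: logs exist only in memory and are lost on reboot. Configure tmpfs in /etc/fstab: tmpfs /var/log/nginx tmpfs size=512M,noatime 0 0. Because tmpfs pages can be written to swap, disable swap or encrypt it. This trades log durability for erasure-on-reboot.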
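For logs that logrotate does not manage (application logs written to a custom directory, for instance), the retention periods above can be enforced with a small scheduled script. The sketch below is one way to do that, with several assumptions: file age is judged by modification time, the directory is flat, and files are removed with a plain unlink rather than an overwrite. RETENTION_DAYS, expired_files, and sweep are hypothetical names.

```python
import os
import time
from pathlib import Path

# Retention periods from the policy above, in days.
RETENTION_DAYS = {"app": 90, "nginx": 30, "tor": 30, "auth": 180}

SECONDS_PER_DAY = 86400


def expired_files(directory, max_age_days, now=None):
    """Return regular files in directory whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def sweep(directory, max_age_days, now=None, dry_run=True):
    """Delete expired files (or only list them when dry_run is True).

    Returns the names of the files that were, or would be, removed.
    This performs a plain unlink, not a secure overwrite.
    """
    removed = []
    for path in expired_files(directory, max_age_days, now=now):
        if not dry_run:
            path.unlink()
        removed.append(path.name)
    return removed


if __name__ == "__main__":
    # Hypothetical directory; dry_run=True only reports what would be deleted.
    log_dir = "/var/log/myapp"
    if os.path.isdir(log_dir):
        print(sweep(log_dir, RETENTION_DAYS["app"], dry_run=True))
```

Running with dry_run=True first makes it easy to check the cutoff logic before any file is deleted.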