Multi-Instance Hidden Service Architecture: High Availability Design
Production hidden services serving active user communities require high-availability architecture that survives individual instance failures. This guide covers multi-instance hidden service design using OnionBalance, shared state management, and automatic failover for zero-downtime operations.
High Availability Requirements for Production Hidden Services
A single-instance hidden service goes down whenever its server suffers a hardware fault, network problem, DDoS attack, or software crash. For services where availability is critical (secure drop platforms, active community forums, essential privacy tools), that design is insufficient. High availability requires: multiple backend instances serving the same .onion address, automatic failover when an instance becomes unavailable, shared state (database, session storage) across instances, and monitoring with alerting for failure events. The architecture must also degrade gracefully: a partial instance failure should reduce capacity, not cause a total outage.
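The graceful-degradation requirement can be made concrete with a small monitoring sketch. The names here (ServiceState, classify, remaining_capacity) are hypothetical and not part of any tool mentioned in this guide. The sketch assumes each backend reports a simple up/down flag.

```python
from enum import Enum
from typing import Dict


class ServiceState(Enum):
    HEALTHY = "healthy"    # every instance is up
    DEGRADED = "degraded"  # some instances are down, service still reachable
    OUTAGE = "outage"      # no instance is up


def classify(instance_up: Dict[str, bool]) -> ServiceState:
    """Map per-instance health flags to an overall service state."""
    up = sum(1 for ok in instance_up.values() if ok)
    if up == 0:
        return ServiceState.OUTAGE
    if up < len(instance_up):
        return ServiceState.DEGRADED
    return ServiceState.HEALTHY


def remaining_capacity(instance_up: Dict[str, bool]) -> float:
    """Fraction of instances still serving (assumes equally sized instances)."""
    if not instance_up:
        return 0.0
    return sum(1 for ok in instance_up.values() if ok) / len(instance_up)
```

An alerting rule could page on OUTAGE and open a lower-priority ticket on DEGRADED, which matches the goal that a partial failure reduces capacity without taking the service offline.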
OnionBalance Configuration for Multi-Backend
OnionBalance places multiple hidden service backends behind a single .onion address by aggregating their introduction points. Deploy 2-3 backend servers, each running Tor with its own hidden service key (and therefore its own .onion address). OnionBalance, running on a management server, periodically fetches the introduction points from each backend's descriptor and publishes an aggregated descriptor under the master .onion address. When a backend fails, its introduction points drop out of the aggregated descriptor within 5-10 minutes. Clients connecting to the master .onion address may be handed introduction points from any backend, which provides load distribution and failover.
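The aggregation step can be illustrated with a small simulation. This is a conceptual sketch only, not OnionBalance's actual code: the Backend class, the aggregate_intro_points function, the placeholder addresses, and the cap of 20 introduction points are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import List

MAX_INTRO_POINTS = 20  # assumed cap on the aggregated descriptor


@dataclass
class Backend:
    onion_address: str        # placeholder value, not a real address
    intro_points: List[str]   # introduction points this backend currently advertises
    reachable: bool = True    # False once its descriptor can no longer be fetched


def aggregate_intro_points(backends: List[Backend],
                           limit: int = MAX_INTRO_POINTS) -> List[str]:
    """Interleave introduction points from reachable backends, up to `limit`."""
    live = [b.intro_points for b in backends if b.reachable]
    merged: List[str] = []
    # zip_longest takes one point from each backend per round, so every
    # live backend is represented before any backend contributes a second.
    for round_of_points in zip_longest(*live):
        merged.extend(p for p in round_of_points if p is not None)
    return merged[:limit]


backends = [
    Backend("backend-a.onion", ["a1", "a2", "a3"]),
    Backend("backend-b.onion", ["b1", "b2"]),
]
print(aggregate_intro_points(backends))
backends[0].reachable = False  # simulate a failed backend
print(aggregate_intro_points(backends))
```

Once a backend is marked unreachable, its points no longer appear in the merged list, which mirrors how a failed backend's introduction points age out of the published descriptor.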
Shared Database for Stateful Services
Multi-instance hidden services require shared state. A PostgreSQL or MySQL database reachable from all backend instances stores the application state. Options: (1) Primary-replica database cluster (one write primary, multiple read replicas): all backends write through the primary and read from the replicas for scalability. (2) Multi-primary (active-active) database: higher complexity, but no single write bottleneck. (3) Managed database service (only if the provider is trusted and the database holds no sensitive content). The database server should be reachable only on the internal network (for example, an internal Docker network), never from the internet, and backends should connect over that network. Database connections should use TLS even on internal networks.
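For the primary-replica option, the routing rule (writes to the primary, reads spread across replicas) might look like the sketch below. ConnectionRouter and the host names are hypothetical. The connection strings assume PostgreSQL, where sslmode=verify-full asks the client to require TLS and verify the server certificate.

```python
from itertools import cycle
from typing import List


class ConnectionRouter:
    """Pick a connection string per statement: SELECTs go to replicas in
    round-robin order, everything else goes to the primary."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._replicas = cycle(replicas or [primary])

    def target_for(self, sql: str) -> str:
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._replicas)
        return self.primary


# Hypothetical internal host names; TLS is requested even on the internal network.
router = ConnectionRouter(
    primary="postgresql://app@db-primary.internal/app?sslmode=verify-full",
    replicas=[
        "postgresql://app@db-replica-1.internal/app?sslmode=verify-full",
        "postgresql://app@db-replica-2.internal/app?sslmode=verify-full",
    ],
)
print(router.target_for("SELECT id FROM posts"))
print(router.target_for("INSERT INTO posts (title) VALUES ('hello')"))
```

This keyword check is deliberately naive. Statements such as SELECT ... FOR UPDATE, and any read that must see a write made moments earlier, should go to the primary because replicas can lag behind it.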
Session Storage and State Replication
Web application session storage must be shared across instances. If one request from a user is served by instance A and the next by instance B, the session data must be available on B. Solutions: a shared Redis deployment (all backends write sessions to it, and any backend can retrieve them quickly); sticky sessions in a load balancer (impractical here, because load distribution for hidden services is circuit-based, not IP-based); or stateless JWTs that carry all session state (no shared storage required, at the cost of larger tokens). For stateful multi-instance deployments, Redis with Sentinel (which promotes a replica automatically when the primary fails) provides high-availability session storage.
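The stateless option can be sketched with the Python standard library alone. The following is a simplified JWT-style signed token, not a standards-compliant JWT. The function names are hypothetical, and the sketch assumes every backend holds the same secret key, so any instance can verify a token issued by another.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(body: str, key: bytes) -> str:
    return _b64(hmac.new(key, body.encode("utf-8"), hashlib.sha256).digest())


def issue_token(session: dict, key: bytes, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Serialize the session with an expiry time and sign it."""
    issued = time.time() if now is None else now
    payload = dict(session, exp=issued + ttl_seconds)
    body = _b64(json.dumps(payload, sort_keys=True).encode("utf-8"))
    return body + "." + _sign(body, key)


def verify_token(token: str, key: bytes,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the session payload, or None if the token is forged or expired."""
    parts = token.split(".")
    if len(parts) != 2:
        return None
    body, sig = parts
    # Constant-time comparison on bytes avoids leaking how much of the signature matched.
    if not hmac.compare_digest(sig.encode("utf-8"), _sign(body, key).encode("utf-8")):
        return None
    payload = json.loads(_unb64(body))
    current = time.time() if now is None else now
    if payload.get("exp", 0) < current:
        return None
    return payload
```

The payload is signed but not encrypted, so anything placed in the session is readable by the client. A production deployment would more likely use a maintained JWT library than hand-rolled signing.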
Deployment and Update Procedures for Multi-Instance
Rolling deployments update one instance at a time, so the service stays available throughout. Procedure: (1) Take instance 1 out of rotation (stop its Tor daemon and wait for its introduction points to expire from the aggregated descriptor), (2) Update and verify instance 1, (3) Bring instance 1 back into rotation, (4) Repeat for each remaining instance. Zero-downtime database migrations need the same care, in four steps: deploy code that works with both the old and the new schema (backward-compatible), run the migration that adds the new columns/tables, deploy code that uses only the new schema, then run the migration that removes the old elements. Blue-green deployment (two complete environments, with traffic switched atomically) gives a cleaner cutover but requires double the infrastructure.
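The rolling procedure can be expressed as a small orchestration loop. This is a sketch under stated assumptions: rolling_update and its four callbacks (drain, update, verify, restore) are hypothetical hooks that an operator would wire to their own tooling, for example stopping and starting the Tor daemon on each host.

```python
from typing import Callable, Iterable, List


def rolling_update(instances: Iterable[str],
                   drain: Callable[[str], None],
                   update: Callable[[str], None],
                   verify: Callable[[str], bool],
                   restore: Callable[[str], None]) -> List[str]:
    """Update instances one at a time and return the names that completed.

    Raises RuntimeError on the first failed verification. The failed instance
    stays out of rotation and the remaining instances are left untouched, so
    they keep serving the old version.
    """
    completed: List[str] = []
    for name in instances:
        drain(name)    # step 1: take the instance out of rotation
        update(name)   # step 2: deploy the new version
        if not verify(name):
            raise RuntimeError("verification failed on " + name + "; rollout halted")
        restore(name)  # step 3: bring the instance back into rotation
        completed.append(name)  # step 4: continue with the next instance
    return completed


# Example wiring with print statements standing in for real tooling.
finished = rolling_update(
    ["backend-1", "backend-2"],
    drain=lambda n: print("draining", n),
    update=lambda n: print("updating", n),
    verify=lambda n: True,
    restore=lambda n: print("restoring", n),
)
print(finished)
```

Halting on the first failure is a deliberate choice here: with only 2-3 backends, continuing after a bad update would risk taking every instance out of rotation.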