How many backend instances do I need for a highly available hidden service?

Three is the practical minimum for high availability if you also want rolling updates without dropping below adequate capacity. With three instances, one can be unavailable (failure or maintenance) while the other two serve traffic at roughly two-thirds of normal capacity. With two instances, any single failure halves capacity and leaves no redundancy until the failed instance returns. For critical services, three to five instances provide a good balance of availability and cost.
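The capacity arithmetic above can be checked with a few lines of Python. This is a minimal sketch that assumes all backends are equally sized and that load spreads evenly; the function name `remaining_capacity` is illustrative, not part of any tool.

```python
def remaining_capacity(instances: int, unavailable: int = 1) -> float:
    """Fraction of total capacity left when some equally sized backends are down."""
    if instances <= 0:
        raise ValueError("instances must be positive")
    if not 0 <= unavailable <= instances:
        raise ValueError("unavailable must be between 0 and instances")
    return (instances - unavailable) / instances


# Compare the instance counts discussed above with one backend out of service.
for n in (2, 3, 5):
    print(f"{n} instances, 1 down: {remaining_capacity(n):.0%} capacity left")
```

With one backend down this prints 50% for two instances, 67% for three, and 80% for five.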

Does OnionBalance introduce a single point of failure?

Yes - the OnionBalance management process is a single point of failure. If it fails, descriptors that were already published keep serving users for roughly 1-2 hours (their TTL), but they are no longer refreshed. Mitigate this by monitoring the OnionBalance process and restarting it automatically, running OnionBalance on a management VPS separate from the backend servers, and keeping a documented procedure for restarting it quickly if it fails.

How do I monitor which backend is serving each request?

Add a custom HTTP response header (X-Backend-ID) that identifies the backend instance. Log aggregation (centralizing logs from all backends in Elasticsearch or Grafana Loki) provides a unified view. Monitor per-backend request rates in Prometheus to verify traffic distribution. OnionBalance's logs show which backends it is advertising introduction points for.
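For a Python web application, the header can be added with a small WSGI middleware. The sketch below is one possible approach, not a prescribed implementation: the `BackendIdMiddleware` class, the `demo_app` placeholder, and the `BACKEND_ID` environment variable are all assumptions made for illustration.

```python
import os


class BackendIdMiddleware:
    """WSGI middleware that stamps every response with an X-Backend-ID header."""

    def __init__(self, app, backend_id: str):
        self.app = app
        self.backend_id = backend_id

    def __call__(self, environ, start_response):
        def tagged_start_response(status, headers, exc_info=None):
            # Drop any existing value so the header appears exactly once.
            headers = [h for h in headers if h[0].lower() != "x-backend-id"]
            headers.append(("X-Backend-ID", self.backend_id))
            return start_response(status, headers, exc_info)

        return self.app(environ, tagged_start_response)


def demo_app(environ, start_response):
    """Placeholder application standing in for the real service."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


# BACKEND_ID is a hypothetical per-instance setting, e.g. "backend-2".
application = BackendIdMiddleware(demo_app, os.environ.get("BACKEND_ID", "backend-1"))
```

Because the header is visible to clients, an opaque identifier is a safer choice than a hostname or IP address.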

What happens during database primary failure in a multi-instance setup?

Without replica promotion, every instance that writes to the failed primary will return errors. With PostgreSQL and Patroni (or a similar automatic failover tool), a replica is typically promoted to primary within 30-60 seconds and instances reconnect. Place a connection pooler such as PgBouncer between the application and the database, and make sure the application retries failed connections, so that connectivity is restored after failover. Test failover procedures periodically to verify that recovery time matches expectations.
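On the application side, reconnection can be handled with a retry loop that uses capped exponential backoff. The following is a sketch under stated assumptions: `connect` stands for whatever function opens a database connection in your stack, and `ConnectionError` stands in for the driver-specific exception a real driver would raise. The default delays add up to about 60 seconds, matching the upper end of the failover window mentioned above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 8,
    base_delay: float = 1.0,
    max_delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, backing off between failed attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:  # stand-in for the driver's own error type
            last_error = exc
            if attempt < attempts - 1:
                # Delays grow 1s, 2s, 4s, 8s and are then capped at max_delay.
                sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ConnectionError(f"database unreachable after {attempts} attempts") from last_error
```

The `sleep` parameter is injectable so that the backoff schedule can be tested without waiting.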

Is the cost of multi-instance deployment justified for small hidden services?

For small communities (under 100 active users) with moderate availability requirements, a single well-maintained instance is usually appropriate. Multi-instance architecture is justified when downtime has a significant impact on users (security communication tools, news in censored countries), when DDoS attacks have already taken down a single instance, or when the service generates revenue that makes availability financially important. Start with a single instance and scale when demand justifies the cost.

Multi-Instance Hidden Service Architecture: High Availability Design

Production hidden services serving active user communities require high-availability architecture that survives individual instance failures. This guide covers multi-instance hidden service design using OnionBalance, shared state management, and automatic failover for zero-downtime operations.

High Availability Requirements for Production Hidden Services

Single-instance hidden services fail when the server has hardware issues, network problems, DDoS attacks, or software crashes. For services where availability is critical (secure drop platforms, active community forums, essential privacy tools), single-instance design is insufficient. High availability requires: multiple backend instances serving the same .onion address, automatic failover when an instance becomes unavailable, shared state (database, session storage) across instances, and monitoring with alerting for failure events. The architecture must handle graceful degradation - partial instance failure should result in reduced capacity, not total outage.

OnionBalance Configuration for Multi-Backend

OnionBalance manages multiple hidden service backends behind a single .onion address by aggregating their introduction points. Deploy at least two backend servers (three if you want to keep redundancy during maintenance), each running Tor with its own hidden service key and therefore its own .onion address. OnionBalance, running on a management server, periodically fetches the introduction points of each backend and publishes an aggregated descriptor under the master .onion address. When a backend fails, its introduction points drop out of the aggregated descriptor within about 5-10 minutes. Clients connecting to the master .onion address may receive introduction points from any backend, which provides both load distribution and failover.
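The aggregation step can be pictured with a toy model. The code below is not OnionBalance's implementation; it is a simplified sketch in which `BackendDescriptor`, `aggregate`, the 600-second freshness limit, and the cap of 20 introduction points are all assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BackendDescriptor:
    onion_address: str        # the backend's own .onion address (placeholder values)
    intro_points: List[str]   # introduction points advertised by that backend
    fetched_at: float         # time (seconds) of the last successful fetch


def aggregate(
    descriptors: List[BackendDescriptor],
    now: float,
    max_age: float = 600.0,
    max_intro_points: int = 20,
) -> List[str]:
    """Merge introduction points from backends whose descriptors are still fresh."""
    fresh = [d for d in descriptors if now - d.fetched_at <= max_age]
    queues = [list(d.intro_points) for d in fresh]
    merged: List[str] = []
    # Take one introduction point per backend in turn so each backend is represented.
    while any(queues) and len(merged) < max_intro_points:
        for queue in queues:
            if queue and len(merged) < max_intro_points:
                merged.append(queue.pop(0))
    return merged
```

In this model a failed backend simply stops being refreshed, so once its last fetch is older than `max_age` its introduction points no longer appear in the merged list.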

Shared Database for Stateful Services

Multi-instance hidden services require shared state. A PostgreSQL or MySQL database accessible from all backend instances stores application state. Options: (1) Primary-replica database cluster (one write primary, multiple read replicas) - all backends write through the primary and read from the replicas for scalability. (2) Multi-primary (active-active) database - higher complexity but no write bottleneck. (3) Managed database service (only if the provider is trusted and the database holds no sensitive content). The database server should be reachable only over a private internal network (for example, an internal Docker network), never from the internet, and backends should connect to it over that network. Database connections should use TLS encryption even on internal networks.
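For option (1), the application needs some way to send writes to the primary and reads to the replicas. A minimal sketch of such routing follows; the `DatabaseRouter` class and the hostnames are hypothetical, and the "starts with SELECT" heuristic deliberately ignores replication lag, transactions, and statements such as SELECT ... FOR UPDATE, which a real deployment has to handle.

```python
import itertools
from typing import List


class DatabaseRouter:
    """Pick a database host: replicas (round-robin) for reads, primary for everything else."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def host_for(self, statement: str) -> str:
        words = statement.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


# Hostnames are placeholders for addresses on the private network.
router = DatabaseRouter("db-primary.internal", ["db-replica-1.internal", "db-replica-2.internal"])
print(router.host_for("SELECT * FROM posts"))
print(router.host_for("INSERT INTO posts (title) VALUES ('hi')"))
```

Falling back to the primary when no replica is configured keeps the same code usable on a single-database setup.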

Session Storage and State Replication

Web application session storage must be shared across instances. If a user's request is served by instance A and the next by instance B, the session data must be available on B. Solutions: a shared Redis deployment (all backends write sessions to the same Redis, with fast retrieval from any backend), sticky sessions in a load balancer (impractical here, because hidden service traffic arrives over Tor circuits and carries no client IP address to pin a user to a backend), or stateless signed tokens such as JWTs that carry all session state (no shared storage required, but larger tokens). Redis with Sentinel (which monitors the primary and automatically promotes a replica if it fails) provides high-availability session storage for stateful multi-instance deployments.
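The stateless option can be illustrated with a signed token built from the Python standard library. This is a JWT-like sketch rather than the JWT wire format, and the function names and the shared `secret` are assumptions; in production a maintained JWT library is usually the better choice. Every backend must hold the same secret so that any of them can verify a token issued by another.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def _sign(body: str, secret: bytes) -> str:
    return _b64(hmac.new(secret, body.encode("utf-8"), hashlib.sha256).digest())


def issue_token(session: dict, secret: bytes, ttl: int = 3600, now: Optional[float] = None) -> str:
    """Serialize the session with an expiry time and append an HMAC-SHA256 signature."""
    issued = time.time() if now is None else now
    payload = dict(session, exp=int(issued + ttl))
    body = _b64(json.dumps(payload, sort_keys=True).encode("utf-8"))
    return f"{body}.{_sign(body, secret)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the session payload if the signature is valid and unexpired, else None."""
    parts = token.split(".")
    if len(parts) != 2:
        return None
    body, signature = parts
    expected = _sign(body, secret)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    if payload["exp"] < current:
        return None
    return payload
```

The payload is signed but not encrypted, so anything placed in the session is readable by the client.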

Deployment and Update Procedures for Multi-Instance

Rolling deployments update one instance at a time, maintaining service availability throughout. Procedure: (1) Take instance 1 out of rotation (stop its Tor daemon and wait for its introduction points to expire from the aggregated descriptor), (2) Update and verify instance 1, (3) Bring instance 1 back into rotation, (4) Repeat for each remaining instance. Zero-downtime database migrations require careful sequencing: deploy code that works with both the old and the new schema (backward-compatible), run the migration that adds new columns/tables, deploy code that uses only the new schema, then run the migration that removes the old elements. Blue-green deployment (two complete environments, with traffic switched atomically between them) provides a cleaner cutover but requires double the infrastructure.
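The rolling procedure can be expressed as a small orchestration loop. In this sketch the four callbacks (`drain`, `update`, `healthy`, `restore`) are hypothetical hooks that you would implement for your own environment; for example, `drain` might stop the Tor daemon on a backend and wait for its introduction points to expire.

```python
from typing import Callable, Iterable, List


def rolling_deploy(
    instances: Iterable[str],
    drain: Callable[[str], None],
    update: Callable[[str], None],
    healthy: Callable[[str], bool],
    restore: Callable[[str], None],
) -> List[str]:
    """Update instances one at a time, stopping at the first failed verification."""
    updated: List[str] = []
    for name in instances:
        drain(name)    # step 1: take the instance out of rotation
        update(name)   # step 2: deploy the new version
        if not healthy(name):
            # Leave the broken instance drained; the others keep serving traffic.
            raise RuntimeError(f"{name} failed verification; rollout halted")
        restore(name)  # step 3: bring it back into rotation
        updated.append(name)
    return updated
```

Halting on the first unhealthy instance limits the damage of a bad release to a single backend, which is the main reason to roll out one instance at a time.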
