Tor Congestion Control: KIST, Flow Windows, and Circuit Performance

Tor's performance was long limited by simple fixed-window flow control, which let high-bandwidth streams monopolize relay resources and degraded latency for everyone else. Current versions address this with the KIST scheduler on relays and, since Tor 0.4.7, end-to-end congestion control. This guide explains how Tor manages congestion and what that means for relay and hidden service operators.

Cell-Based Flow Control: The Foundation

Tor sends data in fixed-size 512-byte cells. Each circuit has a cell window: the number of cells that can be in transit before an acknowledgment is required. The sender decrements its window for each cell it sends and increments it when an acknowledgment cell (a SENDME) arrives. This credit-based flow control keeps any single circuit from overwhelming relay queues. The original window was fixed at 1000 cells, roughly 512 KB of data in flight. A fixed window is inefficient because throughput is capped at the window size divided by the round-trip time: on high-latency circuits the window is exhausted before acknowledgments arrive, which artificially limits bandwidth. Tor 0.4.7 and later size the window from measured RTT instead.
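The cap that a fixed window puts on throughput can be worked out directly. The following is a minimal sketch, assuming the 512-byte cell size and 1000-cell window quoted above and ignoring cell header overhead; the function name is illustrative and not part of Tor.

```python
CELL_SIZE_BYTES = 512       # fixed cell size, as quoted above
FIXED_WINDOW_CELLS = 1000   # original circuit window, as quoted above


def max_throughput_bytes_per_sec(window_cells: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when at most `window_cells` may be unacknowledged.

    The sender can put one full window on the wire per round trip, so the
    ceiling is window size divided by RTT.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_cells * CELL_SIZE_BYTES / rtt_seconds


# The same window yields a lower ceiling as the round-trip time grows.
for rtt_ms in (100, 500, 1000):
    limit = max_throughput_bytes_per_sec(FIXED_WINDOW_CELLS, rtt_ms / 1000)
    print(f"RTT {rtt_ms} ms: at most {limit / 1000:.0f} kB/s")
```

The ceiling falls in proportion to RTT regardless of how much bandwidth the path has, which is the limitation described above.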

KIST: Kernel-Informed Socket Transport

KIST (Kernel-Informed Socket Transport) is a scheduling algorithm that improves relay fairness by asking the kernel how much data each socket can actually accept before writing to it. Before KIST, a relay wrote as much as it could to each socket in turn, so some sockets received data the kernel could not send immediately; that data sat in kernel buffers and caused head-of-line blocking for other circuits. With KIST, Tor writes to a socket only when the kernel reports room to send, which prevents bufferbloat and improves isolation between circuits. The scheduling interval is set with the KISTSchedRunInterval option in torrc (for example, KISTSchedRunInterval 10 msec); if it is left unset, Tor takes the value from the network consensus or falls back to its built-in default. KIST has been the default scheduler on Linux since Tor 0.3.2, so relay operators on current versions get it without manual configuration. Platforms without the required kernel support fall back to KISTLite or the older Vanilla scheduler.
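The core idea, writing only what the kernel can take, can be illustrated with a small simulation. This is a sketch under stated assumptions, not Tor's scheduler: `SimSocket` and `kist_style_run` are hypothetical names, and `kernel_free_bytes` is a plain number standing in for the value a real relay would obtain from the kernel.

```python
from collections import deque
from dataclasses import dataclass, field

CELL_SIZE_BYTES = 512


@dataclass
class SimSocket:
    """Hypothetical stand-in for a relay connection; not a Tor data structure."""
    name: str
    kernel_free_bytes: int                      # room the kernel reports for sending
    pending_cells: deque = field(default_factory=deque)
    written: list = field(default_factory=list)


def kist_style_run(sockets):
    """One scheduling run: give each socket only as many cells as its buffer can take."""
    for sock in sockets:
        budget = sock.kernel_free_bytes // CELL_SIZE_BYTES
        while budget > 0 and sock.pending_cells:
            sock.written.append(sock.pending_cells.popleft())
            sock.kernel_free_bytes -= CELL_SIZE_BYTES
            budget -= 1


# A socket with no kernel room keeps its cells queued inside Tor,
# where they can still be prioritized, instead of in the kernel.
fast = SimSocket("fast", kernel_free_bytes=4 * CELL_SIZE_BYTES, pending_cells=deque(range(10)))
stalled = SimSocket("stalled", kernel_free_bytes=0, pending_cells=deque(range(10)))
kist_style_run([fast, stalled])
print(len(fast.written), len(stalled.written))
```

Cells that stay in Tor's own queues remain subject to circuit scheduling on the next run, which is where the fairness gain comes from.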

End-to-End Congestion Control (Tor 0.4.7+)

Tor 0.4.7 introduced end-to-end congestion control, a major performance improvement. It replaces the fixed cell window with an estimate of the circuit's bandwidth-delay product (BDP): the window is set to the number of cells that can usefully be in flight given the measured RTT and available bandwidth. Circuits with a large BDP get larger windows (more in-flight data, better throughput), while constrained circuits are held to smaller windows so that they do not build up queues in relays. The result is better utilization of available bandwidth on fast circuits, lower latency for interactive traffic, and improved fairness between high-bandwidth and low-bandwidth circuits. Operators running Tor 0.4.7 or later get this improvement automatically.
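As a rough illustration of the bandwidth-delay product idea, the window in cells is the path bandwidth multiplied by the RTT, divided by the cell size. The sketch below assumes 512-byte cells and takes bandwidth as a given input; Tor's actual estimator derives its figures from RTT measurements and is more involved, so treat `bdp_cells` as a hypothetical helper.

```python
CELL_SIZE_BYTES = 512


def bdp_cells(bandwidth_bytes_per_sec: float, rtt_seconds: float) -> int:
    """Cells that fit 'in the pipe': bandwidth times round-trip time, in cell units.

    Never returns less than 1, so a circuit can always make progress.
    """
    if bandwidth_bytes_per_sec <= 0 or rtt_seconds <= 0:
        raise ValueError("bandwidth and RTT must be positive")
    return max(1, int(bandwidth_bytes_per_sec * rtt_seconds / CELL_SIZE_BYTES))


# A fast path warrants a much larger window than a constrained one at the same RTT.
print(bdp_cells(5_120_000, 0.5))   # fast path
print(bdp_cells(51_200, 0.5))      # constrained path
```

Sizing the window this way lets a fast path carry more in-flight data than a fixed 1000-cell window allows, while a constrained path is held well below it.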

Impact of Congestion on Hidden Services

Hidden services are particularly sensitive to congestion because a client reaches them over six hops (three chosen by the client and three by the service, joined at a rendezvous point), compared with three hops for clearnet access. Each additional hop is another potential congestion point. Under load, introduction point flooding (malicious or simply high-volume traffic aimed at the service's introduction points) can overwhelm the service's ability to handle connection setup, and circuit build failure rates rise. There are two main responses. Tor's Proof of Work (PoW) defense, available since Tor 0.4.8, asks clients to attach a solved puzzle to their introduction requests and serves higher-effort requests first when the service is under load. OnionBalance distributes requests across multiple backends, reducing per-server congestion. For sustained high load that degrades service, horizontal scaling (more servers behind OnionBalance) is the most effective response.
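One way to picture how proof-of-work effort can tame an introduction flood is a priority queue in which requests that carry more effort are served first. The following is a minimal sketch of that idea only; `IntroQueue`, the request identifiers, and the effort values are hypothetical, and the real mechanism also involves puzzle verification and adaptive effort targets that are not modeled here.

```python
import heapq
import itertools


class IntroQueue:
    """Hypothetical queue that serves higher-effort introduction requests first."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: equal effort is served FIFO

    def push(self, request_id: str, effort: int) -> None:
        # heapq is a min-heap, so effort is negated to pop the largest first.
        heapq.heappush(self._heap, (-effort, next(self._arrival), request_id))

    def pop(self):
        neg_effort, _, request_id = heapq.heappop(self._heap)
        return request_id, -neg_effort

    def __len__(self):
        return len(self._heap)


queue = IntroQueue()
queue.push("flood-1", effort=0)
queue.push("client-a", effort=50)
queue.push("flood-2", effort=0)
queue.push("client-b", effort=10)
while queue:
    print(queue.pop())
```

Under this ordering, zero-effort flood traffic waits behind any request that paid for its place, so a flood costs the attacker computation rather than costing the service its capacity.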

Monitoring and Diagnosing Congestion on Tor Relays

Relay operators can watch for congestion through Tor's metrics and logs. Bandwidth statistics in Tor's logs show per-interval bandwidth and can reveal consistent under-utilization, which points to congestion or client-side limits. The Tor Metrics portal shows each relay's bandwidth history; dips during specific periods may indicate congestion-related throttling. Queue depth in Prometheus tor-exporter metrics reveals when relay buffers are filling. Circuit establishment failures, logged at notice level, indicate congestion severe enough to prevent circuit building. On high-bandwidth relays, also monitor CPU usage: if the CPU is not saturated but throughput stays below the configured rate, network or kernel buffer limits may be the bottleneck.
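The CPU-versus-bandwidth check described above can be written as a simple rule. This is a sketch with assumed thresholds (90% CPU counts as saturated, under 80% of the configured rate counts as under-utilized); the function name, labels, and cut-offs are illustrative and should be tuned to the relay in question.

```python
def classify_bottleneck(
    cpu_fraction: float,
    observed_bytes_per_sec: float,
    configured_bytes_per_sec: float,
    cpu_saturated_at: float = 0.9,   # assumed threshold, not a Tor constant
    underused_below: float = 0.8,    # assumed threshold, not a Tor constant
) -> str:
    """Apply the rule of thumb: low throughput with idle CPU suggests a
    network or kernel buffer limit rather than a CPU limit."""
    if configured_bytes_per_sec <= 0:
        raise ValueError("configured rate must be positive")
    if observed_bytes_per_sec >= underused_below * configured_bytes_per_sec:
        return "ok"
    if cpu_fraction >= cpu_saturated_at:
        return "cpu-bound"
    return "network-or-kernel-buffers"


# Throughput at half the configured rate while the CPU is mostly idle.
print(classify_bottleneck(0.4, 50_000_000, 100_000_000))
```

The inputs would come from whatever monitoring is already in place, for example process CPU figures and the relay's bandwidth history.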
