
Benchmarks

Head-to-head comparison of T4 and etcd, run from the repository benchmark harness in Docker on the same host.

| Setting | Value |
|---|---|
| Date | 2026-04-12 |
| Host | Apple Silicon Mac (12 vCPUs, 14 GB RAM assigned to Docker) |
| Docker | VirtioFS-backed volumes, Linux VM via macOS Virtualization.framework |
| T4 | Pebble storage, group-commit write pipeline |
| etcd | v3.6.4, bbolt storage |
| Client | t4bench — etcd v3 Go client, one TCP connection per worker |
| Ops | 50,000 total per workload |
| Parallel workers | 16 (for par-* and mixed workloads) |
| Key size | 64 bytes |
| Value size | 256 bytes |

Note: All numbers reflect Docker-on-macOS performance. Every fsync crosses the macOS -> hypervisor -> Linux VM boundary, adding overhead compared to native Linux. Relative comparisons between T4 and etcd are meaningful; absolute numbers are not representative of production Linux performance.


Single node (single scenario)

| Workload | T4 ops/s | etcd ops/s | Ratio | T4 p50 | etcd p50 |
|---|---|---|---|---|---|
| seq-put | 1,180 | 4,532 | 0.26× | 473 µs | 188 µs |
| par-put (16 workers) | 5,935 | 13,271 | 0.45× | 1,951 µs | 1,122 µs |
| seq-get | 7,399 | 9,563 | 0.77× | 116 µs | 99 µs |
| par-get (16 workers) | 36,524 | 26,023 | 1.40× | 369 µs | 297 µs |
| mixed (16 workers, 50/50 r/w) | 11,697 | 16,408 | 0.71× | 693 µs | 867 µs |
| watch | 2,113 | 4,356 | 0.49× | 457 µs | 200 µs |

Single node — synchronous S3 durability (single-s3 scenario)

| Workload | T4 ops/s | etcd ops/s | Ratio | T4 p50 | etcd p50 |
|---|---|---|---|---|---|
| seq-put | 642 | 3,011 | 0.21× | 1,494 µs | 200 µs |
| par-put (16 workers) | 4,327 | 10,532 | 0.41× | 3,200 µs | 1,033 µs |
| seq-get | 8,151 | 9,940 | 0.82× | 116 µs | 97 µs |
| par-get (16 workers) | 36,428 | 26,816 | 1.36× | 376 µs | 296 µs |
| mixed (16 workers, 50/50 r/w) | 5,337 | 12,881 | 0.41× | 1,771 µs | 875 µs |
| watch | 479 | 3,749 | 0.13× | 1,467 µs | 198 µs |

3-node cluster (cluster scenario)

T4 cluster nodes use MinIO for leader election and WAL/checkpoint archival; etcd uses a standard 3-node raft cluster.

| Workload | T4 ops/s | etcd ops/s | Ratio | T4 p50 | etcd p50 | T4 p999 | etcd p999 |
|---|---|---|---|---|---|---|---|
| seq-put | 1,440 | 2,000 | 0.72× | 579 µs | 467 µs | 4,659 µs | 2,157 µs |
| par-put (16 workers) | 8,609 | 7,378 | 1.17× | 1,646 µs | 2,074 µs | 13,421 µs | 9,525 µs |
| seq-get | 5,194 | 3,680 | 1.41× | 199 µs | 269 µs | 961 µs | 1,353 µs |
| par-get (16 workers) | 23,905 | 13,039 | 1.83× | 438 µs | 687 µs | 27,859 µs | 48,195 µs |
| mixed (16 workers) | 9,867 | 7,942 | 1.24× | 1,241 µs | 1,725 µs | 20,512 µs | 24,713 µs |
| watch | 1,635 | 1,895 | 0.86× | 576 µs | 491 µs | 2,900 µs | 2,826 µs |

In both single and single-s3, etcd remains clearly faster for seq-put, par-put, mixed, and watch. On this host, T4’s single-node write path is still the main performance gap relative to etcd.

Single-node reads: T4 scales better on parallel gets


T4 is slightly slower on seq-get but clearly faster on par-get in both single-node scenarios. This suggests the current T4 server + Pebble path scales well under parallel readers, even though it does not win the single-request latency floor.

Comparing single to single-s3, T4 read throughput stays nearly flat while write-heavy workloads slow down:

  • seq-put: 1,180 -> 642 ops/s
  • par-put: 5,935 -> 4,327 ops/s
  • mixed: 11,697 -> 5,337 ops/s
  • watch: 2,113 -> 479 ops/s

This is the expected cost of placing synchronous WAL upload on the write durability path.

Cluster results are much stronger for T4 than the previous published run


At 16 clients, T4 wins par-put, seq-get, par-get, and mixed in the 3-node cluster scenario. etcd still leads on seq-put and is slightly ahead on watch, but the gap is small there compared to the earlier numbers.

The updated headline is:

  • cluster reads are better on T4
  • cluster mixed traffic is better on T4
  • cluster parallel writes are better on T4
  • single-node writes are still better on etcd

| Use case | Recommendation |
|---|---|
| Single-node write-heavy workloads | etcd is faster in this run |
| Single-node read-heavy parallel workloads | T4 is competitive and faster on par-get |
| Single-node with remote durability | T4 works, but synchronous S3 upload has a visible write cost |
| 3-node cluster, read-heavy workloads | T4 is faster in this run |
| 3-node cluster, mixed / parallel traffic | T4 is faster in this run |
| 3-node cluster, sequential writes | etcd still leads slightly in this run |
| Survive total single-node destruction | Only T4 with --wal-sync-upload=true |
| Embedded library in a Go binary | Only T4 |

```sh
# Build images once
docker build -f bench/Dockerfile.t4 -t t4-bench .
docker build -f bench/Dockerfile -t t4bench .

# Run all scenarios
./bench/run.sh

# Or a subset
./bench/run.sh single
./bench/run.sh single-s3
./bench/run.sh cluster
```

The raw results from the latest run are written to bench/results/results.jsonl.