Figma's FigCache: Next-Generation Data Caching at Scale
Source: Figma Blog
Author: Figma Engineering
TL;DR
As Redis became critical-path infrastructure at Figma, it hit scalability and reliability limits — connection bottlenecks, thundering herds, observability gaps, and data pollution risks. Figma built FigCache: a stateless, RESP-wire-protocol proxy acting as a unified Redis data plane. The architecture features a decoupled frontend/backend design, configuration-driven routing via Starlark, and custom extensions to the Redis protocol. Since its rollout to Figma's main API service in H2 2025, the caching layer has achieved six nines of uptime.
The Problem
As Figma grew, Redis evolved from a non-critical dependency to a critical-path component with structural challenges:
| Issue | Impact |
|---|---|
| Connection limits | Clusters approaching hard limits |
| Thundering herds | Rapid scale-ups caused massive connection bottlenecks |
| Data pollution | No centralised traffic management — apps could corrupt data across clusters |
| Observability gaps | Fragmented client libraries; slow incident diagnosis |
| Failover correctness | No fleet-wide guarantees about state consistency |
Initial mitigations (removing Redis from some API subsystems, localised client-side connection pooling) bought time but were not a strategic solution.
The Solution: FigCache Architecture
Why Build vs. Buy
Existing open-source proxies had critical shortcomings:
- Could not extract full annotated arguments from arbitrary Redis commands (needed for semantic guardrails)
- Could not extend the Redis protocol with custom commands
- Required maintaining brittle source code forks to add business logic
Decoupled Proxy Architecture
FigCache separates two concerns:
| Layer | Responsibility |
|---|---|
| Frontend | Client interaction, RESP-based RPC (ResPC), network I/O, connection management, structured command parsing |
| Backend | Command processing, connection multiplexing to storage backends, physical execution |
ResPC (a portmanteau of RESP and RPC) is a Go library that provides an RPC framework over the Redis wire protocol.
Configuration-Driven Engine Tree
The backend layer is a dynamically assembled tree of engine nodes:
- Leaf nodes (Data Engines): Execute commands against Redis
- Intermediate nodes (Filter Engines): Route, block, or modify commands before execution
- Configuration expressed as Starlark programs evaluated at runtime, rendering a Protobuf-structured config
This enables complex behaviours (command-type splitting, key-prefix routing) without deploying new server binaries.
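The tree of filter and data engines can be pictured with a small sketch. Everything here is an assumption made for illustration: the `Engine` interface, the `DataEngine`, `BlockFilter` and `PrefixRouter` types, and the cluster names are hypothetical, and the tree is assembled in Go code rather than rendered from a Starlark program as FigCache does.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Engine is a hypothetical interface for one node of the tree.
type Engine interface {
	Execute(cmd []string) (string, error)
}

// DataEngine is a leaf node. A real one would talk to Redis; this one
// only reports which backend would have received the command.
type DataEngine struct{ Backend string }

func (d DataEngine) Execute(cmd []string) (string, error) {
	return d.Backend + " <- " + strings.Join(cmd, " "), nil
}

// BlockFilter is an intermediate node that rejects listed command names
// and forwards everything else to Next.
type BlockFilter struct {
	Blocked map[string]bool
	Next    Engine
}

func (f BlockFilter) Execute(cmd []string) (string, error) {
	if len(cmd) == 0 {
		return "", errors.New("empty command")
	}
	if f.Blocked[strings.ToUpper(cmd[0])] {
		return "", fmt.Errorf("command %s is blocked", cmd[0])
	}
	return f.Next.Execute(cmd)
}

// Route pairs a key prefix with the subtree that should handle it.
type Route struct {
	Prefix string
	Next   Engine
}

// PrefixRouter is an intermediate node that forwards to the first route
// whose prefix matches the key. Simplification: the key is assumed to be
// the first argument (cmd[1]), which is not true of every Redis command.
type PrefixRouter struct {
	Routes  []Route
	Default Engine
}

func (p PrefixRouter) Execute(cmd []string) (string, error) {
	if len(cmd) > 1 {
		for _, r := range p.Routes {
			if strings.HasPrefix(cmd[1], r.Prefix) {
				return r.Next.Execute(cmd)
			}
		}
	}
	return p.Default.Execute(cmd)
}

// BuildTree assembles: block filter -> prefix router -> data engines.
func BuildTree() Engine {
	return BlockFilter{
		Blocked: map[string]bool{"FLUSHALL": true, "KEYS": true},
		Next: PrefixRouter{
			Routes: []Route{
				{Prefix: "session:", Next: DataEngine{Backend: "sessions-cluster"}},
			},
			Default: DataEngine{Backend: "default-cluster"},
		},
	}
}

func main() {
	tree := BuildTree()
	for _, cmd := range [][]string{
		{"GET", "session:abc"},
		{"GET", "thumbnail:9"},
		{"FLUSHALL"},
	} {
		out, err := tree.Execute(cmd)
		fmt.Println(out, err)
	}
}
```

Because every node satisfies the same interface, changing the routing behaviour means changing how the tree is composed, which is the property that lets a configuration layer reshape it without a new binary.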
Key Features
| Feature | Description |
|---|---|
| Fanout Engine | Transparently parallelises multi-shard pipelines, resolving CROSSSLOT errors |
| Custom Commands | Language-agnostic distributed locking (over Redlock); protocol-native graceful draining |
| Fake Cluster Mode | Emulation layer handling fragmented client configs for easier migration |
| Starlark Config | Runtime-programmable routing without server rebuilds |
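The idea behind the fanout row can be sketched as follows. Redis Cluster assigns each key to one of 16384 hash slots using CRC16 (XMODEM variant) of the key, or of its `{hash tag}` if present, and rejects multi-key commands that span slots with a CROSSSLOT error. A proxy can avoid that by grouping keys by owner, running the groups in parallel, and reassembling results in the original order. The `Fanout` function, the even slot-to-shard split, and the `exec` callback are assumptions for illustration, not FigCache's implementation.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

const slotCount = 16384 // number of hash slots in Redis Cluster

// crc16 implements CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0.
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = (crc << 1) ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// KeySlot returns the cluster hash slot for a key. If the key contains a
// non-empty {tag}, only the tag is hashed.
func KeySlot(key string) int {
	if open := strings.IndexByte(key, '{'); open >= 0 {
		if end := strings.IndexByte(key[open+1:], '}'); end > 0 {
			key = key[open+1 : open+1+end]
		}
	}
	return int(crc16([]byte(key))) % slotCount
}

// Fanout groups keys by shard, runs one batch per shard concurrently, and
// returns results in the caller's original key order. Assumptions: slots
// are split evenly across shards, and exec returns one value per key.
func Fanout(keys []string, shards int, exec func(shard int, keys []string) []string) []string {
	groups := make(map[int][]int) // shard -> indexes into keys
	for i, k := range keys {
		s := KeySlot(k) * shards / slotCount
		groups[s] = append(groups[s], i)
	}
	results := make([]string, len(keys))
	var wg sync.WaitGroup
	for shard, idxs := range groups {
		wg.Add(1)
		go func(shard int, idxs []int) {
			defer wg.Done()
			batch := make([]string, len(idxs))
			for j, i := range idxs {
				batch[j] = keys[i]
			}
			for j, v := range exec(shard, batch) {
				if j < len(idxs) {
					// Each goroutine writes to a disjoint set of indexes.
					results[idxs[j]] = v
				}
			}
		}(shard, idxs)
	}
	wg.Wait()
	return results
}

func main() {
	keys := []string{"user:1", "user:2", "{cart}:a", "{cart}:b"}
	out := Fanout(keys, 3, func(shard int, batch []string) []string {
		vals := make([]string, len(batch))
		for i, k := range batch {
			vals[i] = fmt.Sprintf("%s@shard%d", k, shard)
		}
		return vals
	})
	fmt.Println(out)
}
```

Keys sharing a hash tag, such as `{cart}:a` and `{cart}:b`, always land in the same group, so they can still be served by a single backend round trip.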
Results
- Six nines (99.9999%) uptime since rollout to Figma's main API service (H2 2025)
- Centralised observability and traffic management
- Dramatically reduced connection pressure on upstream Redis clusters
Key Takeaways
- When Redis becomes critical-path infrastructure, connection management and traffic isolation become existential problems
- A stateless proxy layer decouples client scaling from backend capacity, solving thundering herd and connection limit issues
- Configuration-driven (Starlark) architecture enables complex routing logic without deploying new binaries
- Custom protocol extensions (locking, draining) were only possible by building an in-house proxy — no open-source solution provided the needed extensibility
- Six nines is an exceptional reliability result for a caching infrastructure layer