System Architecture
This chapter maps the Cirrus CDN system architecture end-to-end, covering deployment topology, component responsibilities, runtime dependencies, and data flows between subsystems. The descriptions below are derived directly from the source modules, including control-plane/src/cirrus/app.py, control-plane/src/cirrus/celery_app.py, control-plane/src/cirrus/cname/, and openresty/.
High‑Level Architecture
Redis acts as the central state store and pub/sub bus for all tiers. OpenResty reads per-domain configuration (cdn:dom:{domain}) and TLS assets (cdn:cert:{domain}) from Redis and subscribes to cdn:purge for cache invalidation.
Control vs Data Plane
| Control Plane | Data Plane |
|---|---|
| Configuration store (Redis) | Edge cache and routing |
| Pub/sub + task queue (Redis + Celery) | Health checks and self-heal |
| DNS scheduler/zone builder | Logs and Prometheus metrics exporter |
High Availability
- Multi-region active-active with independent edge nodes and DNS replicas
- Zero-downtime rollouts via templated Compose/Ansible (blue/green or rolling)
- Automated failover: NOTIFY-driven secondaries and health-aware node activation
- Self-healing loops: Celery health checks adjust each node's active state and trigger zone rebuilds
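The self-healing loop in the last bullet can be pictured as a small state machine over per-node health counters. The following is a minimal sketch under assumptions, not the Cirrus implementation: the NodeHealth class, its field names, and both thresholds are hypothetical and chosen only to illustrate how a flip of the active flag could gate a zone rebuild.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration; the source does not state the real values.
FAILS_TO_DEACTIVATE = 3
SUCCESSES_TO_ACTIVATE = 2


@dataclass
class NodeHealth:
    """Illustrative stand-in for the counters and flag kept per edge node."""
    active: bool = True
    fail_count: int = 0
    ok_count: int = 0


def record_probe(node: NodeHealth, healthy: bool) -> bool:
    """Update counters for one probe result.

    Returns True when the active flag flipped, which is the moment a
    zone rebuild would be requested in this sketch.
    """
    if healthy:
        node.ok_count += 1
        node.fail_count = 0
        if not node.active and node.ok_count >= SUCCESSES_TO_ACTIVATE:
            node.active = True
            return True
    else:
        node.fail_count += 1
        node.ok_count = 0
        if node.active and node.fail_count >= FAILS_TO_DEACTIVATE:
            node.active = False
            return True
    return False
```

Requiring several consecutive results before flipping the flag avoids rebuilding the zone on a single transient probe failure.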
Deployment Topology
Local (docker-compose)
docker-compose.yml orchestrates services:
- redis: Primary data store with AOF persistence.
- api: FastAPI server (serves REST + static Next.js export).
- openresty: Edge proxy on host network exposing 8080/8443; Prometheus on 9145.
- worker and beat: Celery worker/beat using Redis broker/backend.
- caddy: Local ACME directory (development CA).
- acmedns: acme-dns authority for DNS-01 challenges.
- nsd: Authoritative DNS slave receiving NOTIFY from hidden master.
- prometheus, grafana, fakedns: ops support.
Services api, worker, beat, caddy, acmedns, and nsd share the acmetest bridge network with static IPs for predictable trust and routing.
Production (Ansible)
- Multi-host deployments via ansible/ with inventory-driven variables.
- Separate scaling of API, workers, and edge nodes; NSD replicas for regional redundancy.
- Externalized Redis (managed/clustered) and certificate storage hardening.
- Blue/green or rolling updates; health checks gate activation in DNS.
Build & Runtime Artefacts
Python & Frontend Image (Dockerfile)
- Stage 1 builds the Next.js frontend with pnpm build, caching .next artifacts.
- Stage 2 (ghcr.io/astral-sh/uv:python3.13-trixie-slim) installs Python deps via uv sync (respects uv.lock), bundles the static export under /app/static, and uses docker/entrypoint.sh to patch CA trust and launch Uvicorn.
OpenResty Image (openresty/Dockerfile)
- Builder installs lua-resty-http and nginx-lua-prometheus, and compiles nginx_cache_multipurge.
- Templater renders nginx.conf from openresty/conf/nginx.conf.j2 (ports, resolver, cache sizing, metrics ACL).
- Runtime seeds a dummy TLS cert and loads Lua: access_router.lua (routing/caching), ssl_loader.lua (SNI-time cert load from Redis), redis_subscriber.lua (purge listener).
Prometheus Image (prometheus/Dockerfile)
Build selects prometheus.dev.yml or prometheus.prod.yml; default scrape includes OpenResty at 127.0.0.1:9145 and FastAPI /metrics.
Configuration & State Management
Redis is the single source of truth. Key namespaces include:
| Key Pattern | Purpose |
|---|---|
| cdn:domains (set) | All managed domain names. |
| cdn:dom:{domain} | JSON encoded domain configuration (DomainConf). |
| cdn:nodes (set) | Known edge node IDs. |
| cdn:node:{id} | Hash containing node IPs, health counters, and active flag. |
| cdn:cert:{domain} (hash) | TLS fullchain and private key for SNI load in OpenResty. |
| cdn:acme:{domain} | ACME registration state, including acme-dns credentials. |
| cdn:acme:lock:{domain} / cdn:acme:task:{domain} | Concurrency locks for issuance tasks. |
| cdn:tokens, cdn:token:{id}, cdn:token_hash:{hash} | Service token registry and lookups. |
| cdn:acmeacct:global | Shared ACME account key material. |
| cdn:acmecertkey:{domain} | Stored certificate private keys for reuse across renewals. |
Pub/sub channels include cdn:cname:dirty (for DNS zone rebuilds) and cdn:purge (cache invalidation).
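The key patterns and channel names above can be centralized in a small helper module so that producers and consumers never hand-assemble key strings. This is a minimal sketch, assuming hypothetical function names and a hypothetical normalization rule (lowercase, surrounding whitespace and trailing dot removed); only the key patterns and channel names themselves come from the table and sentence above.

```python
# Key and channel names taken from the namespace table above.
DOMAINS_SET = "cdn:domains"
NODES_SET = "cdn:nodes"
CHANNEL_ZONE_DIRTY = "cdn:cname:dirty"
CHANNEL_PURGE = "cdn:purge"


def _normalize(domain: str) -> str:
    """Assumed normalization: trim whitespace, drop a trailing dot, lowercase."""
    return domain.strip().rstrip(".").lower()


def domain_conf_key(domain: str) -> str:
    """Key holding the JSON-encoded domain configuration."""
    return f"cdn:dom:{_normalize(domain)}"


def cert_key(domain: str) -> str:
    """Key of the hash holding the TLS fullchain and private key."""
    return f"cdn:cert:{_normalize(domain)}"


def node_key(node_id: str) -> str:
    """Key of the hash holding node IPs, health counters, and active flag."""
    return f"cdn:node:{node_id}"


def acme_lock_key(domain: str) -> str:
    """Key used as a concurrency lock for issuance tasks."""
    return f"cdn:acme:lock:{_normalize(domain)}"
```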
Data Flows
- Domain Onboarding
- Node Lifecycle & Health
- Purge Flow
- TLS Issuance & Renewal
DNS & Traffic Engineering
- Hidden master (HiddenMasterServer) serves authoritative responses and supports AXFR to NSD secondaries.
- Zone generation picks replicas_per_site nodes per domain using rendezvous hashing (rendezvous_topk).
- OpenResty routes per request via access_router.lua using domain config from Redis; optional slice caching and rule-based cache controls.
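Rendezvous (highest-random-weight) hashing, as named in the zone-generation bullet, scores every node against the domain and keeps the k best scores. The sketch below illustrates the technique only: the function name matches the one mentioned above, but its signature, the SHA-256 scoring, and the tie-break rule are assumptions rather than the project's actual code.

```python
import hashlib


def _score(domain: str, node_id: str) -> int:
    """Deterministic pseudo-random weight for a (domain, node) pair."""
    digest = hashlib.sha256(f"{domain}|{node_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def rendezvous_topk(domain: str, node_ids: list[str], k: int) -> list[str]:
    """Return the k nodes with the highest weight for this domain.

    Each node's weight depends only on the domain and that node, so adding
    or removing other nodes never reorders the nodes that remain.
    """
    ranked = sorted(node_ids, key=lambda n: (_score(domain, n), n), reverse=True)
    return ranked[: max(k, 0)]
```

Because the ranking of surviving nodes is unchanged when a node disappears, only domains that had selected the removed node receive a replacement; every other assignment stays put.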
Dependencies & Integrations
- Redis (redis.asyncio in Python, resty.redis in Lua) – configuration store, cache, and pub/sub.
- Celery – asynchronous task execution with Redis broker/backend; periodic health checks and ACME renewal scans.
- acme-dns – DNS-01 validation authority for ACME issuance.
- Caddy – development-only ACME CA; workers trust it via entrypoint CA bundle.
- NSD – authoritative DNS secondaries receiving NOTIFY from HiddenMasterServer.
- Prometheus & Grafana – metrics scraping and visualization for API and OpenResty.
Integration Interfaces (Third‑party)
- REST API – All control interactions are exposed under /api/v1/* with cookie sessions or bearer service tokens. See Control Plane API & Data Model for endpoints and schemas. The default FastAPI docs are available at /docs when enabled.
- Metrics – Prometheus scrapes FastAPI (/metrics) and OpenResty (9145/metrics). Use remote write or data source plugins to integrate with external monitoring/BI platforms.
- Logs – OpenResty emits access/error logs to stdout/stderr and the Loki Docker logging driver forwards them. Query via Grafana/LogQL or export through the Loki API for downstream analysis.
- Redis Pub/Sub – The edge layer subscribes to cdn:purge. External producers can publish purge events by writing JSON payloads {domain, path} to this channel (observe access controls).
- DNS – Hidden master emits NOTIFY to NSD secondaries. External DNS stacks can slave from the hidden master using AXFR (allow list enforced in server settings).
- ACME – Integrates with acme-dns for DNS‑01 and a local CA (Caddy) in development. Production environments can point ACME_DIRECTORY to a public CA.
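For the Redis Pub/Sub interface above, an external producer needs only to serialize a {domain, path} object and publish it on cdn:purge. The sketch below is one way to do that under stated assumptions: the function names are hypothetical, the rule that path starts with "/" is an assumed validation rather than a documented requirement, and the publisher is passed in as a callable so the example does not depend on a live Redis connection.

```python
import json
from typing import Callable

PURGE_CHANNEL = "cdn:purge"  # channel name from the text above


def build_purge_message(domain: str, path: str) -> str:
    """Serialize a purge event as the JSON payload {domain, path}."""
    if not domain:
        raise ValueError("domain must be non-empty")
    if not path.startswith("/"):
        # Assumed validation for this sketch only.
        raise ValueError("path must start with '/'")
    return json.dumps({"domain": domain, "path": path}, separators=(",", ":"))


def publish_purge(
    publish: Callable[[str, str], object], domain: str, path: str
) -> None:
    """Send one purge event through any publish(channel, message) callable."""
    publish(PURGE_CHANNEL, build_purge_message(domain, path))
```

With redis-py, the publish method of a connected client can be passed as the callable, for example publish_purge(client.publish, "example.com", "/img/logo.png").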
Environments
The Docker composition targets local development; production deployments leverage Ansible playbooks referenced by the just deploy recipe (see ansible/). Environment variables (such as CNAME_BASE_DOMAIN, ACME_DIRECTORY, DNS_MASTER_PORT) must be tuned per environment. See the Appendices for a consolidated list.
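Per-environment tuning is easier to audit when the variables are read and validated in one place at startup. The following is a minimal sketch, assuming a hypothetical EnvSettings dataclass and load_settings helper; treating all three variables as required is also an assumption, and only the variable names come from the paragraph above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = ("CNAME_BASE_DOMAIN", "ACME_DIRECTORY", "DNS_MASTER_PORT")


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for the environment-specific values."""
    cname_base_domain: str
    acme_directory: str
    dns_master_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Read and validate settings, failing fast on missing or invalid values."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    port = int(env["DNS_MASTER_PORT"])  # raises ValueError if not numeric
    if not 1 <= port <= 65535:
        raise ValueError(f"DNS_MASTER_PORT out of range: {port}")
    return EnvSettings(
        cname_base_domain=env["CNAME_BASE_DOMAIN"],
        acme_directory=env["ACME_DIRECTORY"],
        dns_master_port=port,
    )
```

Passing the mapping explicitly, as in load_settings({"CNAME_BASE_DOMAIN": "...", ...}), keeps the helper testable without mutating the process environment.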
Scalability Considerations
- API/server processes are stateless aside from in-memory session cache; they can scale horizontally if the session mechanism is migrated to Redis.
- OpenResty scales via additional nodes registered through the /api/v1/nodes API; rendezvous hashing keeps per-domain assignment stable under churn and avoids hot-spotting.
- DNS hidden master remains single-instance; NSD can scale out for regional redundancy.
- Redis is a critical dependency; consider managed or clustered (or multi-AZ) deployments for production workloads.
Use just up for local orchestration, just pytest for backend tests, and pnpm dev inside control-plane/frontend/ for UI iteration. Prefer uv run for Python entry points to respect uv.lock.