Puddle
F1-Grade Offline Telemetry Stack
OFFLINE · F1-INSPIRED · SRE-GRADE

Puddle — F1 Telemetry, Offline. A Low-Latency Tick Pipeline.

A complete F1-style telemetry system running entirely on a local external SSD. Real-time tick ingestion, Go backend, assembly math engine, observability, GitOps, chaos engineering — all inside a k3d cluster, fully offline.

Explore the Stack · Runbooks
Core goal: Process tens of thousands of ticks per second, compute live F1-style predictions (lap time, sectors, gaps), and protect it with real SRE discipline.

Ultra-Fast Telemetry

Tick collectors, EKF processor, Redis cache, prediction API — optimized for microsecond-level hot-path latency.

Assembly-Enhanced Math

EKF and matrix operations can run via AVX2/AVX-512 assembly for maximum throughput.

Offline DevOps Lab

Everything (images, charts, repos, dashboards) is local — perfect for air-gapped SRE practice.

Mission Control UI

Web-based dashboard showing live gaps, sector predictions, tick rate, node health, alerts, traces.

Chaos Engineering

Inject timing delays, network loss, pod failures, disk pressure — simulate real incidents.

GitOps-Driven

All manifests stored in Gitea and deployed via ArgoCD, fully auditable and versioned.

TECHNICAL ARCHITECTURE

F1 Telemetry Stack (Offline)

Six functional domains, each summarized below.
Linux • External SSD • k3d
Infra
Linux + k3d
Single-host cluster (k3d) running entirely on an external SSD. No VMs, no cloud dependencies.
Platform
ArgoCD · Gitea · Registry
GitOps pipeline using Gitea as the internal repo and ArgoCD as the deployment controller. Local registry mirrors all images for air-gapped operation.
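An ArgoCD Application wired to the internal Gitea might look roughly like this — the repo URL, path, and namespaces are hypothetical, not the actual layout:

```yaml
# Hypothetical ArgoCD Application pointing at the in-cluster Gitea.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: telemetry-engine
  namespace: argocd
spec:
  project: default
  source:
    repoURL: http://gitea.gitea.svc/puddle/manifests.git
    targetRevision: main
    path: telemetry
  destination:
    server: https://kubernetes.default.svc
    namespace: telemetry
  syncPolicy:
    automated:
      prune: true    # delete resources removed from Git
      selfHeal: true # revert out-of-band cluster changes
```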
Telemetry Engine
Go + Assembly
Go microservices for collectors, EKF processor and predictions. Optional SIMD/AVX assembly implementation for extreme performance.
Data Layer
Redis · Parquet
Redis cluster for low-latency state; Parquet writer for historical archival on SSD.
Observability
Prometheus · Grafana · Jaeger · Loki
Metrics, logs, dashboards and distributed tracing built-in. Track latency, throughput and EKF computation cost.
Reliability
Chaos Mesh · SLOs
SLO-driven reliability targets with automated chaos tests validating the system's ability to sustain tick load.
QUICK CONFIG

Top-level tooling

k3d • Go Backend • Redis • ArgoCD • Gitea • Istio • Prometheus • Grafana • Loki • Jaeger • Chaos Mesh
# Local cluster
k3d cluster create puddle --agents 3 \
  --registry-create registry.local:5000

# Access UI
https://dashboard.spaceship.local
https://monitoring.spaceship.local

# Backend example
GET /api/live/gap/1/16
SLO
99.99% of ticks processed under 50ms end-to-end.
Error budget used: 8%
RUNBOOKS

On-call Playbooks

Runbooks cover incidents such as tick-lag spikes, EKF degradation, Redis failures, node loss, pod storms, chaos experiment fallout and more.

Browse runbooks · Repository

  • RB-F1-001 · Tick Pipeline Latency Spike — investigation
  • RB-F1-002 · Redis Hot Key — cache pressure mitigation
  • RB-F1-003 · Node Loss — rescheduling and rebalancing
  • RB-F1-004 · EKF Divergence — assembly fallback activation
DOCUMENTATION

Quick Links

  • Start: `./scripts/launch.sh` — boots k3d + registry + monitoring.
  • Backend: Go services for collector, EKF, API, broadcaster.
  • Math Kernel: Optional AVX2 SIMD assembly for matrix ops.
  • Telemetry UI: Mission Control dashboard (SvelteKit).
  • Chaos: Automated experiments validating resilience.