Omni-Sync

Collaborative SRE Playbook Engine handling multi-user state synchronization with CRDTs and WebSockets. Embeds live Kubernetes telemetry and streaming logs into playbooks.

Go, Next.js, Tiptap, Redis, Yjs, WebSockets

The Problem

During Sev-1 incidents, Site Reliability Engineers work from static wiki pages while jumping between five different terminal windows. With no shared, live view of the incident, responders duplicate each other's debugging and downtime is extended.

System Architecture

A Next.js frontend uses Tiptap as the editor and Yjs, a Conflict-free Replicated Data Type (CRDT) library, to keep the shared playbook state consistent across users. The Go backend manages active WebSocket hub subscriptions and pipes live Prometheus metrics and Fluentd logs directly into the editor's blocks.
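The hub's subscription fan-out can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Hub`, `Event`, and topic names are hypothetical, and each subscriber channel stands in for the write loop of one WebSocket connection. It assumes that dropping events for a slow subscriber is acceptable for live telemetry, which the original design may or may not do.

```go
package main

import "sync"

// Event is a hypothetical telemetry payload (a metric sample or a log line).
type Event struct {
	Topic   string // illustrative naming, e.g. "metrics" or "logs"
	Payload string
}

// Hub tracks which subscribers listen to which topic.
// Each channel stands in for one WebSocket connection's write loop.
type Hub struct {
	mu   sync.RWMutex
	subs map[string]map[chan Event]struct{}
}

// NewHub returns an empty hub.
func NewHub() *Hub {
	return &Hub{subs: make(map[string]map[chan Event]struct{})}
}

// Subscribe registers a buffered channel for a topic and returns it.
func (h *Hub) Subscribe(topic string, buffer int) chan Event {
	ch := make(chan Event, buffer)
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.subs[topic] == nil {
		h.subs[topic] = make(map[chan Event]struct{})
	}
	h.subs[topic][ch] = struct{}{}
	return ch
}

// Unsubscribe removes the channel from the topic and closes it.
func (h *Hub) Unsubscribe(topic string, ch chan Event) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, ok := h.subs[topic][ch]; ok {
		delete(h.subs[topic], ch)
		close(ch)
	}
}

// Publish fans an event out to every subscriber of its topic and
// returns how many received it. The send is non-blocking: if a
// subscriber's buffer is full, the event is dropped for that
// subscriber so one slow client cannot stall the others.
func (h *Hub) Publish(ev Event) int {
	h.mu.RLock()
	defer h.mu.RUnlock()
	delivered := 0
	for ch := range h.subs[ev.Topic] {
		select {
		case ch <- ev:
			delivered++
		default: // buffer full: drop for this subscriber
		}
	}
	return delivered
}
```

In a real server, a goroutine per connection would read from the channel returned by `Subscribe` and write each event to its WebSocket, calling `Unsubscribe` when the connection closes.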

Technical Challenges & Trade-offs

The main challenge was handling WebSocket reconnects without corrupting CRDT state during network partitions. The solution is an offline-first synchronization queue that buffers operations while a client is disconnected and replays them to the Redis persistence layer upon reconnection.
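One way such a replay queue could work is sketched below. This is an assumption-laden illustration rather than the project's implementation: `Op`, `Sink`, `ReplayQueue`, and `MemorySink` are hypothetical names, the Redis layer is abstracted behind the `Sink` interface, and an in-memory sink stands in for it. The sketch assumes each operation carries a client-assigned ID so the store can ignore duplicates, which makes a retry after a partial failure safe. Yjs updates are themselves commutative and idempotent, so replaying them is safe in the same spirit.

```go
package main

import (
	"errors"
	"sync"
)

// Op is a hypothetical encoded editor update (for example, a binary CRDT update).
type Op struct {
	ID   uint64 // client-assigned, monotonically increasing
	Data []byte
}

// Sink abstracts the persistence layer (Redis in the original design).
type Sink interface {
	Apply(op Op) error
}

// ReplayQueue buffers operations produced while the connection is down.
type ReplayQueue struct {
	mu      sync.Mutex
	nextID  uint64
	pending []Op
}

// Enqueue assigns the next ID to the update and buffers it.
func (q *ReplayQueue) Enqueue(data []byte) Op {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.nextID++
	op := Op{ID: q.nextID, Data: data}
	q.pending = append(q.pending, op)
	return op
}

// Flush replays pending operations in order. It stops at the first
// error and keeps the failed operation and everything after it, so a
// later Flush resumes where this one left off.
func (q *ReplayQueue) Flush(s Sink) (int, error) {
	q.mu.Lock()
	defer q.mu.Unlock()
	sent := 0
	for sent < len(q.pending) {
		if err := s.Apply(q.pending[sent]); err != nil {
			q.pending = q.pending[sent:]
			return sent, err
		}
		sent++
	}
	q.pending = nil
	return sent, nil
}

// Len reports how many operations are still waiting to be replayed.
func (q *ReplayQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.pending)
}

// ErrOffline is returned by MemorySink while it simulates a partition.
var ErrOffline = errors.New("sink unreachable")

// MemorySink is an in-memory stand-in for the persistence layer.
// It ignores operations whose ID it has already seen.
type MemorySink struct {
	Offline bool
	seen    map[uint64]bool
	Log     []Op
}

// Apply stores the operation unless the sink is offline or the ID is a duplicate.
func (m *MemorySink) Apply(op Op) error {
	if m.Offline {
		return ErrOffline
	}
	if m.seen == nil {
		m.seen = make(map[uint64]bool)
	}
	if m.seen[op.ID] {
		return nil // duplicate replay: already persisted
	}
	m.seen[op.ID] = true
	m.Log = append(m.Log, op)
	return nil
}
```

A client would call `Enqueue` for every local edit and call `Flush` from its reconnect handler; while the sink is unreachable, `Flush` returns an error and the queue is left intact.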

Business Impact & Metrics

Reduced Mean Time To Resolution (MTTR) by 40% by securely embedding live telemetry in the same collaborative view as the incident playbooks.