Blog
Insights, tutorials, and lessons learned from our platform engineering work.
GitOps Is the Right Model - But Not Before Your Platform Is Ready
GitOps gives you auditable, drift-free Kubernetes deployments. But scaled too early, it enforces inconsistency. A phased guide to GitOps adoption, repo structure, secrets management, and when to use Argo CD vs Flux.
There Is No Correct Platform Team Size
The right platform team size isn't a ratio. It's a function of how much complexity the platform carries. Here's why some small teams support hundreds of engineers while some large teams struggle with fifty.
Nobody Decided to Have 100 Kubernetes Clusters
Kubernetes cluster sprawl is one of the most expensive problems in platform engineering. A decision framework for multi-cluster management, consolidation, and when a new cluster is actually justified.
How Platform Teams End Up With Six-Figure Observability Bills
A practical comparison of Datadog vs the open-source LGTM stack (Loki, Grafana, Tempo, Mimir). How observability costs spiral, what the migration looks like, and when open source is the right move.
If You Rotated Every Credential Today, Something Would Break
Secrets sprawl makes credential rotation dangerous instead of routine. A practical guide to consolidating Kubernetes secrets management, building a rotation strategy, and eliminating credentials scattered across five systems.
Platform Teams Don't Get Cut Because They're Not Valuable
Platform teams get cut because they can't prove value, not because they lack it. How product ownership drives adoption, the metrics that matter, and how to make platform engineering visible to leadership.
Welcome to the KubeWright Blog
Lessons from real platform engineering engagements - the decisions that matter, the mistakes that cost, and what actually works at scale.