Media & Entertainment / UK Media Company

Platform Modernisation & £485k Annual Savings

Modernised a large-scale Kubernetes platform serving major streaming services: cut costs by £485k/year, replaced legacy monitoring and logging, and built an Internal Developer Platform that enables engineering self-service.

£485k Annual cost savings
~100 Kubernetes clusters
~30 Platform engineers
2000+ Monitoring checks migrated
Technologies
AWS, EKS, Terraform, Terragrunt, Prometheus, Grafana, Loki, Tempo, GitHub Actions, Jenkins, OpenTelemetry, Sloth, Apache DevLake, Python, Ruby

Results

  • £450,000/year in cost savings from retiring a legacy logging platform and migrating to a modern, Kubernetes-native stack
  • £35,000/year in additional savings from consolidating hundreds of load balancers onto shared infrastructure, completed in 3 weeks
  • ~100 Kubernetes clusters managed across a standardised Common Platform
  • ~30 platform engineers supported across the team, from junior to lead level, with structured mentoring and hands-on upskilling
  • 2000+ monitoring checks migrated from a manually configured legacy system to a standardised, self-service alerting framework
  • Internal Developer Platform established with opinionated infrastructure modules, built-in alerting, and an SLO framework, enabling engineering teams to self-serve
  • CI/CD modernisation with migration from legacy pipelines to GitHub Actions, including self-hosted Kubernetes-backed runners
  • End-to-end tracing rolled out across the platform through OpenTelemetry instrumentation
  • DORA metrics and engineering intelligence introduced to give senior stakeholders visibility into delivery performance and platform adoption
  • Organisation-wide backup strategy for databases and object storage across the entire AWS estate
  • Brought back for a second engagement after the initial programme, a direct signal of the value delivered

The Problem

The Common Platform powers major streaming services, news platforms, and OTT products. A platform team of approximately 30 engineers was responsible for around 100 Kubernetes clusters, and the platform had accumulated significant technical debt as it grew:

  • Legacy logging infrastructure with high licensing costs and operational overhead - no path to scale without increasing spend
  • Fragmented monitoring requiring manual configuration for over 2000 individual checks, with no self-service capability for engineering teams
  • Inefficient load balancing with hundreds of individual load balancers instead of shared infrastructure
  • No developer self-service - engineering teams were dependent on the platform team for common infrastructure tasks, creating bottlenecks and slowing delivery
  • Limited visibility into platform health and delivery performance - no SLO framework, no engineering metrics, no way to measure improvement

What We Delivered

Observability Modernisation

Replaced the legacy logging and monitoring stack with a modern, Kubernetes-native observability platform. This eliminated the largest single infrastructure cost on the platform while improving visibility for engineering teams. We migrated 2000+ monitoring checks to a self-service alerting framework, allowing developers to define their own metric-based, log-based, and script-based alerts without platform team involvement.
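To illustrate what a self-service metric alert could look like, the sketch below turns a small developer-supplied definition into a Prometheus alerting rule group. The `AlertRequest` fields, the `build_rule_file` helper, and the example metric and label names are hypothetical and are not the framework built on this engagement. Only the output shape (`groups`, `rules`, `alert`, `expr`, `for`, `labels`, `annotations`) follows the standard Prometheus rule file format.

```python
import json
from dataclasses import dataclass


@dataclass
class AlertRequest:
    """Hypothetical alert definition a product team would own (an assumption)."""
    team: str
    name: str
    expr: str            # PromQL expression; the alert fires while it is true
    duration: str        # how long expr must hold before firing, e.g. "5m"
    severity: str = "warning"
    summary: str = ""


def build_rule_file(requests):
    """Group requests by team and render the Prometheus rule-file structure."""
    by_team = {}
    for req in requests:
        rule = {
            "alert": req.name,
            "expr": req.expr,
            "for": req.duration,
            "labels": {"team": req.team, "severity": req.severity},
            "annotations": {"summary": req.summary or req.name},
        }
        by_team.setdefault(req.team, []).append(rule)
    return {
        "groups": [
            {"name": f"{team}-alerts", "rules": rules}
            for team, rules in by_team.items()
        ]
    }


requests = [
    AlertRequest(
        team="playback",
        name="PlaybackApiHighErrorRate",
        # Metric and label names below are illustrative assumptions.
        expr='sum(rate(http_requests_total{job="playback-api",code=~"5.."}[5m]))'
             ' / sum(rate(http_requests_total{job="playback-api"}[5m])) > 0.05',
        duration="5m",
        severity="critical",
        summary="More than 5% of playback-api requests are failing",
    ),
]

rule_file = build_rule_file(requests)
# Printed as JSON for readability; a real pipeline would serialise to YAML.
print(json.dumps(rule_file, indent=2))
```

Keeping the team name as a label on every rule is one way to route alerts to their owners without the platform team maintaining per-team configuration.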

Internal Developer Platform

Designed and built an IDP that shifted common infrastructure tasks from the platform team to engineering self-service. This included opinionated infrastructure modules with built-in alerting, a standardised SLO framework, and self-service CI/CD runners. The platform team’s role shifted from fulfilling requests to maintaining and improving the platform itself.
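The arithmetic behind an SLO framework is simple enough to sketch. The example below shows how an availability objective translates into an error budget and a burn rate, which is the calculation that SLO tooling typically automates when it generates alerts. The 99.9% objective, the 30-day window, and the function names are assumptions chosen for illustration; the case study does not state the targets actually used.

```python
def error_budget_minutes(objective, window_days=30):
    """Minutes of full unavailability the objective allows in the window."""
    if not 0.0 < objective < 1.0:
        raise ValueError("objective must be between 0 and 1, e.g. 0.999")
    return (1.0 - objective) * window_days * 24 * 60


def burn_rate(error_ratio, objective):
    """How fast the budget is being spent; 1.0 means exactly on budget."""
    if not 0.0 < objective < 1.0:
        raise ValueError("objective must be between 0 and 1, e.g. 0.999")
    return error_ratio / (1.0 - objective)


def budget_consumed(rate, hours, window_days=30):
    """Fraction of the window's budget used by burning at `rate` for `hours`."""
    return rate * hours / (window_days * 24)


# Assumed example: a 99.9% objective over 30 days.
budget = error_budget_minutes(0.999, 30)   # roughly 43 minutes
rate = burn_rate(0.0144, 0.999)            # 1.44% of requests failing
used = budget_consumed(rate, hours=1)      # share of budget spent in one hour

print(f"budget: {budget:.1f} min, burn rate: {rate:.1f}x, used in 1h: {used:.1%}")
```

Alerting on burn rate rather than on raw error counts means a service with a looser objective is paged less often for the same error ratio, which is what lets one framework serve many teams.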

Distributed Tracing and Telemetry

Rolled out OpenTelemetry instrumentation across the platform to enable end-to-end tracing, with the goal of unifying metric collection, log shipping, and trace correlation into a single telemetry pipeline.

Engineering Intelligence

Introduced DORA metrics and KPI tracking to give senior stakeholders visibility into platform adoption and delivery performance, enabling data-driven decisions about platform investment.
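For readers unfamiliar with DORA metrics, the sketch below computes three of the four (deployment frequency, lead time for changes, and change failure rate) from a list of deployment records. The `Deployment` record, the `dora_summary` helper, and the sample data are hypothetical. This is not the Apache DevLake API or the reporting built for the client, and time to restore service is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime   # when the change was committed
    deployed_at: datetime    # when it reached production
    caused_failure: bool     # whether it led to an incident or rollback


def dora_summary(deployments, period_days):
    """Summarise delivery performance over a reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    lead_seconds = [
        (d.deployed_at - d.committed_at).total_seconds() for d in deployments
    ]
    failures = sum(1 for d in deployments if d.caused_failure)
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_seconds) / 3600,
        "change_failure_rate": failures / len(deployments),
    }


# Made-up sample data: four deployments in one week.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=2), False),
    Deployment(start, start + timedelta(hours=4), False),
    Deployment(start, start + timedelta(hours=6), True),
    Deployment(start, start + timedelta(hours=24), False),
]

summary = dora_summary(sample, period_days=7)
print(summary)
```

The median is used for lead time so that a single slow change does not dominate the figure reported to stakeholders.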

Backup and Disaster Recovery

Architected and implemented an organisation-wide backup strategy for databases and object storage, ensuring data protection and compliance across the entire AWS estate.