There is no correct platform team size.
The question keeps coming up: “What is the right ratio of platform engineers to developers?” 1:10? 1:20? 1:50?
It’s the wrong question.
Platform teams don’t scale linearly with headcount. They scale with how standardised the platform is, how much variation they allow, and how much they own versus delegate.
We’ve seen small platform teams supporting hundreds of engineers. We’ve seen large platform teams struggling to support fifty. The difference wasn’t talent. It was design.
Why Ratios Don’t Work
The ratio model assumes that platform engineering effort scales proportionally with the number of engineers using the platform. It doesn’t.
A platform team supporting 200 engineers on a well-standardised platform with three deployment patterns, one observability stack, and clear self-service workflows might need five people. The same team supporting 50 engineers across six different deployment models, three monitoring tools, and constant ad hoc requests might need fifteen.
The variable isn’t how many people use the platform. It’s how much complexity the platform carries.
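To make the comparison concrete, the two hypothetical teams above work out to very different ratios. The following is a minimal Python sketch of that arithmetic, using only the figures from the example; the function name is illustrative.

```python
def developers_per_platform_engineer(developers: int, platform_engineers: int) -> float:
    """Return how many developers each platform engineer supports."""
    if platform_engineers <= 0:
        raise ValueError("platform_engineers must be positive")
    return developers / platform_engineers


# Figures taken from the example above.
standardised = developers_per_platform_engineer(200, 5)
fragmented = developers_per_platform_engineer(50, 15)

print(f"Standardised platform: 1:{standardised:.0f}")  # 1:40
print(f"Fragmented platform:   1:{fragmented:.1f}")    # 1:3.3
```

The larger organisation ends up with the leaner ratio, which is the opposite of what a headcount-driven model would predict.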
What actually drives platform team size
Variation - every additional pattern the platform supports requires ongoing maintenance, documentation, on-call coverage, and upgrade management. Running two ingress controllers means two sets of runbooks, two upgrade lifecycles, and two sets of edge cases to understand.
Manual intervention - if deploying a new service, creating a namespace, or provisioning infrastructure requires a platform engineer to do something manually, the team’s capacity is directly constrained by the number of requests.
Ownership breadth - a platform team that owns everything from CI pipelines to Kubernetes clusters to observability to developer tooling has a fundamentally different workload than one focused on infrastructure only.
Incident surface - more components, more clusters, and more variation mean more things that can break in more ways. On-call burden scales with the number of distinct failure modes, not the number of users.
Technical debt - legacy patterns that haven’t been deprecated, workarounds that became permanent, and systems that need replacement all consume platform engineering time without delivering new value.
The Two Platform Team Archetypes
In practice, most platform teams fall into one of two patterns:
The scaling team
This team has invested in standardisation and self-service. Their work looks like:
- A small number of well-defined, well-documented patterns
- Self-service workflows for common operations (new service, new environment, new namespace)
- Golden paths that handle 90% of use cases without platform team involvement
- Clear boundaries - the platform team owns the platform, not every team’s deployment problems
- Automated guardrails (admission policies, CI checks) that enforce standards without human review
This team can grow their user base without proportionally growing their headcount. Adding 50 more engineers to the platform doesn’t meaningfully increase the team’s workload if those engineers are using the standard patterns.
The scaling team typically operates at ratios of 1:40 or better, meaning one platform engineer for every 40 or more developers.
The absorbing team
This team has become the place where operational complexity goes to live. Their work looks like:
- Multiple ways to do the same thing, all of which are “supported”
- Manual steps in common workflows that require a platform engineer
- Frequent one-off requests that don’t fit existing patterns
- Tribal knowledge instead of documentation
- The platform team sits in the critical path for most changes
This team’s workload grows with every new user, every new service, and every new exception. They feel permanently understaffed because the demand is directly coupled to usage.
The absorbing team typically operates at ratios of 1:10 or worse - and still feels stretched.
How to Evaluate Your Platform Team’s Size
Rather than asking “how many people do we need?”, ask these questions:
How much of the platform team’s time is spent on recurring requests?
Track it for two weeks. If more than 30% of the team’s time goes to tickets, deployments, and ad hoc requests, the problem isn’t headcount - it’s a lack of self-service. Every recurring request is a missing automation.
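One way to run that two-week measurement is to tag logged hours by category and compute the recurring share. This is a rough sketch only: the category names and hours below are invented for illustration, and the 30% threshold is the one suggested above.

```python
from collections import defaultdict

# Hypothetical categories that count as recurring request work.
RECURRING = {"ticket", "manual_deployment", "ad_hoc_request"}


def recurring_share(entries):
    """Return the fraction of logged hours spent on recurring request work."""
    hours = defaultdict(float)
    for category, spent in entries:
        hours[category] += spent
    total = sum(hours.values())
    if total == 0:
        return 0.0
    recurring = sum(h for c, h in hours.items() if c in RECURRING)
    return recurring / total


# Made-up two-week log for a small team: (category, hours).
log = [
    ("ticket", 60),
    ("manual_deployment", 35),
    ("ad_hoc_request", 25),
    ("platform_roadmap", 160),
    ("incident_response", 40),
]

share = recurring_share(log)
print(f"Recurring work: {share:.1%}")
if share > 0.30:  # threshold from the text above
    print("Look at self-service before headcount.")
```

In practice the log would come from your ticketing system or time tracking rather than a hard-coded list.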
How many supported patterns exist for common operations?
Count the number of distinct ways teams deploy services, manage configuration, handle secrets, and access infrastructure. Each additional pattern is a multiplier on operational load. If you have five ways to deploy a service, you have roughly five times the maintenance, documentation, and on-call surface.
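The count itself can be as simple as an inventory of which mechanism each team uses for each operation. A small sketch, assuming a hand-maintained inventory; the team names and tool labels here are placeholders, not a recommendation.

```python
# Hypothetical inventory: which mechanism each team uses per operation.
inventory = [
    {"team": "payments", "deploy": "helm", "secrets": "vault"},
    {"team": "search", "deploy": "helm", "secrets": "vault"},
    {"team": "mobile-api", "deploy": "kustomize", "secrets": "sealed-secrets"},
    {"team": "reporting", "deploy": "custom-script", "secrets": "vault"},
]


def distinct_patterns(rows, operations):
    """Return the set of distinct mechanisms in use for each operation."""
    return {op: {row[op] for row in rows if op in row} for op in operations}


patterns = distinct_patterns(inventory, ["deploy", "secrets"])
for op, used in patterns.items():
    print(f"{op}: {len(used)} patterns ({', '.join(sorted(used))})")
```

Any operation with more than one or two entries is a candidate for consolidation.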
What would happen if you said no to exceptions for three months?
Most platform teams allow exceptions because saying no feels like obstruction. But exceptions compound. Each one adds a pattern to support, an edge case to handle, and a deviation from the standard.
If you hypothetically froze exceptions for three months, would the platform still serve 90% of use cases? If yes, the exceptions are adding cost without proportionate value. If no, the standard patterns have gaps that need addressing.
How often does the platform team sit in the critical path?
Map the common developer workflows - deploying a service, creating an environment, debugging a production issue. For each one, identify whether the platform team is required or optional.
Every workflow where the platform team is required is a bottleneck. The goal is to move the platform team out of the critical path for routine operations and into an enabling role for complex ones.
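The mapping exercise can be captured in a few lines. The sketch below is illustrative: the workflow names come from the examples above, while the required and routine flags are assumed values you would replace with your own findings.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    platform_team_required: bool  # can the workflow finish without the platform team?
    routine: bool                 # routine operation vs. genuinely complex one


# Assumed answers for illustration only.
workflows = [
    Workflow("deploy a service", platform_team_required=False, routine=True),
    Workflow("create an environment", platform_team_required=True, routine=True),
    Workflow("debug a production issue", platform_team_required=True, routine=False),
]


def bottlenecks(items):
    """Routine workflows that cannot complete without the platform team."""
    return [w.name for w in items if w.routine and w.platform_team_required]


print(bottlenecks(workflows))
```

Routine workflows that show up in this list are the first candidates for self-service; complex ones are where an enabling role makes sense.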
What’s the on-call burden?
If your platform team is getting paged frequently, the question isn’t whether you need more people on the rotation. It’s why the platform generates that many alerts. High on-call burden is usually a symptom of insufficient standardisation, missing automation, or unresolved reliability problems.
Reducing the Need for Headcount
The path to a smaller, more effective platform team isn’t cutting people. It’s reducing the complexity they need to manage.
Constrain variation ruthlessly
This is the single highest-leverage change a platform team can make. Fewer supported patterns means less maintenance, less documentation, less on-call surface, and less cognitive load.
Practically, this means:
- Defining a standard deployment model and migrating teams to it
- Deprecating legacy patterns with firm timelines
- Saying no to exceptions unless there’s a genuine technical requirement that the standard cannot satisfy
- Making the standard path easier than the exception path
Invest in self-service
Every operation that requires a platform engineer to perform manually is a scalability constraint. Common candidates for self-service:
- New service provisioning
- Environment creation
- Namespace and resource quota management
- Access and permissions management
- Certificate provisioning
- DNS record management
The implementation doesn’t need to be complex. A well-structured Terraform module with a PR-based workflow, a simple internal CLI tool, or even a documented kubectl command that teams can run themselves can eliminate a category of manual work.
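As an illustration of how small such a tool can be, here is a sketch of an internal CLI that generates a Namespace and ResourceQuota for a team. The label key, default quota values, and quota naming scheme are assumptions; only the Kubernetes object shapes are standard.

```python
import argparse
import json


def namespace_manifests(team: str, cpu: str, memory: str) -> dict:
    """Build a Namespace and ResourceQuota wrapped in a Kubernetes List."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        # The "owner" label is an assumed convention, not a Kubernetes requirement.
        "metadata": {"name": team, "labels": {"owner": team}},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{team}-quota", "namespace": team},
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
    }
    return {"apiVersion": "v1", "kind": "List", "items": [namespace, quota]}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Generate namespace manifests")
    parser.add_argument("team", help="team name, used as the namespace name")
    parser.add_argument("--cpu", default="4", help="total CPU requests allowed")
    parser.add_argument("--memory", default="8Gi", help="total memory requests allowed")
    args = parser.parse_args(argv)
    manifests = namespace_manifests(args.team, args.cpu, args.memory)
    print(json.dumps(manifests, indent=2))


# Example invocation with explicit arguments; a real script would call
# main() under an `if __name__ == "__main__":` guard instead.
main(["payments", "--cpu", "8"])
```

Teams could commit the generated JSON through a pull request or pipe it to `kubectl apply -f -`, depending on how much review the platform team wants to keep.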
Define clear ownership boundaries
Platform teams that own “everything infrastructure” inevitably absorb work that belongs elsewhere. Clear boundaries reduce scope creep:
- The platform team owns the platform - the shared infrastructure, tooling, and standards
- Application teams own their applications - deployment configuration, resource tuning, application-level monitoring
- A clear interface between the two - the golden path, the self-service tools, the documentation
When a request comes in, the first question should be “does this belong to the platform, or to the team’s application?” Not every infrastructure-adjacent problem is a platform team problem.
Automate guardrails, not approvals
Policy enforcement shouldn’t require a human in the loop. Admission controllers, CI pipeline checks, and automated compliance scanning can enforce standards at the point of change rather than through manual review.
This shifts the platform team from a gatekeeper role to a standards-setting role. They define the rules; the automation enforces them.
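A CI check of this kind can be very small. The sketch below validates a Deployment manifest that has already been parsed into a dictionary. The two rules (an `owner` label and CPU and memory limits on every container) are example policies chosen for illustration, not a prescribed standard.

```python
def violations(manifest: dict) -> list[str]:
    """Return human-readable policy violations for a Deployment manifest."""
    problems = []

    # Example rule 1: every workload must declare an owner.
    labels = manifest.get("metadata", {}).get("labels", {})
    if "owner" not in labels:
        problems.append("metadata.labels.owner is required")

    # Example rule 2: every container must set CPU and memory limits.
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        for resource in ("cpu", "memory"):
            if resource not in limits:
                name = container.get("name", "?")
                problems.append(f"container {name} has no {resource} limit")
    return problems


# Hypothetical manifest that breaks both rules.
deployment = {
    "kind": "Deployment",
    "metadata": {"name": "checkout", "labels": {"app": "checkout"}},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "web", "resources": {"limits": {"cpu": "500m"}}}
                ]
            }
        }
    },
}

for problem in violations(deployment):
    print(f"POLICY: {problem}")
```

In a pipeline, a non-empty result would fail the build, so the author gets feedback at the point of change and no platform engineer has to review the manifest by hand. The same rules could equally be enforced in-cluster by an admission policy.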
The Takeaway
The right platform team size isn’t a number or a ratio. It’s a function of how much complexity the platform carries and how much of the team’s capacity is consumed by variation, manual work, and exception handling.
Well-designed platforms reduce the need for platform involvement over time. That’s how a small team supports a large organisation - not by working harder, but by designing a platform that requires less human intervention to operate.
If your platform team feels permanently understaffed, the answer probably isn’t more people. It’s less variance.
If you’re trying to figure out the right size and shape for your platform team, we can help you work through that.