Why we default to Cloud Run for SME internal platforms
For SME internal platforms, Cloud Run is our default because it covers a large share of useful workload shapes without forcing teams to own cluster operations before they have earned that surface area.
The default
For SME internal platforms, Cloud Run is usually our default runtime.
Not for every workload forever. Not because Kubernetes is bad. Not because clusters are some kind of moral failure. Just because for small and mid-sized teams, operational simplicity compounds harder than theoretical flexibility.
Cloud Run lets us run real services and real jobs without taking on a cluster before the workload’s earned one. For a lot of internal systems, that’s the right trade for a surprisingly long time.
The target is not “everything”
This default is aimed at a specific kind of platform work.
Internal APIs. Scheduled jobs. Event-driven handlers. Admin backends. Automation endpoints. Reporting helpers. Small business systems. Integration services. Internal tools that are important enough to deserve proper deployment and observability, but not so demanding that they need a whole Kubernetes-shaped world on day one.
That’s the point of the default. A lot of internal platform work needs to exist, be dependable, and stay easy to operate. It doesn’t need to start with a cluster just because clusters exist.
What Cloud Run buys us
The main win isn’t magic. It’s ownership we don’t have to take on.
No node management. No cluster lifecycle. No node pools to size before the traffic pattern is even real. No control plane sitting there as a standing tax on every small internal service. Build the container. Deploy the revision. Run the thing. Keep moving.
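The whole runtime contract can fit in a manifest this small. A sketch using the Knative-style serving schema that Cloud Run accepts; the service name, project, and image path are placeholders, not a prescription:

```yaml
# Minimal Cloud Run service definition. No nodes, no pools, no control
# plane to own: a name, a container image, and revisions from there.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: internal-api   # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: europe-west1-docker.pkg.dev/my-project/apps/internal-api:latest
```

Deploying a file like this (for example with `gcloud run services replace`) is the whole lifecycle: build the container, push a revision, move on.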
That sounds obvious until you’ve seen how much time teams burn owning an infrastructure shape they didn’t need yet.
Cloud Run keeps the runtime model small, and that matters more than people admit.
Why this works so well for small teams
Small teams don’t usually lose by lacking theoretical platform power.
They lose by owning too much too early. Too many moving parts. Too many things that can be misconfigured. Too many layers that have to be understood before anybody can ship a change safely.
Cloud Run is strong in exactly that environment because it strips out a large chunk of platform overhead without forcing the team into toy constraints. Services still feel like services. Jobs still feel like jobs. Observability, deployment, IAM, and revision management still look like grown-up infrastructure. The difference is that you aren’t carrying a cluster around just to prove you could.
The workload still has to fit
This default only works while the workload shape stays honest.
If the system is request-driven, event-driven, or job-shaped in a way Cloud Run can represent cleanly, good. That’s where the platform is at its best.
If the service is pretending a fragile request lifecycle is a durable work model, the fit gets worse fast. That’s the boundary behind request timeouts. A timeout isn’t job control. A request ending isn’t the same thing as the work ending. If the design depends on that confusion, the runtime is not the real problem.
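The honest shape looks like this: the request only covers accepting the work, and a job-shaped worker owns finishing it. A minimal Python sketch with an in-memory stand-in for a durable queue (in practice something like Cloud Tasks or a Cloud Run job trigger; the names here are illustrative, not an API):

```python
import uuid

# Hypothetical stand-in for a durable queue. The point is the boundary,
# not the backend: the request enqueues, a worker completes.
QUEUE = []

def handle_request(payload):
    """The request lifecycle covers *accepting* the work, nothing more.
    If the request (or its timeout) ends here, the work is still owned."""
    task_id = str(uuid.uuid4())
    QUEUE.append({"id": task_id, "payload": payload})
    # 202 Accepted: the caller gets a task id, not the finished result.
    return 202, {"task_id": task_id}

def run_worker():
    """Job-shaped execution: it survives the original request ending,
    and retries/failure handling live here, not in a request timeout."""
    while QUEUE:
        task = QUEUE.pop(0)
        process(task)

def process(task):
    # Placeholder for the actual durable work.
    task["done"] = True
```

If the design can't be expressed this way, the problem isn't the timeout setting; it's that the request lifecycle is being used as job control.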
Same thing with scaling to zero. It’s a great default when request-driven wake-up is actually acceptable. It’s a worse fit when the workload wants warm capacity, continuous background activity, or different failure behavior than the request model gives you.
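When warm capacity genuinely is the requirement, Cloud Run can say so explicitly rather than being abandoned. A sketch using the standard autoscaling annotations on the revision template; the values are illustrative:

```yaml
spec:
  template:
    metadata:
      annotations:
        # Keep one instance warm instead of scaling to zero.
        autoscaling.knative.dev/minScale: "1"
        # Cap scale-out so a spike can't overrun downstream systems.
        autoscaling.knative.dev/maxScale: "10"
```

The point is that this is a deliberate, priced decision, not a default; if most of the system needs it, the workload may be drifting out of Cloud Run shape.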
Networking has to stay simple enough to be worth it
For small teams, runtime simplicity is only half the story. The network story matters too. One reason Cloud Run stays attractive is that the surrounding infrastructure can stay light when the service shape is still simple. Because of that we prefer Direct VPC egress as the normal networking shape. Fewer extra resources. Fewer sidecar infrastructure decisions. Less platform drag.
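As a sketch of how light that shape stays, Direct VPC egress is a pair of annotations on the revision template rather than a connector resource to provision and own; the network and subnet names here are placeholders:

```yaml
spec:
  template:
    metadata:
      annotations:
        # Direct VPC egress: the instance gets an interface in the subnet.
        # No Serverless VPC Access connector to size, pay for, or monitor.
        run.googleapis.com/network-interfaces: '[{"network":"my-vpc","subnetwork":"my-subnet"}]'
        run.googleapis.com/vpc-access-egress: private-ranges-only
```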
The same goes for private services. Cloud Run can handle private workloads just fine, but “private” still has to be designed as a real access model, with ingress, caller path, routing, and IAM all lining up cleanly.
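At minimum that access model has two layers working together, sketched here with the standard ingress annotation; the IAM side still has to grant `roles/run.invoker` to the intended callers:

```yaml
metadata:
  annotations:
    # "internal" restricts who can reach the service at the network layer;
    # IAM (roles/run.invoker) still decides who may actually call it.
    run.googleapis.com/ingress: internal
```

Neither layer alone makes the service private; the toggle only matters when ingress, caller path, and IAM agree.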
The default is strong because it has a boundary
A lot of bad platform decisions come from treating defaults like identity.
Cloud Run is good for a lot of SME internal platform work. That doesn’t mean everything should be bent until it fits. The point of a good default is to start from the smallest honest model, not to defend it past the point where the workload has clearly moved on.
That’s why this default works. It’s opinionated, but it isn’t religious.
Where the default stops applying
Eventually some systems stop being mostly Cloud Run-shaped.
Maybe the workload wants broader Kubernetes APIs. Maybe service topology gets denser. Maybe networking and private reachability start becoming a bigger design surface. Maybe the container estate is growing into something that wants cluster-native controllers, policies, and composition.
At that point, the answer is usually not “own a Standard cluster now and spend your afternoons thinking about nodes.” The usual next step is GKE Autopilot: that’s the real escape hatch. Not because Cloud Run failed, but because the workload stopped matching the reason Cloud Run was such a good default in the first place.
The point
We default to Cloud Run because it’s often the smallest runtime model that keeps delivery fast and ops boring, which is exactly what most SME internal platforms need.
Use it while it keeps the workload honest. Leave it when the system has actually earned more surface area.
More in this domain: Infrastructure
How we decide between Cloud SQL connectors, Auth Proxy, and private IP
Cloud SQL connectors, the Auth Proxy, and private IP are not interchangeable secure connection options. They change identity, routing, deployment shape, and how much network plumbing the team actually owns.
IAM DB auth for Cloud SQL: when it simplifies security and when it complicates delivery
IAM DB auth can reduce password sprawl and make revocation cleaner, but it also turns database access into an identity operating model that depends on disciplined service-account boundaries.
Safe scaling defaults for Cloud Run + Postgres
Cloud Run autoscaling is not a database strategy. Safe defaults keep the application from scaling itself into a Postgres incident before the team understands the workload.
Cloud Run request timeouts don't kill your code (so your architecture has to)
A Cloud Run request timeout ends the request, not necessarily the work. If the operation can outlive its caller, the system needs explicit job semantics instead of hope.
Cloud Run scaling from zero is a feature until it isn't
Scale to zero is a good default for request-driven services, until startup delay, warm-capacity needs, or instance caps turn it into user-visible reliability behavior instead of a pricing feature.
Related patterns
Direct VPC egress vs Serverless VPC Access for Cloud Run: our default
We default to Direct VPC egress for Cloud Run because it is the cleaner networking shape: fewer moving parts, no connector resource, and costs that scale with the service instead of beside it.
GKE Autopilot as the escape hatch from Cloud Run
When Cloud Run stops fitting, the next move is usually GKE Autopilot: more Kubernetes-shaped control without immediately taking on the full burden of Standard clusters.
"Internal-only" Cloud Run isn't just a checkbox
Making a Cloud Run service private is not one toggle. It is a decision about ingress, routing, caller path, and IAM working together as one access model.
When repeated Pulumi code earns abstraction and when it doesn't
We don't abstract repeated Pulumi code just because it shows up more than once. We do it when the shared shape is real, the behavior is stable enough to deserve a boundary, and the result is easier to read than the duplication it replaces.