"Internal-only" Cloud Run isn't just a checkbox
Making a Cloud Run service private is not one toggle. It is a decision about ingress, routing, caller path, and IAM working together as one access model.
The real object is private reachability
Making Cloud Run “internal” isn’t one setting. It’s an access model.
Cloud Run has more than one way in. The default run.app URL. Any custom domains. Any external or internal load balancers you put in front of the service. Ingress settings decide which of those paths are even allowed to reach the service. IAM still decides who’s allowed through once a request gets there.
So the thing you’re actually designing is not “we clicked internal in the console.” It’s private reachability.
What the ingress modes actually change
Cloud Run gives you three ingress settings.
all is the least restrictive. Direct access to the default URL is allowed. Public access is possible if IAM allows it. Nothing unusual there.
internal-and-cloud-load-balancing is the useful middle shape. It still allows the sources covered by the more restrictive internal model, but it also allows traffic through an external Application Load Balancer. That means you can put things like IAP, Cloud Armor, or CDN in front of the service without leaving the default endpoint open to the internet.
internal is the tightest setting. That blocks internet access to the default run.app URL and custom domains, and only allows traffic that Cloud Run recognizes as internal.
The important part is that these settings describe allowed paths, not a complete security model by themselves.
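The three modes can be sketched as a toy lookup of which traffic sources each setting admits. This is a hedged model for illustration, not a Google API; the source names are this sketch's shorthand, and IAM still runs as a separate check after any allowed path.

```python
# Illustrative model of Cloud Run ingress settings. The source
# labels below are assumptions made for this sketch, not product
# terminology.

INGRESS_ALLOWS = {
    "all": {
        "public_internet", "external_alb", "internal_alb",
        "vpc_internal", "google_managed_internal",
    },
    "internal-and-cloud-load-balancing": {
        "external_alb", "internal_alb",
        "vpc_internal", "google_managed_internal",
    },
    "internal": {
        "internal_alb", "vpc_internal", "google_managed_internal",
    },
}

def ingress_allows(setting: str, source: str) -> bool:
    """True if the ingress setting even lets this source reach the
    service. Authorization (IAM) is still checked afterwards."""
    return source in INGRESS_ALLOWS[setting]
```

Under internal, a public-internet hit on the default run.app URL never even reaches the IAM check; under internal-and-cloud-load-balancing, the external Application Load Balancer is the one extra path.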
“Internal” is broader than it sounds
This is where people usually get sloppy.
“Internal” doesn’t just mean “inside one subnet” or “inside our VPC.” Cloud Run has its own definition. Same-project VPC traffic can count as internal. Some Shared VPC cases count as internal. VPC Service Controls can matter. An internal Application Load Balancer counts. And a specific set of Google-managed callers can also count as internal when the conditions line up.
That makes the model more useful, but also easier to misunderstand. “Internal” isn’t a folk concept here. It’s a documented routing model with platform-specific rules attached to it.
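A rough encoding of the VPC-side cases above makes the point visible. The field names and the helper are assumptions made for this sketch; the real classification is platform-defined and more detailed.

```python
# Illustrative-only: "internal" is a routing classification,
# not "inside our subnet". Flags are this sketch's assumptions.

def counts_as_internal(via: str, same_project: bool = False,
                       shared_vpc_case_allowed: bool = False) -> bool:
    if via == "internal_alb":
        # An internal Application Load Balancer counts.
        return True
    if via == "vpc":
        # Same-project VPC traffic counts; some Shared VPC cases do too.
        return same_project or shared_vpc_case_allowed
    # Anything else (e.g. the public internet) is not internal,
    # regardless of credentials.
    return False
```

Note what the sketch does not model: credentials. A caller with a perfectly valid identity still isn't internal if the path isn't.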
The caller path matters as much as the setting
One of the easier mistakes is assuming the destination setting is the whole story.
It isn’t.
If a Cloud Run service or App Engine service calls another Cloud Run service that is set to internal or internal-and-cloud-load-balancing, the request doesn’t magically count as internal just because both services live in the same Google Cloud project and everybody involved feels private in spirit. The traffic has to use a path Cloud Run recognizes as internal.
That means the source side matters too. VPC routing matters. Private Google Access can matter. Private Service Connect can matter. Internal load balancing can matter. The access model is defined by how the request actually gets there, not just by what the destination dropdown says.
IAM still matters
Ingress isn’t authorization.
You can make the network path restrictive and still have the wrong IAM shape. You can also get IAM right and still expose a path you didn’t mean to expose. Those are different failures.
That’s why “internal-only” is never really about one control. Ingress says which paths are allowed. Routing says how traffic reaches them. IAM says who’s allowed to call once they do.
Drop any one of those and the system starts relying on assumptions instead of design.
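For the IAM half, the standard service-to-service pattern is to mint an ID token from the metadata server and send it as a bearer token. A minimal sketch, assuming the documented metadata endpoint and a hypothetical target URL; the network path the request takes is still a separate question.

```python
import urllib.request

METADATA_HOST = "http://metadata.google.internal"

def identity_token_url(audience: str) -> str:
    # The audience must be the URL of the receiving service;
    # Cloud Run rejects tokens minted for a different audience.
    return (
        f"{METADATA_HOST}/computeMetadata/v1/instance/"
        f"service-accounts/default/identity?audience={audience}"
    )

def fetch_id_token(audience: str) -> str:
    # Works only on Google-managed compute (Cloud Run, GCE, ...),
    # where the metadata server is reachable.
    req = urllib.request.Request(
        identity_token_url(audience),
        headers={"Metadata-Flavor": "Google"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

# The token goes in the Authorization: Bearer header of the call.
# IAM then checks that the calling service account holds
# roles/run.invoker on the target. None of this changes whether
# the network path itself counted as internal.
```

Getting this right and still leaving ingress at all is the "IAM right, path wrong" failure from above; the reverse is just as possible.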
Google-managed callers make this weirder
Cloud Run also has a category of Google-managed callers that are treated specially under the internal model.
Scheduler, Tasks, Eventarc, Pub/Sub, Workflows, BigQuery, and a few other services can count as internal when they are in the same project or VPC Service Controls perimeter and use the default run.app URL. Useful, but it’s exactly the kind of thing you forget six months later and then get surprised by during an access review.
That doesn’t make the system worse. It just means “internal” isn’t synonymous with “came from our VPC.”
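The condition worth remembering can be written down in one line. A sketch with assumed flag names, encoding only what the paragraph above states:

```python
# Assumed flags for illustration; the precise per-service rules
# live in the platform, not here.

def google_managed_counts_as_internal(same_project: bool,
                                      same_vpc_sc_perimeter: bool,
                                      uses_default_run_app_url: bool) -> bool:
    # e.g. Cloud Scheduler, Cloud Tasks, Eventarc, Pub/Sub push,
    # Workflows, BigQuery
    return (same_project or same_vpc_sc_perimeter) and uses_default_run_app_url
```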
Private access is still a routing problem
If the goal is “only private clients should reach this service,” the real question becomes which routing shape you want to depend on.
Sometimes Private Google Access is enough. Sometimes you want Private Service Connect because you want an internal IP shape for run.app. Sometimes you want an internal Application Load Balancer because the service is one backend in a broader private routing design.
Those are different architectures. The networking story matters more than the label. Once the service is meant to be private, ingress and routing stop being console trivia and start becoming part of the runtime design.
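The decision above can be compressed into a toy helper. The labels are this article's shorthand for three different architectures, not product configuration, and real decisions have more inputs than two booleans:

```python
# Toy decision sketch: which private routing shape to depend on.
# Inputs and outputs are this article's shorthand, nothing more.

def private_routing_shape(service_is_lb_backend: bool,
                          need_internal_ip_for_run_app: bool) -> str:
    if service_is_lb_backend:
        # One backend in a broader private routing design.
        return "internal application load balancer"
    if need_internal_ip_for_run_app:
        # An internal IP shape for the run.app endpoint.
        return "private service connect"
    # Often sufficient on its own.
    return "private google access"
```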
This still fits Cloud Run just fine
None of this means Cloud Run stops being a good fit for internal services.
It’s still a strong default for internal APIs, automation endpoints, private tools, and smaller service surfaces where the runtime model still fits. The problem isn’t that Cloud Run can’t do private access. The problem is pretending private access is simpler than it is.
It isn’t one toggle. It’s a combination of ingress, caller path, routing, and IAM all lining up on purpose.
When the shape starts getting heavier
At some point, the networking story can become the architecture project.
If private reachability rules are spreading across a lot of services, if east-west traffic and service topology are getting denser, or if ingress, routing, and policy boundaries are becoming the main thing you’re designing around, then the runtime question may need to move too.
At that point, Cloud Run can still be good for some workloads while the broader system starts wanting something more cluster-shaped.
The point
Private access is a system shape, not a checkbox.
Ingress matters. Caller path matters. Routing matters. IAM still matters. If those pieces aren’t designed together, “internal-only” is just a comforting label sitting on top of a fuzzy access model.
More in this domain: Infrastructure
How we decide between Cloud SQL connectors, Auth Proxy, and private IP
Cloud SQL connectors, the Auth Proxy, and private IP are not interchangeable secure connection options. They change identity, routing, deployment shape, and how much network plumbing the team actually owns.
IAM DB auth for Cloud SQL: when it simplifies security and when it complicates delivery
IAM DB auth can reduce password sprawl and make revocation cleaner, but it also turns database access into an identity operating model that depends on disciplined service-account boundaries.
Safe scaling defaults for Cloud Run + Postgres
Cloud Run autoscaling is not a database strategy. Safe defaults keep the application from scaling itself into a Postgres incident before the team understands the workload.
Cloud Run request timeouts don't kill your code (so your architecture has to)
A Cloud Run request timeout ends the request, not necessarily the work. If the operation can outlive its caller, the system needs explicit job semantics instead of hope.
Cloud Run scaling from zero is a feature until it isn't
Scale to zero is a good default for request-driven services, until startup delay, warm-capacity needs, or instance caps turn it into user-visible reliability behavior instead of a pricing feature.
Related patterns
Direct VPC egress vs Serverless VPC Access for Cloud Run: our default
We default to Direct VPC egress for Cloud Run because it is the cleaner networking shape: fewer moving parts, no connector resource, and costs that scale with the service instead of beside it.
GKE Autopilot as the escape hatch from Cloud Run
When Cloud Run stops fitting, the next move is usually GKE Autopilot: more Kubernetes-shaped control without immediately taking on the full burden of Standard clusters.
Why we default to Cloud Run for SME internal platforms
For SME internal platforms, Cloud Run is our default because it covers a large share of useful workload shapes without forcing teams to own cluster operations before they have earned that surface area.
When repeated Pulumi code earns abstraction and when it doesn't
We don't abstract repeated Pulumi code just because it shows up more than once. We do it when the shared shape is real, the behavior is stable enough to deserve a boundary, and the result is easier to read than the duplication it replaces.