Constraints without enforcement: still worth it?

Non-enforced constraints are useful when they tell the truth. They act as semantic contracts and optimizer hints, but they become actively dangerous the moment the warehouse is asked to trust a lie.

By Ivan Richter

Last updated: Mar 29, 2026

6 min read

Non-enforced doesn’t mean meaningless

BigQuery doesn’t enforce primary keys or foreign keys. That’s the part everybody repeats, usually with the tone of someone announcing they’ve found the loophole in the system. Fine. The more useful question is what a declared constraint is still worth when the warehouse won’t police it for you.

Quite a lot, if the declaration is honest.

A good constraint does two jobs at once. It tells the next person reading the model what one row is supposed to be, and it tells the engine what shape it’s allowed to assume. That’s not decorative. A primary key says “this is identity.” A foreign key says “this relationship is stable enough to depend on.” When those claims match reality, the table gets easier to read and, sometimes, the warehouse gets more room to optimize: clearer semantics and, in some cases, better execution.

The problem isn’t that BigQuery leaves enforcement to you. The problem is that people hear “not enforced” and start treating declarations like aspirational metadata. That’s where this goes bad. A non-enforced constraint is still a claim, and once you publish it, you’re asking both humans and the warehouse to trust that claim.

A key declaration is a statement about identity

Declaring a key is one of the strongest things a model can say about itself.

If you declare a primary key, you’re saying the grain is settled. You’re saying one row represents one thing, that the thing has a stable boundary, and that duplicates aren’t part of normal behavior. If you declare a foreign key, you’re saying the parent-child relationship is real enough that downstream logic can treat it as a dependable shape rather than a best-effort association.

That only helps when the model has already earned the right to say it. If the grain still takes a whiteboard session and three caveats to explain, it isn’t ready. If the entity boundary keeps shifting depending on who built the upstream table, it isn’t ready. If the relationship holds only on clean days and quietly falls apart during backfills, late-arriving data, or partial loads, it isn’t ready.

The dangerous part is not the missing enforcement

The usual complaint is still technically true: BigQuery won’t stop bad data from landing just because you declared a key. That isn’t what tends to do the damage. The damage starts when a key gets declared anyway and then left in place long after the model stopped deserving it.

BigQuery’s optimizer can use declared constraints, for example to eliminate or reorder joins. That’s helpful when the declaration is true. When it isn’t, you’ve handed the warehouse permission to trust a bad assumption, and a query planned on that assumption can return wrong results. At that point this isn’t just a documentation problem. The schema has moved from vague to confidently wrong, which is a much worse failure mode. Vagueness slows people down. False certainty lets them break things faster.
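As a rough illustration, assume two hypothetical tables, mart.orders and mart.order_lines, with order_lines.order_id declared as a foreign key to orders.order_id. The query below selects nothing from the parent table, so the join exists only to drop orphaned lines. That is the kind of join an optimizer is entitled to skip when it trusts the key. Whether BigQuery actually skips it for a given query is the planner’s call, so read this as a sketch of the risk, not a guaranteed plan.

```sql
-- Assumption: order_lines.order_id is declared as a foreign key
-- to orders.order_id (not enforced). Table names are illustrative.
-- No column from orders is selected, so the join only filters out
-- lines whose parent order is missing.
select
  l.order_line_id,
  l.quantity
from mart.order_lines as l
inner join mart.orders as o
  on l.order_id = o.order_id;
```

If orphaned lines exist, the literal join drops them and a plan that skips the join keeps them. Same SQL, different answer, no error.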

So the standard has to be stricter than people usually want. If duplicates still show up as part of normal system behavior, don’t declare uniqueness. If missing parents are common enough that everybody quietly builds around them, don’t declare the foreign key. If the grain still shifts when a new source lands or a reporting definition changes, leave the constraint out and fix the model first. Same rule as with unique keys in incrementals: you don’t publish stability because you’d prefer the table to behave that way. You publish it after the table already does.

Validation has to exist somewhere else

Because BigQuery won’t enforce the rule at write time, you need some other mechanism that does the enforcing: tests, assertions, reconciliation queries, load checks, ownership, review gates, or some combination of them. The model needs a real way to notice drift before drift becomes normal.
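One possible shape for that mechanism is a pair of assertions run after each load. This is a minimal sketch using BigQuery’s ASSERT statement against hypothetical tables mart.orders and mart.order_lines. The table names and messages are assumptions for illustration, not a prescribed setup.

```sql
-- Fails the script if any order_id appears more than once.
assert (
  select count(*)
  from (
    select order_id
    from mart.orders
    group by order_id
    having count(*) > 1
  )
) = 0 as 'mart.orders.order_id is not unique';

-- Fails the script if any order line points at a missing order.
assert (
  select count(*)
  from mart.order_lines as l
  where not exists (
    select 1
    from mart.orders as o
    where o.order_id = l.order_id
  )
) = 0 as 'mart.order_lines has rows with no parent order';
```

Because the second check reads the same tables the constraint describes, it’s worth running it once against a deliberately broken fixture to confirm it fires.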

Without that, a non-enforced key is just a hopeful sentence in DDL.

At that point warehouse design often turns theatrical. Teams declare constraints because the model “should” have them, then never put any validation around the claim. The table looks more finished, everybody gets to feel tidy, and nothing about the actual risk changes. The first time uniqueness drifts or referential integrity starts failing, the constraint stays in place because removing it would force an awkward admission: the schema was describing the model people wanted, not the one they actually had.
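One way to push back on that is to inventory the claims a dataset is making, then check that each one has a validation behind it. This is a minimal sketch, assuming a dataset named mart and BigQuery’s INFORMATION_SCHEMA.TABLE_CONSTRAINTS view. Confirm the column names against the current documentation before building on it.

```sql
-- Lists every declared key in the (hypothetical) mart dataset.
-- Each row is a claim that should map to a test or assertion somewhere.
select
  table_name,
  constraint_name,
  constraint_type,
  enforced
from mart.INFORMATION_SCHEMA.TABLE_CONSTRAINTS
order by table_name, constraint_type;
```

Any row in that result with no matching check is a declaration running on good intentions.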

That’s useless at best and dangerous at worst. The warehouse isn’t helped by statements that only sound right in a design review.

What truthful constraints buy you

The table becomes easier to understand without reverse-engineering its behavior from sample data and query history. A reader can see the grain. They can see where identity lives. They can see which relationships are supposed to hold. That’s not minor. A lot of analytical work gets built on top of tables that technically run but never state their own semantics clearly. Then everyone wonders why downstream logic keeps splintering.

There is also an execution upside. BigQuery can make stronger decisions when it’s allowed to assume the declared shape is true. That’s useful, but it’s secondary. We wouldn’t declare constraints for optimizer hints alone. The contract comes first. Performance is what you get after the contract is already trustworthy.

And none of this rescues a weak model. Constraints don’t fix muddled grain. They don’t clean up unreliable upstream ingestion. They don’t replace sane table design, reviewable transformations, or validation. They help a good model speak plainly. That’s the job.

-- One row per order; order_id is the declared identity.
create table mart.orders (
  order_id string not null,
  customer_id string,
  order_date date,
  net_revenue numeric,
  primary key (order_id) not enforced
);

-- One row per order line; each line claims a parent row in mart.orders.
create table mart.order_lines (
  order_line_id string not null,
  order_id string not null,
  sku_id string,
  quantity int64,
  foreign key (order_id) references mart.orders(order_id) not enforced
);

That schema is useful only if mart.orders.order_id is genuinely unique and mart.order_lines.order_id really does reference valid parent rows under normal operating conditions. If those assumptions fail regularly and the declarations stay in place, the schema has stopped describing the warehouse. Now it’s inventing one.
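Both assumptions can be checked directly. The queries below are a sketch of the reconciliation we have in mind. They return the offending rows instead of a pass/fail, which makes them more useful when you’re deciding whether a table has earned its declarations.

```sql
-- Duplicate order_id values in the parent table, worst offenders first.
select
  order_id,
  count(*) as row_count
from mart.orders
group by order_id
having count(*) > 1
order by row_count desc;

-- Order lines whose order_id has no matching parent row.
select
  l.order_line_id,
  l.order_id
from mart.order_lines as l
where not exists (
  select 1
  from mart.orders as o
  where o.order_id = l.order_id
);
```

Two empty results under normal operating conditions, including backfills and late loads, is the bar. Anything else means the declarations are ahead of the data.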

How we treat them

We don’t declare keys because the model would look more mature with them. We declare them after the identity story is already stable.

That means the grain can be stated plainly. The entity boundary doesn’t change every time somebody asks a new reporting question. Duplicates are actually exceptional. Parent-child relationships hold under real system behavior, including the ugly parts like incremental updates, late data, and operational churn. And there is validation somewhere outside good intentions.

If that standard isn’t met, the constraint stays out. No compromise, no schema theater, no pretending the declaration itself will pull the model into shape later. It won’t.
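In practice that can mean creating tables without keys and attaching them later. This is a sketch using BigQuery’s ALTER TABLE statements, assuming mart.orders and mart.order_lines were created with no constraints. The constraint name fk_order_lines_order is hypothetical.

```sql
-- Declare the claims once validation has been passing consistently.
alter table mart.orders
  add primary key (order_id) not enforced;

alter table mart.order_lines
  add constraint fk_order_lines_order  -- hypothetical constraint name
  foreign key (order_id) references mart.orders(order_id) not enforced;

-- Withdraw the claims if they stop being true.
-- The foreign key goes first because it depends on the primary key.
alter table mart.order_lines
  drop constraint if exists fk_order_lines_order;

alter table mart.orders
  drop primary key if exists;
```

Dropping a key is cheap. Leaving a false one in place is not.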

The rule

Non-enforced constraints are worth declaring when they tell the truth and keep telling it.

Declare them after identity is real. Back them with validation. Treat them as claims the warehouse is allowed to trust. If the declaration is still aspirational, leave it out.