How we decide whether a transformation belongs in SQLX, code, or orchestration

We keep transformations in SQLX by default, move to code when the logic truly stops being legible in SQL, and keep orchestration for sequencing rather than business meaning.

By Ivan Richter

Last updated: Mar 24, 2026

The default

We keep transformations in SQLX by default, use code when code is actually clearer, and keep orchestration for coordination rather than meaning.

That isn’t a language purity rule. It’s a visibility rule. We want the behavior of the data platform to live in the layer where a reviewer would naturally expect to find it.

If someone has to read scheduler branches, helper internals, and runtime flags just to understand what a table means, the boundary is already wrong.

Why SQLX is the default

A large share of transformation work really is best expressed as named models with visible dependencies.

When the job is to define grain, join sources, apply business filters, compute metrics, and materialize a stable table shape, SQLX is usually the honest layer. The model lives where people expect it to live. Its inputs are visible. Its shape is visible. A reviewer can open the file and see the logic that defines the table instead of reconstructing it from scattered steps elsewhere.

That’s the practical benefit behind reviewable transformations. The point isn’t that SQLX is somehow virtuous. The point is that the repo stays inspectable when the transformation sits next to the model it defines.

What belongs in SQLX

Model semantics.

If the work is about what a row represents, how facts get joined, which filters define the business entity, or what columns make up the contract of the downstream table, we want that logic close to the model.

That includes the parts people sometimes try to smuggle elsewhere. Business filters. Join choices. Grain decisions. Metric definitions. Column-level shaping that changes what the table actually says. Those aren’t just implementation details. They’re part of the model.

That’s the same reasoning behind decision boundaries. The model should describe the business shape directly. It shouldn’t outsource meaning to runtime branches in some other layer because the SQL started feeling a little inconvenient.
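As a rough illustration of model semantics living in one visible place, here is a minimal sketch that runs a single SQL statement through Python's built-in `sqlite3` module as a stand-in for an SQLX model. The tables, columns, and the `build_customer_revenue` function are hypothetical and not taken from any real project. The point is only that grain, join, filters, and metrics can all be read in the same query.

```python
import sqlite3

# Hypothetical source tables; names, columns, and rows are illustrative only.
SETUP = """
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, status TEXT, amount REAL);
CREATE TABLE customers (customer_id INTEGER, is_test_account INTEGER);
INSERT INTO orders VALUES
  (1, 10, 'completed', 40.0),
  (2, 10, 'completed', 60.0),
  (3, 10, 'cancelled', 99.0),
  (4, 20, 'completed', 15.0),
  (5, 30, 'completed', 500.0);
INSERT INTO customers VALUES (10, 0), (20, 0), (30, 1);
"""

# The "model": grain, join choice, business filters, and metric
# definitions are all visible in one statement.
CUSTOMER_REVENUE = """
SELECT
  o.customer_id,                      -- grain: one row per customer
  COUNT(*)      AS completed_orders,  -- metric definition
  SUM(o.amount) AS revenue            -- metric definition
FROM orders AS o
JOIN customers AS c                   -- join choice
  ON c.customer_id = o.customer_id
WHERE o.status = 'completed'          -- business filter
  AND c.is_test_account = 0           -- business filter
GROUP BY o.customer_id
ORDER BY o.customer_id
"""


def build_customer_revenue():
    """Load the sample data and return the rows the model produces."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(CUSTOMER_REVENUE).fetchall()
    finally:
        conn.close()
```

If the `status` filter were instead applied by a flag passed in from a workflow, a reviewer reading this query would no longer be able to say what a row in the table means.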

What belongs in code

Some logic really is better in code.

Complex text normalization, reusable parsing, metadata expansion, API interaction, or helper behavior that would make the SQL materially harder to read may deserve code. Sometimes SQL can express the logic, but only in a way that makes the model worse to review. That’s the point where code earns its keep.
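For a sense of what that can look like, here is a minimal sketch of a text normalization helper. It assumes Python as the helper language, and the function name, the suffix list, and the normalization rules are all hypothetical. The same steps written as nested string functions inside a model would likely be much harder to review.

```python
import re
import unicodedata

# Hypothetical list of legal suffixes; a real list would be a team decision.
_LEGAL_SUFFIXES = ("inc", "llc", "ltd", "gmbh", "corp")


def normalize_company_name(raw: str) -> str:
    """Return a lowercase, accent-folded matching key without legal suffixes."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase and collapse every run of other characters into one space.
    text = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    # Strip trailing legal suffixes, but never empty the name entirely.
    words = text.split()
    while len(words) > 1 and words[-1] in _LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)
```

A helper like this only changes how a name is spelled for matching. Which companies count as customers would still be decided in the model.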

But the boundary still has to stay honest.

Code should support the model, not quietly become the place where the real business behavior lives. If the helper is doing something central to what the table means, and the SQLX now reads like a thin wrapper around mystery meat, the abstraction didn’t help. It just relocated the important part of the system to a place fewer people will inspect.

That’s the same judgment behind earned abstraction. An abstraction helps when it makes the calling layer easier to understand. If it turns the model into a riddle, it failed.

What belongs in orchestration

Orchestration should own sequencing, scheduling, dependency execution, retries, and operational controls. It should not own model semantics.

If the real transformation logic only becomes visible once someone reads task branches, runtime arguments, or workflow conditionals, then the workflow layer has become the semantic layer. At that point, the system might still run, but review gets ugly fast because the meaning of the data is no longer where the data is defined.

That’s what we’re trying to avoid with orchestration boundaries. Workflows should tell you when work runs, in what order, and under what operational conditions. They shouldn’t be the place where you discover what the table actually does.

It’s the same instinct behind Pulumi config boundaries. Different stack, same discipline. The layer that sequences or configures work shouldn’t quietly become the place where the real behavior lives.
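To make "thin" concrete, here is a minimal sketch of a workflow runner that owns only ordering and retries. It assumes Python and uses the standard library's `graphlib` (Python 3.9+). The `run_workflow` function and its parameters are hypothetical and do not describe any particular scheduler's API.

```python
import time
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_workflow(
    tasks: Dict[str, Callable[[], None]],
    depends_on: Dict[str, Set[str]],
    max_attempts: int = 3,
    retry_delay_seconds: float = 0.0,
) -> List[str]:
    """Run every task after its dependencies, retrying failed attempts.

    Returns the task names in the order they completed. Every name used
    in depends_on is assumed to be a key of tasks.
    """
    # Map each task to its predecessors so dependencies run first.
    graph = {name: depends_on.get(name, set()) for name in tasks}
    completed: List[str] = []
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                tasks[name]()
                break
            except Exception:
                # Operational control only: retry, then give up loudly.
                if attempt == max_attempts:
                    raise
                time.sleep(retry_delay_seconds)
        completed.append(name)
    return completed
```

Nothing in this runner knows what a table means. It takes opaque callables and decides when they run, which is the whole job.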

Why this ages better

Declarative layers age better because they keep intent visible.

A model in SQLX is easier to inspect than behavior scattered across scripts, wrappers, and workflow flags. A reviewer can read the shape of the transformation without having to replay a miniature runtime in their head. That’s not about elegance. It’s about reducing how much hidden state a person has to carry just to review one change properly.

That’s part of why declarative models age better than script-driven piles. They don’t eliminate complexity. They just keep more of the important complexity where people can still see it.

The decision rule

Keep logic in the highest-level layer that can still express it clearly.

If SQLX is honest and readable, keep it there. If code makes the model clearer, use code. If the problem is just coordination, use orchestration and keep it thin.

The wrong answer isn’t using code. The wrong answer is hiding model behavior in a layer nobody would naturally review.
