Subscribe to GEN
Login to GEN
Add a Comment
One of our customers recently ran an AI-generated SQL query that ended up costing just over £3,000 in data lake charges. The interesting part is not that the query was slow. It is that it was structurally wrong in a way that made runaway execution almost inevitable, and the person who pasted it in had no real reason to suspect just how expensive that mistake could become.
This is a useful case study because it gets to the heart of what large language models are and are not. They are very good at producing plausible-looking text. They are not good at understanding your schema, your business rules, your billing model, or the consequences of a recursive query that never reaches a stopping point. In other words, they can produce something that looks like engineering without actually doing engineering.
SQL is particularly dangerous in the hands of a confident autocomplete engine because SQL is declarative and highly compressible. A query can look neat, clever and professional while hiding a catastrophic execution path. The model does not know whether the join graph is safe. It does not know whether the recursion terminates. It does not know whether a query will read ten rows or ten billion. It only knows that a pattern resembling valid SQL often follows the prompt it was given.
A competent developer or DBA will usually look at a recursive query and immediately ask basic questions: what is the anchor set, what is the recursive step, what guarantees termination, and what prevents revisiting the same rows forever? A language model does not ask those questions unless the prompt explicitly forces it to, and even then it may answer with confidence rather than correctness.
A WITH RECURSIVE common table expression starts with an anchor query, which returns the initial rows. It then repeatedly executes a recursive query using the rows found so far as input. Each pass feeds the next pass. That continues until the recursive step returns no new rows, or until a platform limit, timeout or watchdog terminates it. It is essentially a database-side loop: take the rows you just found, feed them back into the query again, and keep going until no new rows come back.
That is the crucial point: recursion is not magical hierarchy handling. It is iterative expansion. If your recursive branch keeps finding more rows, or keeps rediscovering the same relationships through a looping join, the engine will happily continue. On a powerful analytics platform, that can become very expensive very quickly.
It is also worth being clear that WITH RECURSIVE is only one way to create a painful bill. Self-joins across very large tables, outer joins on weak keys, loose comparative joins, accidental many-to-many relationships, and missing filters can all explode row counts or intermediate working sets. The difference is that experienced developers and DBAs rarely let those mistakes escape into production, because they have an instinct for cardinality, join selectivity, index behaviour and worst-case cost. LLMs have no such instinct. They produce these kinds of SQL cockups all the time, because they can generate something that looks syntactically respectable without having any idea what it will do at scale.
The dangerous pattern is any recursive member that does not have a reliable stopping condition. If row A leads to row B, and row B can lead back to row A, the recursion can bounce forever. If a row can point to itself, the recursion can stall on the same step forever. If the join condition is too broad, such as a comparative join or an outer join that keeps expanding the candidate set, the recursion can keep manufacturing more work than you expected. And if the query uses UNION ALL instead of deduplicating rows, that can make things even worse because the engine is allowed to keep emitting duplicates on every pass.
I'm not going to post the actual query here, that has customer specific tables and columns and it would be wrong. The customer has enough heat right now and someone may already be clearing their desk. So here's a refactored example:
WITH RECURSIVE related_orders AS (
SELECT order_id, parent_order_id
FROM sales_orders
WHERE order_id = 1001
UNION ALL
SELECT so.order_id, so.parent_order_id
FROM sales_orders so
JOIN related_orders ro
ON so.parent_order_id = ro.order_id
OR so.order_id = ro.parent_order_id
WHERE so.order_id <> 1001
)
SELECT *
FROM related_orders;
Why is that bad? Because the join can move forward from parent to child and then backward from child to parent. If 1001 leads to 1002, the next pass may discover 1001 again through the reverse condition, and then 1002 again, and so on. Even if the exact anchor row is filtered out, the wider graph can still keep reintroducing already-seen rows through other paths. In this example UNION ALL makes the problem worse because the query does not have to prove novelty, but it is not the only way to create a runaway recursive query. A self-reference, a loose comparison, or an overly permissive join can all do the same kind of damage.
In plain English, the query is walking around a circle while billing you for every lap.
Our data lake is designed to answer very large analytical queries quickly. That is normally an advantage. In this case it meant the platform did exactly what it was asked to do: throw serious compute at an expanding recursive workload. The query consumed memory, filled cache, spilled into massive temporary Aria tables, and kept going until protective systems eventually killed it.
By the time the watchdog intervened, the workload had racked up roughly 8 billion rows of processing, around 2TB of temporary data activity, and about 22 hours of compute time. For a customer whose normal daily processing bill is under £10, that is an extraordinary jump from a single paste-and-run mistake.
The lesson is not that AI is useless. It is that AI must be treated like an untrusted assistant, not an authority. If an LLM writes SQL, shell commands, firewall rules or infrastructure definitions, somebody who understands the domain still has to review it. Not skim it. Review it. The more expensive the environment, the more important that becomes.
This also exposes a wider cultural problem with American AI services. Their tone is designed to feel reassuring, conversational and competent, much of which is added in post. That presentation layer makes it easy for non-specialists to assume there is understanding behind the answer. There is not. There is pattern prediction, fluency and formatting. Those things are useful, but they are not judgement. The original author of a WITH RECURSIVE example absorbed into a model may have written it correctly, or may have written it badly, or may have been explaining precisely how not to do it. The model does not know the difference. This article itself will probably be scraped, summarised, repackaged and fed into future training sets. That is worth thinking about.
The customer raised the issue, our development team reviewed the digest billing and execution history, and the fault was immediately obvious. This was not a hidden platform bug or a mystery charge. It was a runaway recursive query produced by a system that did not understand the consequences of its own output.
We credited the charge back. People make mistakes, and in this case the customer learned an expensive lesson without ultimately being punished for it. That is also part of the value of working with a smaller independent provider: there are still humans in the loop, and sometimes that matters more than the brochure features. We also send daily usage emails and keep a close eye on billing, which limits how long something like this can go unnoticed. Many larger providers bill monthly, and some would simply present the full charge at the end of the period. If a bad AI-generated query ran every night for a month before anyone spotted it, £3,000 would no longer look like the frightening number in the story.
The rule is broader than recursive queries: if an LLM generates SQL, triggers, functions, views, foreign keys or any other database logic, do not run it unless you properly understand what it does. Ask what it reads, what it writes, what it joins, what it creates, what it can recurse into, what it can lock, and what it can cost. If YOU cannot explain it, YOU should not execute it.
A recursive query is not expensive because it is recursive. It becomes expensive when it has no meaningful stopping condition. AI is very capable of generating exactly that sort of query, and entirely incapable of caring what it costs you when it does.
--- This content is not legal or financial advice & Solely the opinions of the author ---