AI’s Domestication Problem
Enterprise AI will not cross the production gap until companies learn how to manage non-human intelligence.
A useful recent piece on the “AI production gap” described the graveyard between the impressive weekend demo and the enterprise system that survives Monday morning.
The argument was that AI pilots fail because companies accumulate five kinds of debt: technical, operational, evaluation, integration, and governance debt.
That framing is right.
But there is another layer underneath it.
Most discussions about enterprise AI still treat the problem as if it is mainly about software becoming production-grade. Better orchestration. Better evaluation. Better monitoring. Better audit trails.
All necessary.
But AI is no longer being sold merely as software. It is being sold as an assistant, agent, analyst, SDR, operator, teammate — in other words, something suspiciously close to an employee.
And that changes the benchmark.
For AI to work inside companies, it cannot simply be impressive in a demo. It has to become as easy to work with as Fred from sales — the ever-smiling, mildly chaotic, but fundamentally usable human who follows up, escalates when needed, reads the room, and can be corrected over coffee.
Fred may not understand the architecture. But Fred knows when the client is annoyed.
AI, for all its intelligence, does not yet come with those office instincts.
So the production gap is not only a system problem. It is a management problem.
Companies already know how to manage flawed intelligence
Organizations are not new to unreliable intelligence. They have been hiring it for centuries.
Humans are inconsistent, political, tired, insecure, brilliant on Tuesday, useless on Friday, and occasionally dangerous after one promotion too many.
But companies have rituals for containing human inconsistency.
Hiring filters. Probation. Onboarding. Managers. Peer review. Escalation chains. Performance reviews. HR warnings. Training. Coffee chats. Gossip. Shame.
This entire messy operating system exists because everyone already knows that human beings are not deterministic machines.
Nobody expects Fred from sales to be perfect. The expectation is that Fred can be managed.
He can be corrected.
He can be coached.
He can be told, “Don’t say that to this client again.”
He can be asked, “Why did you promise delivery in two weeks?”
He can be taken aside after a bad meeting and given the sacred corporate sacrament: feedback over coffee.
AI enters the organization with intelligence, but without these rituals around it.
That is part of the missing debt.
Not technical debt alone. Not operational debt alone. Something more basic.
Call it organizational domestication debt: the gap between an AI system’s raw capability and the organization’s ability to make that capability safe, useful, supervised, corrected, trusted, and gradually more autonomous.
The buyer does not want to build the machine room
The technical debt argument is correct. Production AI needs routers, retries, fallback models, API handling, state management, observability, cost control, security, and all the other unsexy plumbing that makes software survive contact with reality.
But from the buyer’s side, this creates a different problem.
A client who buys a CRM does not expect to build lead routing, permission systems, activity logs, workflow rules, and API exception handling from scratch. That is why the CRM exists.
SaaS trained buyers to expect the machine room to be hidden.
So when enterprise AI requires the buyer to think about routers, model fleets, fallback chains, prompt versioning, hallucination handling, evaluation harnesses, and governance workflows, the buyer is not necessarily blind to the debt.
He simply refuses to become the one carrying it.
That is a very reasonable refusal.
The buyer is not saying, “This complexity does not exist.”
The buyer is saying, “Why is this my problem?”
This is why many AI pilots get stuck after the demo. The demo sells intelligence. Production requires infrastructure. But the buyer thought he was buying an outcome.
He thought he was getting Fred.
Instead, he got a brilliant intern with exposed wiring.
Humans come with survival firmware
Operational debt is usually framed as ownership: who monitors the AI system, who updates prompts, who checks drift, who handles escalation, who maintains the thing after launch.
Again, correct.
But there is a stranger layer underneath.
Humans also require maintenance. People drift. People misunderstand. People get bored. People get political. People forget the SOP. People optimize for looking useful rather than being useful.
But humans participate in their own maintenance.
A confused employee asks questions. A scared employee escalates. A politically aware employee reads the room. A hungry employee protects salary, status, belonging, and the right to remain inside the tribe.
The human worker comes with survival firmware.
AI does not.
It does not fear being fired. It does not want promotion. It does not notice when the boss’s silence means, “You are in trouble.” It does not feel the temperature drop in a meeting. It does not overhear the corridor truth after the official truth.
It has language, but no office survival instinct.
No shame. No ambition. No fear of being left out of the WhatsApp group.
This matters because a lot of organizational reliability is not written down anywhere. It is carried through social pressure, imitation, fear, incentives, jokes, glances, tone, memory, and the thousand small signals by which humans keep each other aligned.
AI does not naturally live inside that web.
So operational debt is not only the question, “Who monitors the AI?”
It is also the question, “Who supplies the survival instinct that the AI does not have?”
A 95% accurate employee is a terrifying employee
Evaluation debt is where the problem becomes most visible.
To a technology team working on a cutting-edge capability like AI, a 95% success rate may sound impressive. In management, it sounds dangerous.
Imagine telling a manager:
“This new hire is brilliant 95% of the time. The remaining 5% of the time, he may confidently invent a policy, misread the client, send the wrong file, or produce an answer that looks perfect until it ruins your week.”
That manager does not hear “95% accurate.”
He hears “unemployable unless closely supervised.”
This is not because human beings are perfectly consistent. They are not. Humans go off the rails all the time. They panic, hide mistakes, overpromise, misjudge, flatter power, and occasionally hallucinate entire strategies in conference rooms.
The difference is that human inconsistency is socially legible.
A manager can understand a tired analyst. A manager can understand a junior salesperson trying too hard. A manager can understand someone becoming defensive because their status is threatened.
But AI inconsistency is alien.
A system can draft a beautiful memo at 10:01 and invent a non-existent policy at 10:07 with the same calm, polished confidence.
That breaks the managerial brain.
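The unease about a 95% success rate also has an arithmetic side. The sketch below is only an illustration, and it assumes something the argument above does not claim: that each task fails independently with the same probability. The function name is made up for this example.

```python
def chance_of_at_least_one_error(success_rate: float, tasks: int) -> float:
    """Probability that at least one of `tasks` attempts fails.

    Assumes every attempt is independent and has the same success rate,
    which is a simplification made for this illustration.
    """
    return 1.0 - success_rate ** tasks


if __name__ == "__main__":
    # How quickly a "95% accurate" worker produces at least one bad output.
    for n in (1, 20, 100):
        print(n, round(chance_of_at_least_one_error(0.95, n), 3))
```

Under that assumption, twenty tasks handled at 95% each already carry roughly a 64% chance of at least one bad output, which is closer to what the manager hears than the headline number.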
Corporate hiring is, in many ways, a giant consistency filter. Five years here. Ten years there. References. Past roles. Promotions. Degrees. Performance history. All of these are proxies for one basic question:
Can this person work reliably inside an organization without burning the place down?
AI enters with a different profile.
It can be superhuman in one task and strangely brittle in the adjacent one. It can sound confident without being grounded. It can produce fluency without responsibility.
So evaluation cannot remain a purely technical dashboard problem.
Managers do not only need benchmark scores. They need performance legibility.
Where is this AI strong?
Where is it weak?
When does it overreach?
When should it escalate?
What kinds of errors does it repeatedly make?
Can it improve after correction?
Can it be trusted with more autonomy, or should it remain on probation?
In other words, AI evaluation needs to feel less like academic benchmarking and more like performance management.
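One way to picture performance legibility is a review log kept per kind of task rather than a single overall score. What follows is a minimal sketch, not a description of any real product: the class names, the task labels, and the 5% and 20-attempt thresholds are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TaskRecord:
    """Running tally for one kind of task (hypothetical structure)."""
    attempts: int = 0
    errors: int = 0
    overreaches: int = 0  # acted alone when it should have escalated

    @property
    def error_rate(self) -> float:
        return self.errors / self.attempts if self.attempts else 0.0


@dataclass
class PerformanceReview:
    """Per-task-type history: where the AI is strong and where it is weak."""
    records: dict = field(default_factory=lambda: defaultdict(TaskRecord))

    def log(self, task_type: str, ok: bool, overreached: bool = False) -> None:
        rec = self.records[task_type]
        rec.attempts += 1
        if not ok:
            rec.errors += 1
        if overreached:
            rec.overreaches += 1

    def weak_spots(self, max_error_rate: float = 0.05, min_attempts: int = 20):
        """Task types with enough history and an error rate above the bar."""
        return sorted(
            task for task, rec in self.records.items()
            if rec.attempts >= min_attempts and rec.error_rate > max_error_rate
        )


if __name__ == "__main__":
    review = PerformanceReview()
    for i in range(40):
        review.log("draft_memo", ok=True)
        review.log("quote_policy", ok=(i % 5 != 0))  # fails 8 times out of 40
    print(review.weak_spots())
```

A manager reading this does not see one blended score. He sees that the system drafts memos well and should not yet be trusted to quote policy.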
The real API is usually a human being
Integration debt is usually described as a workflow problem. The AI system is useless if it sits outside the tools people actually use: CRM, ERP, email, Slack, WhatsApp, documents, spreadsheets, calendars, ticketing systems, databases, and whatever other fossilized software stack the company has accumulated over the years.
Again, correct.
But the strange thing is this: companies already have integration layers everywhere.
They are called people.
A human employee reads an email, opens Excel, checks the CRM, messages accounts, interprets a screenshot, calls Rajesh from operations, updates the ERP, forwards the file to legal, and then tells the boss, “This does not match what the client said.”
That is integration.
We just do not call it integration because it comes wrapped in a salaried mammal.
In most companies, the real integration layer is not software. It is some poor bast**d with three browser tabs, two WhatsApp groups, and institutional memory.
This is not an insult. This is how companies actually run.
So when AI struggles with integration, the problem is not simply that the APIs are missing. The deeper problem is that AI has not yet inherited the informal integration powers of human workers.
Humans bridge broken systems all day.
They remember exceptions.
They know that the data in the ERP is technically correct but practically useless.
They know that the “final_final_v7” file is the real one.
They know that procurement says no until someone senior calls.
They know that the SOP is outdated but still politically sacred.
If AI is being sold as a worker, then either it must be deeply integrated into systems, or it must be given human-like tool access and onboarding.
Ideally both.
Otherwise, it remains a clever island.
And clever islands do not transform companies.
The audit trail is where the weekend demo dies
Governance debt is the point at which the fantasy becomes expensive.
A weekend demo can be charmingly vague. A production system cannot.
The moment AI touches money, customers, regulated decisions, safety, compliance, contracts, financial reporting, hiring, insurance, medicine, carbon credits, or anything else with consequence, intelligence is not enough.
The organization needs traceability.
Who instructed the AI?
What data did it use?
Which source did it rely on?
What rule did it follow?
Who approved the action?
What changed after review?
Can the decision be explained six months later to a regulator, auditor, customer, board, or judge?
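The questions above map fairly directly onto fields in a record. This is a hedged sketch of what one audit entry could look like; the field names, the `is_explainable` check, and the idea of writing one JSON line per decision are assumptions made for illustration, not a compliance standard.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    """One AI decision, captured so it can be explained later (hypothetical)."""
    instructed_by: str                         # who instructed the AI
    data_used: tuple                           # what data it used
    sources_cited: tuple                       # which sources it relied on
    rule_applied: str                          # what rule it followed
    approved_by: Optional[str] = None          # who approved the action
    change_after_review: Optional[str] = None  # what changed after review
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def is_explainable(record: AuditRecord) -> bool:
    """True only if the instructing party, source, rule and approver are on file."""
    return bool(
        record.instructed_by
        and record.sources_cited
        and record.rule_applied
        and record.approved_by
    )


def to_log_line(record: AuditRecord) -> str:
    """Serialize the record as one JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    # All values below are invented examples.
    entry = AuditRecord(
        instructed_by="ops-manager",
        data_used=("crm_export.csv",),
        sources_cited=("refund-policy-v3",),
        rule_applied="refunds under the limit are auto-approved",
        approved_by="finance-lead",
    )
    print(is_explainable(entry))
    print(to_log_line(entry))
```

The record is frozen on purpose in this sketch: a footprint that can be quietly edited afterwards is not much of a footprint.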
This is where AI stops being a clever tool and becomes organizational infrastructure.
The audit trail is where the fantasy of the lone AI builder dies.
Because governed AI is not one person plus an API key. It is product, operations, legal, compliance, security, domain expertise, monitoring, and management.
That does not mean AI cannot move fast.
It means the speed has to leave footprints.
The missing debt: organizational domestication debt
The original five debts are useful because they describe what breaks when AI moves from demo to production.
But underneath them is a more primitive question:
Can this strange intelligence be made livable inside an organization?
That is organizational domestication debt.
It is the missing layer between capability and responsibility.
It asks:
How is the AI onboarded?
Who manages it?
How is it corrected?
When does it escalate?
How does it earn trust?
How is autonomy increased?
How are mistakes reviewed?
How do humans know what kind of creature they are dealing with?
We have domesticated humans into corporations.
This sounds ridiculous until one remembers what a corporation is: a machine that takes strange, emotional, ambitious, status-sensitive primates and turns them into quarterly reporting units.
That machine has taken centuries to build.
Now AI is entering the same machine.
But unlike Fred from sales, AI does not arrive with fear, hunger, ambition, shame, gossip, charm, or the deep mammalian need to remain useful to the group.
It arrives with intelligence.
And intelligence is not the same thing as employability.
AI needs management rituals
The next generation of enterprise AI will not be judged only by model quality.
It will be judged by how well it can be managed.
Can it be corrected without a developer?
Can a manager tell it, “Do not do this again,” and have that instruction become part of future behavior in a controlled, auditable way?
Can it say, “This is outside my confidence; escalating to a human”?
Can it show its work without drowning everyone in logs?
Can it be restricted in high-risk situations and trusted in low-risk ones?
Can it move from intern to analyst to operator without pretending it was a vice president on day one?
This is the management layer enterprise AI is missing.
Not just dashboards.
Not just logs.
Not just eval scores.
Management rituals.
The ability to coach, correct, review, restrict, promote, demote, and retire AI from tasks where it keeps failing.
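To make the rituals concrete, here is a minimal sketch of an autonomy ladder for a single task type. Everything in it is an assumption made for illustration: the level names, the promotion and retirement thresholds, and above all the premise that the system exposes a usable confidence score.

```python
from dataclasses import dataclass
from enum import IntEnum


class Autonomy(IntEnum):
    RETIRED = 0   # removed from the task after repeated failures
    INTERN = 1    # every output goes to a human before use
    ANALYST = 2   # acts alone on low-risk work only
    OPERATOR = 3  # acts alone, including high-risk work, when confident


@dataclass
class TaskAssignment:
    """Tracks how much autonomy the AI has earned on one kind of task."""
    task_type: str
    level: Autonomy = Autonomy.INTERN
    clean_streak: int = 0
    strikes: int = 0

    def review(self, ok: bool, promote_after: int = 25,
               retire_after: int = 3) -> Autonomy:
        """Apply one human review: promote, demote, or retire."""
        if ok:
            self.clean_streak += 1
            can_promote = Autonomy.RETIRED < self.level < Autonomy.OPERATOR
            if self.clean_streak >= promote_after and can_promote:
                self.level = Autonomy(self.level + 1)
                self.clean_streak = 0
        else:
            self.clean_streak = 0
            self.strikes += 1
            if self.strikes >= retire_after:
                self.level = Autonomy.RETIRED
            elif self.level > Autonomy.INTERN:
                self.level = Autonomy(self.level - 1)
        return self.level

    def decide(self, confidence: float, high_risk: bool,
               min_confidence: float = 0.8) -> str:
        """Return 'act', 'escalate' (hand to a human), or 'refuse'."""
        if self.level == Autonomy.RETIRED:
            return "refuse"
        if self.level == Autonomy.INTERN or confidence < min_confidence:
            return "escalate"
        if high_risk and self.level < Autonomy.OPERATOR:
            return "escalate"
        return "act"


if __name__ == "__main__":
    job = TaskAssignment("quote_policy")
    for _ in range(25):
        job.review(ok=True)
    print(job.level.name, job.decide(confidence=0.95, high_risk=True))
```

The point of the sketch is that promotion and demotion are driven by reviews a manager can perform, not by a developer editing prompts. Whether 25 clean reviews or three strikes are the right numbers is a judgment each organization would have to make for itself.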
That is what will make AI usable inside companies.
The production gap is not only the distance between demo and deployment.
It is the distance between intelligence and responsibility.
And that distance will not be crossed by models alone.
It will be crossed when AI stops behaving like impressive software and starts becoming something organizations know how to manage.
Fred may still win for now.
Not because he is smarter.
Because he can be taken for coffee.
The article that got me thinking.

