With much of the AI industry still operating in a venture-funded growth phase, billions of dollars are being invested in companies like OpenAI and Anthropic. But is this capital distorting pricing? And what are the implications (and mitigations), should this prove to be the case? Read on to find out more…
We’ve been here before…
Remember when Uber first launched? Cheap rides, a beautiful app - everyone wanted in and it appeared that the traditional taxi was out.
Pricing was cheap because the service was heavily subsidised, and the subsidy existed only because the goal was accelerated adoption in pursuit of market dominance. Once that dominance was secured, prices rose: the focus shifted from growth to profit.
Now cast your mind to Deliveroo, DoorDash, Netflix, Lime… even cloud infrastructure like AWS too. Whatever the intricacies, the pattern is consistent: high fixed costs and heavy VC burn fuel artificially low user pricing that secures rapid adoption and behavioural lock-in - until the subsidy phase ends, and the real-world economics return.
Are we in AI’s subsidy phase?
Frontier AI providers such as OpenAI and Anthropic operate in an environment defined by extraordinary GPU demand, massive capital expenditure, intense competition for talent and the pressure to scale quickly. And for the past few years, artificial intelligence has felt impossibly cheap - as tech giants subsidise the cost of intelligence to win the ecosystem war.
But current API pricing for frontier models (GPT-4o, Claude 3.5, Gemini 1.5) does not reflect the massive capital expenditure required to build and power them, and today's token pricing may not reflect long-term equilibrium costs.
“For every $1 hyperscalers earn from AI, they’re spending $12 building more capacity. That’s the bet embedded in $575 billion of capital expenditure this year.”
Tomasz Tunguz, Venture Capitalist at Theory Ventures
We're operating in an artificial economic bubble where, just like with Uber rides, 'reasoning' is being sold below cost. If inference pricing were to increase two to three times over the next several years (whether due to profitability pressure, compute constraints or market consolidation), the implications would be significant.
And with many commentators now questioning whether the capital invested can ever generate a return (Tomasz Tunguz, for example, estimates that 5 to 8x growth is required within five years¹), if your business model relies on $0.01-per-token reasoning to be profitable, there is certainly reason to be concerned.
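The sensitivity to a price rise is easy to quantify. The sketch below uses purely illustrative figures (the token volume and per-million-token rates are assumptions, not real provider pricing) to show how a 3x increase flows straight through to the monthly bill:

```python
# Hypothetical cost-sensitivity sketch. All figures are illustrative
# assumptions, not real provider pricing.
def monthly_inference_cost(tokens_per_month: int,
                           price_per_million_tokens: float) -> float:
    """Dollar cost for a month's token volume at a given rate."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens

VOLUME = 500_000_000  # assumed: 500M tokens/month

baseline = monthly_inference_cost(VOLUME, 10.0)    # $10 per 1M tokens
after_rise = monthly_inference_cost(VOLUME, 30.0)  # a 3x price rise

print(f"baseline: ${baseline:,.0f}/month")   # $5,000/month
print(f"after 3x: ${after_rise:,.0f}/month") # $15,000/month
```

A margin built on the first number can be erased entirely by the second, which is the core of the concern.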
Could prices actually fall?
Of course, it is possible that AI becomes cheaper rather than more expensive.
Economies of scale, hardware improvements and competition could all potentially push costs down over time, but maybe whether prices rise or fall isn’t really the question. Perhaps instead we should be asking whether today’s pricing reflects a sustainable set of economics that can be relied upon to make decisions that last the long term.
Why the concern?
Startups are replacing engineering teams with prompts; enterprises are shrinking development budgets; solo founders are shipping products that once required entire departments. Is this sustainable?
The temptation to replace engineering teams with simple prompt interfaces is a bet on the permanent stability of today's subsidy-phase pricing. But if the market economics fundamentally reorganise, what happens then?
Left unchecked, if AI pricing later normalises upward, the outcome is predictable: fewer experienced developers in the market, a weakened junior talent pipeline, the loss of institutional knowledge and - ultimately - higher replacement costs.
And while taxis still existed after Uber prices rose, an entire market reorganising itself around temporary economics erodes human capability in a way that could take years to rebuild.
So what if we are in AI's subsidy phase?
If AI pricing does change, the organisations most exposed will be the ones that removed their technical depth and/or built their entire operating model around permanently cheap inference. There are, however, several ways to reduce the risk:
1. Maintain human capability alongside AI
AI can accelerate development dramatically, but replacing entire teams assumes that today’s pricing remains stable. Instead:
- Treat human engineers as a critical architectural layer that ensures security, scalability and long-term maintainability.
- Avoid total reliance on AI for entry-level tasks; instead, maintain a hybrid model of engineers and AI tooling to create resilience.
- Leverage elastic expertise with flexible partners like Deazy who help you scale velocity without hollowing out institutional intelligence.
2. Avoid dependence on a single AI provider
Many products today are tightly coupled to one provider, and that creates pricing risk. Instead (and similar to multi-cloud strategies in infrastructure):
- Prioritise architectures that allow switching between providers to reduce this exposure.
- Employ abstraction layers and routing systems to make it possible to shift workloads based on cost, performance or availability.
- Specialise models for specific tasks, breaking dependency on a single provider's ecosystem and ensuring you can swap components if pricing shifts.
3. Optimise AI usage and inference efficiency
Many AI systems are far less efficient than they need to be, with large prompts, repeated calls and poorly designed workflows multiplying costs dramatically. Efficiency compounds quickly at scale, so remember:
- Companies that treat tokens as a scarce resource build much more efficient systems.
- Reducing prompt size, caching responses, batching requests and using retrieval rather than large context windows can significantly reduce inference costs.
- Agentic workflows break complex problems into smaller, verifiable steps, facilitating 'early exit' strategies where the process can be completed by a simple logic gate or cheaper model - before it ever reaches a high-cost frontier model.
4. Where possible, use smaller or specialised models
Frontier models are powerful but expensive, and many tasks do not require them. Instead…
- Classification, extraction, summarisation and structured workflows can often be handled by smaller models at a fraction of the cost.
- Fine-tuning smaller open-source models like Llama 3 or Mistral can deliver a model that is faster, more accurate and significantly cheaper to run.
- A tiered model strategy that categorises tasks by reasoning depth often achieves significant savings. Think small models for routine 'utility' tasks, mid-tier models for reasoning workflows and frontier models only when necessary.
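In practice a tiered strategy can start as a simple lookup from task category to model tier. The tier names, model identifiers and prices below are all illustrative assumptions:

```python
# Tiered-model sketch: map task categories to model tiers. Tier names,
# model identifiers and prices are illustrative assumptions.
TIERS = {
    "utility":   {"model": "small-8b",    "usd_per_m_tokens": 0.5},
    "reasoning": {"model": "mid-70b",     "usd_per_m_tokens": 5.0},
    "frontier":  {"model": "frontier-xl", "usd_per_m_tokens": 30.0},
}

TASK_TO_TIER = {
    "classification":        "utility",
    "extraction":            "utility",
    "summarisation":         "utility",
    "multi_step_reasoning":  "reasoning",
    "novel_code_generation": "frontier",
}

def pick_model(task: str) -> str:
    """Choose a model for the task, defaulting to the safest tier."""
    tier = TASK_TO_TIER.get(task, "frontier")
    return TIERS[tier]["model"]

print(pick_model("classification"))  # small-8b
print(pick_model("unknown_task"))    # frontier-xl (safe default)
```

Defaulting unknown tasks to the frontier tier keeps quality safe while the routine categories - usually the bulk of the volume - run at a fraction of the cost.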
5. Shift to owned intelligence over time
While the industry fixates on frontier wars, the most resilient organisations are decoupling their core logic from third-party APIs for strategic infrastructure independence. Remember: control over compute can translate into control over cost.
- 80% of enterprise workloads no longer need a massive, general-purpose model but a specialised, right-sized one.
- Self-hosting smaller models can reduce dependence on external inference pricing, converting unpredictable opex into a stable, manageable utility.
- Moving data to your own model is often safer and faster than moving it to an external API, drastically reducing your compliance and security surface area.
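Whether self-hosting pays off is ultimately a break-even calculation. The sketch below uses purely illustrative figures (the API rate, fixed hosting cost and marginal hosted cost are all assumptions) to find the monthly token volume above which self-hosting wins:

```python
# Back-of-envelope break-even sketch comparing API inference against
# self-hosting a smaller model. All figures are illustrative assumptions.
def breakeven_tokens_per_month(api_usd_per_m: float,
                               hosting_usd_per_month: float,
                               hosted_usd_per_m: float) -> float:
    """Monthly token volume above which self-hosting is cheaper."""
    saving_per_m = api_usd_per_m - hosted_usd_per_m
    if saving_per_m <= 0:
        return float("inf")  # hosting never pays off at these rates
    return hosting_usd_per_month / saving_per_m * 1_000_000

# Assumed: $10/M via API, $2,000/month fixed GPU cost, $1/M marginal.
volume = breakeven_tokens_per_month(10.0, 2000.0, 1.0)
print(f"break-even at ~{volume / 1e6:.0f}M tokens/month")
```

Below the break-even volume the API is the rational choice; above it, the fixed hosting cost amortises away and each additional token gets cheaper - the 'stable, manageable utility' described above.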
Over to you…
Without a doubt, the ride is currently cheap and the convenience is undeniable. But as AI prepares to turn a profit, the question isn't really whether the price will go up; it's whether you've built an engineering engine efficient enough to keep driving when it does.
History shows that once market dominance is established and the cheap capital era ends, prices inevitably normalise. Which, for those organisations building products purely on top of these APIs, represents a massive unhedged systemic risk.
At Deazy, we believe adaptation (and success) lies in a shift in mindset, and that the winners over the next five years won’t be the companies that prompted their way to a product, but the ones that used this subsidised period to build durable AI engineering foundations within a sophisticated, human-led architectural stack.
So…
- Stop treating AI as a black-box utility and start treating it as a component of a sophisticated, multi-layered architecture.
- Protect your most valuable asset: your human engineering talent (they're the ones who will re-wire your systems when the APIs change or the prices spike).
- Own your intelligence by moving toward specialised, fine-tuned and open-source models that you control, rather than remaining a tenant on someone else's platform.
Anything else? We’d love to hear from you.
About Deazy
Deazy enables ambitious organisations to explore and harness AI to drive digital product innovation and operational efficiency, applying our award-winning AI and software delivery expertise to solve complex challenges, accelerate innovation and build resilient digital platforms that scale.
With a uniquely flexible delivery model, we provide rapid access to a diverse pool of 6,000+ experienced nearshore AI, software, and data professionals, managed by highly-experienced and multidisciplinary in-house product and delivery experts who provide the support and resources to guarantee success.
For support with your AI development challenges, or if you’d like to explore where you are across the five production gates, drop us a line at hello@deazy.com. We’d love to chat.
1 Tomasz Tunguz, Venture Capitalist at Theory Ventures