Token Monsters: The New Economic Playbook for AI Vendors
Anthropic announced its new model, Mythos, implying its capabilities are so extraordinary that the company will not release it. The stated justification was that Mythos had discovered vulnerabilities in FreeBSD that had existed for more than 27 years. However, newer analysis suggests this achievement is far less unique than advertised: even very small models appear capable of the same feat, and in one test, 8 out of 8 models succeeded.
That makes Anthropic’s claims of “superpowers” look much more like overhyped marketing than a genuine breakthrough. The obvious question, then, is: why make that claim?
The most plausible answer is timing. Alongside this narrative, Anthropic also launched Project Glasswing, a cybersecurity service. Taken together, both the tests and Anthropic’s product move suggest that the real moat is not the model itself, but the wrapper around it: the tooling, orchestration, deployment, and service layer.
This also helps explain the broader business logic. A large share of Anthropic’s revenue, like that of most LLM providers, comes from code generation - especially agentic code generation - which consumes a huge number of tokens. As these systems become more autonomous and run nearly nonstop, they burn through tokens at unprecedented speed, to the point that providers have had to introduce rate limits.
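The economics here are easy to sketch with back-of-envelope arithmetic. The figures below are illustrative assumptions, not vendor pricing, but they show why an always-on agentic loop dwarfs interactive chat in token consumption:

```python
# Back-of-envelope sketch of why always-on agents are "token monsters".
# All figures are illustrative assumptions, not real vendor pricing.

def daily_token_burn(events_per_hour: int,
                     tokens_per_event: int,
                     hours_active: float = 24.0) -> int:
    """Tokens consumed per day by an agent that reacts to every event."""
    return int(events_per_hour * tokens_per_event * hours_active)

def daily_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost at an assumed blended price per million tokens."""
    return tokens / 1_000_000 * usd_per_million_tokens

# An interactive chat user: ~10 exchanges/hour, 8 hours, ~2k tokens each.
chat = daily_token_burn(events_per_hour=10, tokens_per_event=2_000,
                        hours_active=8)

# An agentic coding loop: ~60 tool calls/hour, around the clock,
# ~8k tokens per call (context plus generated code).
agent = daily_token_burn(events_per_hour=60, tokens_per_event=8_000)

print(chat, agent, agent / chat)  # the agent burns ~72x more per day
```

Even with conservative assumptions, the continuous agent consumes tokens at a rate no human-paced chat session approaches, which is exactly the profile a revenue-maximising provider wants.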
Cybersecurity has a very similar economic profile. Monitoring must be continuous, remediation must be continuous, and maintaining defences requires ongoing inference. In other words, it is another domain where AI systems can run persistently and consume tokens at scale.
Interestingly, the recent rise of tools like OpenClaw, Hermes, and other desktop-hosted, often insecure containerised agents serving as personal assistants reflects the same pattern. These continuously running services consume large numbers of tokens, and even ordinary consumers—at least those on the bleeding edge—appear willing to spend hundreds, sometimes even thousands, of dollars to automate parts of their lives. It is therefore no surprise that figures like Jensen Huang describe this as the future of computing: once again, the model is autonomous, persistent, and relentlessly token-intensive.
What we are seeing, then, is a shift in focus among LLM providers: away from selling raw intelligence and toward selling applications that run continuously, are easy to automate, and generate sustained token consumption. From a revenue perspective, that is the most direct path to growth.
That may well become the playbook of the future for LLM providers.
But the more interesting question is whether it will remain their playbook alone.
My view is that it will not. If multiple models can deliver similar results, then this is not fundamentally a model problem; it is a harness problem. And harnesses are built in a much more democratised layer of the stack.
So I would not expect this strategy to benefit only the model providers. It is just as likely to benefit the companies building on top of them.
The stronger constraint, however, is not merely that the workflow is persistent, but that it is fed by dense, machine-generated event streams. That narrows the field considerably.
Which application spaces really fit the shape of the token monsters?
If this is the new economic playbook, then the qualifying application spaces are narrower than they first appear. It is not enough for a workflow to be useful, repetitive, or even highly automatable. To become a true token monster, it must be driven by a very large volume of upstream events. The ideal domain is one in which data arrives continuously, often automatically, and where each new event can trigger interpretation, prioritisation, correlation, and action.
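A minimal sketch makes the mechanism concrete. Here `model_call` is a hypothetical stand-in for any LLM API; the point is that each upstream event, not a human question, triggers an inference step:

```python
# Minimal sketch of an event-driven "token monster" loop.
# `model_call` is a hypothetical stand-in for a real LLM API.
import queue

def model_call(prompt: str) -> str:
    # Placeholder for a real inference call; returns a canned decision.
    return "escalate" if "error" in prompt else "ignore"

def run_agent(events: "queue.Queue[str]") -> list:
    """Drain the event stream; each event triggers one inference step."""
    actions = []
    while not events.empty():
        event = events.get()
        # Every upstream event becomes a prompt: this is where event
        # density translates directly into token consumption.
        actions.append(model_call(f"Triage this event: {event}"))
    return actions

stream = queue.Queue()
for e in ["build passed", "error: test_auth failed", "dependency alert"]:
    stream.put(e)

print(run_agent(stream))  # one inference per event
```

The loop never needs a user to prompt it; as long as the queue keeps filling, inference keeps running, which is precisely the property that makes these domains token-intensive.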
That is why code generation fits so well. Modern development environments already produce dense streams of machine-readable signals: code changes, test results, compiler errors, logs, CI events, dependency alerts, runtime traces, security scans, and infrastructure telemetry. Once an agent is allowed to sit inside that loop, it is no longer answering isolated prompts. It is participating in an event-rich system that can keep it running almost indefinitely.
Cybersecurity is perhaps an even purer example. Networks, endpoints, identity systems, cloud environments, and security tools generate vast quantities of alerts, logs, anomalies, and behavioural signals. Most of those events do not matter on their own. Their value emerges only through continuous correlation and triage. This makes cybersecurity an almost perfect token monster domain: high event throughput, constant monitoring, and a near-bottomless requirement for ongoing inference.
Financial trading and market intelligence belong in the same category. Prices, filings, news flows, order books, analyst notes, macro releases, social signals, and alternative datasets all arrive as machine-fed streams. The work is not simply to answer a question about them, but to maintain a live model of changing conditions. In such environments, inference becomes continuous because the world itself is updating continuously.
Large-scale health monitoring may prove to be another major case. Once systems are connected to wearables, remote sensors, medical devices, or population-level screening infrastructure, they begin to ingest high-frequency physiological and behavioural signals at scale. The opportunity is not merely diagnostic chat, but continuous interpretation across millions of small events: deviations, trends, anomalies, risk scores, and escalation decisions. Applied across a sufficiently large population, that too becomes a token-intensive operating model.
Industrial operations and machine fleets also fit the pattern. Factories, vehicles, robots, energy systems, and logistics networks all produce telemetry streams that can be monitored, interpreted, and acted upon in real time. Predictive maintenance, fault detection, routing optimisation, and operational orchestration are not one-off use cases. They are ongoing inferential workloads attached to environments that never stop producing data.
What links these domains is not simply automation, but event density.
The best token monster markets are those where software does not have to wait for a human to ask the next question. The next question is generated by the system itself, because the system is continuously emitting new facts that must be processed. That is when AI stops behaving like a tool and starts behaving like infrastructure.
And that may be the most important commercial distinction of all. The winners will not just be those with capable models, but those attached to the thickest, fastest, and most indispensable streams of machine-generated events.