The story I keep hearing
Most CMOs and VPs of Content I talk to are living inside some version of the same story.
They bought the tools. They subscribed the team. They watched a demo, read a colleague's deck, scanned a McKinsey one-pager. The numbers were clean: thirty percent faster, forty percent more output, sixty percent lower cost per piece. They authorized the spend.
Six months later, the line on the P&L is up. The output line on the dashboard is roughly flat. The team is busy. The team is also tired. They can't quite say what they got for the money. The board asks. They answer with anecdotes. They go home and wonder whether this is going to be the budget line that doesn't survive Q1.
This is the most common arc in AI content rollouts I'm seeing right now. And almost everyone who lives through it lands on the same conclusion: we bought the wrong tools.
I don't think that conclusion holds up. I want to spend the rest of this piece telling you why, and what I think the actual story is.
So why doesn't the spend land?
The tools are mostly fine. The problem is structural, and it shows up at almost every team that runs through this arc.
A content team can sit at five different levels of AI maturity. Only one of those levels can actually metabolize the kind of investment a serious tooling spend assumes. When a team at Level 2 spends against Level 4 tooling, the money doesn't land. It can't land. The team doesn't yet have the standards, the workflow integration, or the measurement to absorb the capability it just paid for. The tools work fine. The operation around them isn't ready.
That's the gap. Not the tools, not the team, not the budget. The mismatch between where the team actually is and where the spend assumes it is.
And honestly, when I look at how this category sells right now, I don't blame anyone for falling into the gap. The decks all show Level 5. The implication is that the right tooling gets you there. The intermediate stages, where the real work lives, never make it onto the slide. Most CMOs I talk to bought rationally based on the information they had. The information was thin.
Once you see this pattern, the dead-end AI rollout stories all start to look the same. The team bought a workflow tool before they had a workflow. They bought a measurement platform before they had a standard worth measuring against. They invested in enablement before they had anything to enable people to do. The category sells solutions to problems that only arrive two levels past where the customer currently sits.
The five levels
Here they are, in order. Most readers will land one level lower than they expect, and that's probably the most useful thing this article can do for you.
Level 1: Ad hoc
Individuals on your team are using AI on their own initiative. There's no team-wide standard for what to use, when, or how. The intern is leaning in. The senior writer is refusing. Nobody is wrong, because there isn't a right answer yet.
What breaks. You can't answer basic questions about your own production: what's AI-assisted, what isn't, what's allowed. Brand-voice drift is happening, and you can't see it. The cost stays invisible until something embarrassing ships, or the board asks for evidence the tool spend is paying back, and the honest answer is that nobody knows.
Level 2: Stated intent
Leadership has declared AI is part of the strategy. There's a Slack channel. Maybe a one-page policy. Adoption is uneven. Some people use the tools, some quietly don't. Tools were chosen reactively, based on whatever was trending the month somebody had budget.
What breaks. You tell the board you're "using AI" and the claim doesn't survive five minutes of scrutiny. Subscriptions accumulate without being cut. Output volume hasn't meaningfully moved. The team that was supposed to get faster is now also responsible for a tooling sprawl that nobody owns.
Level 3: Standardized tools, unintegrated workflow
The team has agreed on the stack. The style guide acknowledges AI. But the tools sit beside the production workflow. Writers paste in and out manually, and there's no real measurement of contribution. AI is a feature on the side of the desk, not part of how work moves through the building.
What breaks. You've hit a throughput ceiling that more tools won't break. Quality variance is still high, because the tools function as checkboxes rather than capacity. You spent the money, you trained the team. You can defend the investment. You can't yet show what it bought.
Level 4: Integrated workflow, distributed enablement
AI is embedded in production stages: research, drafting, editing, distribution. Editorial standards explicitly cover the human and AI division of labor. The team is trained. There's a recognized AI lead. Some measurement is happening, mostly ad hoc. The work has changed shape, and the team has changed with it.
What breaks. Tooling is now tangled enough that refactoring feels expensive. Institutional knowledge lives in two people's heads. If they leave, you regress to Level 2 in a quarter. The risk at Level 4 isn't stagnation. It's brittleness.
Level 5: Operating discipline
AI integration is a managed function, not a project. Stages are instrumented for cycle time, quality scoring, and attribution. Tooling gets re-evaluated on a known cadence. Enablement is continuous, not event-driven. The team is producing at a level that would have required twice the headcount three years ago, and it's defensible.
What breaks. Complacency. The risk at Level 5 isn't underutilization. It's assuming you've solved a moving target. The category changes every quarter. The team that stops re-evaluating is six months from being passed by the team that doesn't.
How to find your level
The level you land on is determined by six stages, evaluated together. Each stage is a different lens on the same question: is this team's AI use actually part of how the team works, or is it adjacent to how the team works?
- Audit. Visibility into current AI use. Can you actually say what your team is doing with AI, or are you guessing?
- Standards. The editorial and brand quality bar. Does it exist on paper? Does it mention AI? Is it enforced in the review process?
- Workflow. Where AI sits inside production. Beside the work, inside specific steps, or integrated end to end with explicit human checkpoints?
- Tooling. Stack coherence and discipline. Is the stack consolidated, sanctioned, and re-evaluated on a cadence, or is it whatever individuals signed up for?
- Enablement. Team capability and onboarding. Is AI capability concentrated in one or two people, or is it a function the team operates?
- Operate. Measurement and feedback loops. Can you answer "did AI integration pay off last quarter?" with anything more substantive than anecdotes?
Each stage gets its own score, on the same 1-to-5 scale as the levels. The overall level is the average across the six. Which means the level is the headline, but the lowest-scoring stage is where the real diagnosis lives.
A team scoring Level 3 overall with a Level 1 score on Standards isn't really a Level 3 team. It's a Level 1 team with a Level 5 tooling spend covering the gap. The level is misleading. The stages tell the truth.
This is the part of the diagnosis most CMOs skip, and it's the part that determines what to do next.
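If it helps to see the arithmetic, here is a minimal sketch of the scoring described above. It is an illustration, not the assessment itself. It assumes each stage is scored as a whole number from 1 to 5 and that the overall level is a plain, unweighted average. The function name `diagnose`, the result keys, and the tie-breaking rule are all hypothetical.

```python
# Hypothetical scoring sketch. The 1-5 scale, the names, and the
# unweighted average are assumptions made for illustration.
STAGES = ("Audit", "Standards", "Workflow", "Tooling", "Enablement", "Operate")


def diagnose(scores):
    """Return the overall level (plain average) and the lowest-scoring stage."""
    missing = [stage for stage in STAGES if stage not in scores]
    if missing:
        raise ValueError(f"missing stage scores: {missing}")
    for stage in STAGES:
        if not 1 <= scores[stage] <= 5:
            raise ValueError(f"{stage} must be scored 1-5, got {scores[stage]}")

    # The headline number: the average across the six stages.
    overall = sum(scores[stage] for stage in STAGES) / len(STAGES)
    # The diagnosis: the weakest stage. Ties go to whichever stage
    # comes first in STAGES, which is an arbitrary choice.
    lowest = min(STAGES, key=lambda stage: scores[stage])
    return {
        "overall_level": round(overall, 1),
        "lowest_stage": lowest,
        "lowest_score": scores[lowest],
    }


if __name__ == "__main__":
    # The team from the example: strong tooling spend, weak Standards.
    team = {
        "Audit": 3,
        "Standards": 1,
        "Workflow": 3,
        "Tooling": 5,
        "Enablement": 3,
        "Operate": 3,
    }
    print(diagnose(team))
```

With these made-up scores the average comes out at 3.0, which is exactly the problem: the Level 5 on Tooling hides the Level 1 on Standards. Reporting the lowest stage next to the headline keeps that from getting lost.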
What to do with this
Three moves, in order. Resist the temptation to skip ahead to the third.
1. Find your level honestly. Not the level you want to be at, not the level your board deck implies. The level your team would land on if a stranger scored you against the six stages. If you're unsure, run the assessment. If you're sure, ask a peer to score you anyway. The most common error in this work is a half-level of self-grade inflation, and it's not vanity. It's that you're scoring against the level you've worked hard to get to, not the level you've actually arrived at. Easy mistake to make.
2. Find your lowest-scoring stage and start there. The temptation at every level is to invest in the next level's headline capability. Don't. The thing that gates your progress is the stage you're worst at, not the level you'd like to claim. A team at Level 2 with weak Standards doesn't need better tools. It needs an editorial bar that AI output has to clear. A team at Level 3 with weak Operate doesn't need more integration. It needs to measure what the integration is actually producing.
3. Move one level at a time. Each transition is real work. Months, not weeks. Teams that try to jump from Level 2 to Level 4 in a single budget cycle either regress to Level 2 within the year, or stall at Level 3 with a Level 4 invoice. The work between levels is sequential because the capabilities are dependent on each other. You can't integrate AI into a workflow you haven't standardized. You can't measure quality against a bar you haven't written down.
What I think the category has wrong
For three years now, the AI content conversation has been about the destination. The decks all show Level 5: the instrumented, integrated, defensible operation. The implication is that any team that buys the right tools can be there next quarter.
The work is at the intermediate stages. It's the unglamorous part. Writing down the editorial bar. Agreeing on the stack. Designing the workflow with explicit human checkpoints. Building the measurement layer so the next investment can actually be evaluated. Every team I've seen end up at Level 5 got there one level at a time. Every team I've seen buy the Level 5 toolset before doing the Level 3 work is still at Level 2, with a bigger invoice.
The good news is that the ladder rewards as reliably as it punishes. The teams that take the levels seriously, that diagnose honestly, that work the lowest stage first, that move one level at a time, are pulling ahead of the teams that don't. And they're pulling ahead faster than the tooling alone would predict.
If you're somewhere on this ladder and you don't love where you are, you're in good company. Almost every team I work with starts the same way. The path forward exists, and it's more knowable than the category has made it sound. That's the entire reason I'm writing this series. Find your level. Find your lowest stage. Take the next step. We can talk about the one after that when you're ready.
This is the first piece in an ongoing series. Subsequent pieces go deeper into each level (what it looks like from the inside, the specific failure modes, the move to the next level) and into each of the six stages, where the actual diagnostic value lives.