Part 3 of 3: How to test, measure, and improve AI search visibility when the dashboards don’t exist yet, and how to spot the hype without substance.
In Part 1, I argued you should pivot your SEO program into AEO rather than abandon it. In Part 2, I walked through the workflow: intent groups, prompt generation, tracking, and the tactics that serve SEO and AEO simultaneously. If you’ve done both, you now have a baseline, a set of tracked prompts, a set of tracked keywords, and a portfolio of optimized pages.
Now comes the hard part: figuring out what’s actually working.
This is where most AEO strategies fall apart. The landscape is evolving in real time. There is no ChatGPT Analytics. There is no “Prompt Planner.” The major LLMs are, if anything, more of a black box than search engines ever were. And the marketing industry has already produced a healthy population of gurus happy to sell you the magic formula.
The only way through is discipline…and the scientific method (see also, discipline).
This has happened before
It’s worth remembering that SEO was once exactly this kind of unknowable black box. Ranking factors were rumored, not documented. Algorithms updated without warning. Careers were built on guessing correctly about what Google wanted. The industry eventually matured because a generation of practitioners did controlled experiments, shared data, and refused to accept unfalsifiable claims.
AEO is going to go through the same cycle. The teams that build a durable advantage are the ones that stay disciplined now, while everyone else chases clickbait and headlines. The gold rush is not about who writes the most content or builds the most backlinks. It’s about who figures out what actually moves the needle, before anyone else does.
That’s a measurement problem. Here’s how we approach it.
Step 1: Identify your metrics and set a baseline
If you followed Part 2, this is done. Your SEO keywords are in an SEO tracking tool. Your prompts, grouped by intent, are in an AEO tracking tool. Visibility is being recorded on a schedule (weekly is fine).
That’s your baseline. Everything you do from here gets measured against it.
It’s worth being explicit about what “visibility” means in the AEO context. An AEO tracking tool runs your prompts against the major AI search tools on a schedule and records whether your brand is cited, mentioned, linked, or otherwise surfaced in each response. It’s imperfect: real users phrase the same question in different ways, and an LLM can give different answers to the same prompt. That’s why I recommend you track three prompts per intent group: averaging across several phrasings dampens the noise. It’s the best measurement available right now.
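To make the baseline concrete, here is a minimal sketch of how a visibility score per intent group could be computed from raw tracking records. The record layout, the group names, and the function name `visibility_by_group` are all assumptions made for illustration. Your tracking tool will export something different, and most tools calculate this for you.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical export: (intent_group, prompt_id, week, brand_surfaced).
# Three prompts per intent group, one run per prompt in week 1.
observations = [
    ("pricing", "p1", 1, True),
    ("pricing", "p2", 1, False),
    ("pricing", "p3", 1, True),
    ("comparison", "c1", 1, False),
    ("comparison", "c2", 1, False),
    ("comparison", "c3", 1, True),
]


def visibility_by_group(records):
    """Return {intent_group: share of tracked responses that surfaced the brand}."""
    hits = defaultdict(list)
    for group, _prompt_id, _week, surfaced in records:
        hits[group].append(1.0 if surfaced else 0.0)
    # Averaging across prompts (and runs) is what dampens single-response noise.
    return {group: mean(values) for group, values in hits.items()}


baseline = visibility_by_group(observations)
for group, share in sorted(baseline.items()):
    print(f"{group}: {share:.0%} of tracked responses surfaced the brand")
```

A single response is a coin flip you can’t read much into. The per-group average, tracked week over week, is the number to watch.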
Step 2: Lock your variables
This is the step that separates useful experiments from wasted effort. When you’re operating in an environment where the algorithms change without warning, where the same prompt can produce different answers hour to hour, and where you don’t have direct attribution data, the only way to learn anything is to change one thing at a time.
The most common mistake I see is teams doing ten things at once, watching the numbers, and claiming a win when something moves. That’s not an experiment. That’s a soup. There is no way to know which of the ten things caused the effect, or whether the effect would have happened anyway because of an algorithm update.
Good experimentation looks like this:
- A written hypothesis. “We believe adding FAQ schema to pages will improve AEO visibility within 30 days.”
- A specific change, applied to a defined set of pages.
- A control group: pages where the change was not made.
- A defined measurement window.
- A willingness to report that it didn’t work, if it didn’t.
If you wrote ten pieces of content for ten different intent groups, you should be able to tell me, specifically, whether SEO visibility moved for those intent groups and whether AEO visibility moved for them. And you should be able to compare that against intent groups where you did nothing, to rule out a tailwind from an algorithm update.
Complexity is the enemy here. The more variables in play, the less you’ll learn.
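One simple way to run the comparison against untouched intent groups is a difference-in-differences calculation: subtract the change in the control groups from the change in the groups you worked on. This is a sketch of that standard technique, not a description of any particular tool. The function name and the visibility figures are invented for illustration.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Estimate lift as (change in treated groups) minus (change in control groups).

    Each argument is a list of visibility shares (0.0 to 1.0),
    one value per intent group.
    """
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    # Whatever moved the control groups (for example, an algorithm update)
    # is assumed to have moved the treated groups by the same amount.
    return treated_change - control_change


# Illustrative numbers only: three intent groups got new content, two did not.
lift = diff_in_diff(
    treated_before=[0.20, 0.30, 0.25],
    treated_after=[0.40, 0.50, 0.45],
    control_before=[0.20, 0.30],
    control_after=[0.30, 0.40],
)
print(f"Estimated lift attributable to the change: {lift:+.2f}")
```

In this made-up example the treated groups rose by about 0.20, but the control groups rose by about 0.10 on their own, so only about 0.10 can be credited to the change. The estimate is only as good as the assumption that both sets of groups would have moved together without your intervention.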
Step 3: Trust only the data
This is both a measurement principle and a business principle.
The industry is going to get flooded (it already is) with confident claims about how to win AI search. Flashy headlines. Hyperbolic language. “AEO secrets.” “The one thing ChatGPT looks for.” Most of it won’t survive contact with a controlled experiment, because most of it wasn’t derived from one.
When you’re evaluating an agency, a consultant, or a tool, the question to ask is: Can you show me the controlled experiment? Not the case study where “we did a bunch of stuff and the numbers went up.” That’s not evidence. That’s a coincidence with a narrative wrapped around it. A real experiment isolates the change and reports the lift attributable to it.
The corollary is: don’t trust your own results either, if they don’t come from a locked experiment. It’s very easy, especially in a fast-moving channel, to convince yourself a tactic is working because you want it to be working. The data you can defend is the data you earned through isolation and controls.
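One way to keep yourself honest is to ask how often a gap this large would show up if the change had done nothing at all. The sketch below uses a permutation test, a standard statistical technique that I am adding here as an illustration. The function name, the data, and the number of shuffles are assumptions, and with only a handful of intent groups the result will be coarse.

```python
import random
from statistics import mean


def permutation_p_value(treated_changes, control_changes, n_shuffles=10_000, seed=0):
    """Share of random relabelings whose gap is at least as large as the observed one.

    Each input is a list of per-intent-group visibility changes (after minus before).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(treated_changes) - mean(control_changes)
    pooled = list(treated_changes) + list(control_changes)
    n_treated = len(treated_changes)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the treated/control labels were arbitrary
        gap = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if gap >= observed:
            at_least_as_large += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (n_shuffles + 1)


# Illustrative numbers only: five treated groups, five control groups.
p = permutation_p_value(
    treated_changes=[0.20, 0.25, 0.30, 0.22, 0.28],
    control_changes=[0.01, -0.02, 0.03, 0.00, 0.02],
)
print(f"Share of shuffles matching or beating the observed gap: {p:.4f}")
```

A small value means random relabeling rarely reproduces your result, which is some evidence the change mattered. A large value means the “win” looks a lot like noise, however good the story around it sounds.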
Don’t let the fear win
There’s a particular flavor of sales tactic that shows up any time a new channel emerges. It’s fear-based. You’re falling behind. AI is taking over. Your competitors are already there. You need to act now or you’ll be invisible forever. This is marketing, not reality.
Every major technology goes through adoption cycles. Every one of them takes time. AI search is going to reshape how people find information, but not instantly and not uniformly. You have time to be diligent. You do not have to be desperate. Those are different things.
Move quickly and move deliberately. Don’t panic-buy tactics because someone on LinkedIn told you the sky is falling.
Trust your gut
The last principle may be the most important, and it’s not technical. It’s about how you evaluate the people you work with.
This stuff is complex, but not that complex. It’s explainable. Any agency, consultant, or tool vendor worth working with should be able to walk you through their methodology in language you actually understand. You are the customer. The burden of clarity is on them, not on you.
If someone is explaining their AEO approach and it doesn’t click, keep asking questions. Ask why. Ask what the hypothesis is. Ask what the control group looks like. Ask how they’d know if it didn’t work.
If it still doesn’t click after honest effort, the issue probably isn’t that you’re missing something. The issue is probably that there’s nothing there. Smoke and mirrors. A lot of bad actors in emerging fields rely on intimidation as part of the sales pitch: make it sound complicated enough that the customer feels stupid for asking, and they’ll stop asking. Don’t let that happen. If it doesn’t make sense in your head, it doesn’t make sense. Digital marketing is still a service, and good service makes the customer feel comfortable.
The mindset
AEO is new. The rules are unwritten. The teams that will win are the ones who treat this as a scientific problem rather than a marketing problem. Set baselines. Isolate variables. Run hypothesis-driven experiments. Trust the data over the narrative. Be patient enough to wait for real signal, and skeptical enough to dismiss magic formulas.
And always remember that you’re able to do any of this because you didn’t abandon your SEO program when everyone else was panicking. You pivoted it. That foundation is what makes the experimentation possible in the first place.
This is Part 3 of a three-part series on pivoting SEO into AEO. Part 1 made the case for why the pivot matters. Part 2 covered the intent-group methodology and the tactics we deploy. Part 3, this piece, is the measurement discipline that makes the whole thing work.