AEO Experiment Design

"Most best practices, most blog posts are not correct. So how do you set up an experiment? You get your questions, you turn tracking on, give it a couple of weeks. Make your changes, have a test group, have a control group. Intervene on the test group, make your changes, see if the chart went up." - Ethan Smith

What It Is

AEO Experiment Design is a rigorous methodology for testing whether Answer Engine Optimization tactics actually work. Because most SEO and AEO advice online is untested misinformation, this framework helps you validate tactics before scaling them.

The core problem: in SEO and AEO, most information gets repeated without analysis. Someone says something, it becomes "best practice," and no one ever validates it. The solution is structured experimentation with proper controls.

How It Works

Why Control Groups Matter for AEO:

LLM answers have natural variance—asking the same question produces different answers
LLM adoption is growing rapidly, so traffic increases even without intervention
External factors affect results (algorithm changes, competitor activity, seasonality)

Without a control group, you might falsely attribute natural variance to your intervention.

Experiment Structure:

Questions Pool: 200 questions

Control Group: 100 questions
- No intervention
- Track for baseline variance

Test Group: 100 questions
- Apply specific intervention (Reddit comments, YouTube videos, etc.)
- Track same metrics

How to Apply It

Step 1: Setup (Week 1-2)

Select 200+ target questions
Set up tracking across LLM surfaces
Establish baseline metrics for 2+ weeks
Randomly assign to test and control groups

Step 2: Intervention (Week 3-4)

Apply one specific tactic to test group only
Examples of testable interventions:
- Comment on Reddit threads for test questions
- Create YouTube/Vimeo videos
- Pay affiliate to include product in lists
- Optimize landing page content

Step 3: Measurement (Week 5-6)

Compare test vs control group changes
Key metrics:
- Share of voice (% of time appearing in answers)
- Average rank when appearing
- Citation frequency

Step 4: Validation

Did test group improve more than control?
Is the effect size meaningful?
Reproduce the experiment to confirm results

The Reproducibility Requirement

"For something to truly be accepted with an academian, it needs to be reproducible. Multiple people have done this study and reproduced that thing over and over again. Especially in SEO, it's common for something to change and you think that it was this thing that caused it and it's actually not."

Run the experiment multiple times
Try to reproduce results from other sources
If a tactic works 10 times, it probably works

What to Test

Tactic	Intervention	Measurement
Reddit	Comment on threads in test group only	Share of voice change
YouTube	Create videos for test questions only	Citation frequency
Affiliate	Pay for inclusion on test topics only	Rank improvement
Landing pages	Optimize pages for test questions only	Appearance rate
Help center	Add content for test use cases only	Long-tail coverage

When to Use It

Before scaling any AEO tactic
When evaluating advice from blogs or consultants
When trying new citation sources
After algorithm changes to validate tactics still work

Warning signs you need better experiments:

You're scaling tactics you've never validated
You're following "best practices" without evidence
Results seem to change randomly
You can't explain why something worked

Source

Guest: Ethan Smith
Episode: "The ultimate guide to AEO: How to get ChatGPT to recommend your product"
Key Discussion: (00:32:00 - 00:34:00) - Experiment design and reproducibility
YouTube: Watch on YouTube

Related Frameworks

Answer Engine Optimization (AEO) - The complete AEO framework
Pre vs Post Measurement - Alternative measurement when sample size is limited
Shortening Feedback Loops - Finding faster signals