Generative Engine Optimization at Scale: Measuring Brand Visibility Across AI Search Engines

Pratyush Kumar (Ranqo)

arXiv: 2606.20065 · v1 · pith:CN6XGJXKnew · submitted 2026-06-18 · cs.IR · cs.CL · cs.CY

Pith reviewed 2026-06-26 15:43 UTC · model grok-4.3

classification cs.IR · cs.CL · cs.CY

keywords generative engine optimization · AI search visibility · brand visibility · three-tier ladder · citation sources · listicle content · sentiment instability · AI answer engines

The pith

AI search engines display brands according to a three-tier visibility ladder based on stature.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper measures brand visibility across AI search engines using over 100,000 prompt responses from more than 100 brands. It finds that appearance rates form clear tiers: 73 percent for global household names, 44 percent for mid-market brands, and 11 percent for niche ones. Corporate websites dominate citations at 78 percent, with best-of listicles as the top content format. Sentiment around brands changes much more readily than whether they are mentioned at all. This baseline matters because it shows how visibility in AI answers varies strongly by brand maturity and provides protocols for testing improvements.

Core claim

First visibility runs form a clear three-tier brand-stature ladder: global household names appear in 73% of relevant AI answers, established mid-market brands in 44%, and niche brands in 11%. When engines cite sources, about 78% of citations go to corporate websites; YouTube leads the non-corporate sources, and ranked best-of listicles account for about 21% of all citations. Whether a brand is framed positively or negatively flips about 6.7 times more often than whether it is mentioned at all.

What carries the argument

The three-tier brand-stature ladder measured from first visibility runs on 100K+ prompt responses.
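
A minimal sketch of how a per-tier appearance rate could be computed from first-run records. The record layout, tier names, and sample rows are assumptions made for illustration; the paper does not publish its schema or code. Only the 73/44/11 percentages come from the paper.

```python
from collections import defaultdict

# Hypothetical records: (brand, tier, appeared_on_first_run).
# Field layout and rows are illustrative, not the paper's data.
records = [
    ("BrandA", "global", True),
    ("BrandA", "global", True),
    ("BrandA", "global", False),
    ("BrandB", "mid", True),
    ("BrandB", "mid", False),
    ("BrandC", "niche", False),
    ("BrandC", "niche", False),
    ("BrandC", "niche", True),
    ("BrandC", "niche", False),
]


def appearance_rate_by_tier(rows):
    """Return {tier: share of first-run responses that mention the brand}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for _brand, tier, appeared in rows:
        totals[tier] += 1
        hits[tier] += int(appeared)
    return {tier: hits[tier] / totals[tier] for tier in totals}


rates = appearance_rate_by_tier(records)

# Percentages quoted in the paper, and the gap between adjacent tiers.
reported = {"global": 73, "mid": 44, "niche": 11}
steps = [
    reported["global"] - reported["mid"],
    reported["mid"] - reported["niche"],
]  # [29, 33]
```

The two gaps, 29 and 33 points, match the abstract's "about 30 percentage points per step". Note that this sketch pools responses within a tier; averaging per-brand rates instead would weight brands equally and could give different figures.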

If this is right

AI brand visibility differs by platform and brand maturity.
The highest-leverage content format is the ranked best-of listicle.
Sentiment is an unstable signal compared to mere mention.
Seven v1.1 protocols can test whether specific changes improve AI visibility.
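
The summary does not say how a "flip" is counted, so the following is only one plausible reading: a flip is a change between consecutive runs of the same brand-prompt pair, and sentiment is compared only across runs where the brand is mentioned. The run history and field names are hypothetical.

```python
def flip_rate(values):
    """Share of consecutive pairs whose value changes; None if no pairs."""
    pairs = list(zip(values, values[1:]))
    if not pairs:
        return None
    return sum(a != b for a, b in pairs) / len(pairs)


# Hypothetical run history for one brand-prompt pair, in time order.
runs = [
    {"mentioned": True, "sentiment": "positive"},
    {"mentioned": True, "sentiment": "negative"},
    {"mentioned": True, "sentiment": "positive"},
    {"mentioned": False, "sentiment": None},
    {"mentioned": True, "sentiment": "positive"},
]

mention_rate = flip_rate([r["mentioned"] for r in runs])

# Sentiment is only defined when the brand is mentioned, so compare
# consecutive mentioned runs and skip the rest.
sentiments = [r["sentiment"] for r in runs if r["mentioned"]]
sentiment_rate = flip_rate(sentiments)

# Ratio of sentiment instability to mention instability.
# Undefined when the mention status never changes (mention_rate == 0).
ratio = sentiment_rate / mention_rate if mention_rate else None
```

Under a different definition (for example, counting flips per brand per week rather than per consecutive pair) the ratio would change, which is why the counting rule needs to be stated alongside any figure like 6.7.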

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Marketers for smaller brands could prioritize getting featured in listicles to boost visibility.
Different AI engines may require tailored strategies due to platform differences.
Tracking visibility separately from sentiment could give a more stable view of presence.

Load-bearing premise

The prompts used in the analysis are representative of typical user queries to AI search engines and the brands tracked are a fair sample across tiers without selection bias.

What would settle it

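Repeat the analysis with a fresh set of prompts or a broader, independently selected group of brands. If the tier percentages or citation patterns come out materially different, the baseline does not generalize beyond this sample; if they hold, the premise above is on firmer ground.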
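
Judging whether a repeat run produces "different" percentages requires some measure of sampling noise. One common choice, shown here as a sketch and not as the paper's method, is the Wilson score interval for a proportion. The counts below are hypothetical; the paper reports no per-tier sample sizes.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Hypothetical counts: 440 appearances in 1,000 responses (44%).
low, high = wilson_interval(440, 1000)

# A replication estimate falling outside (low, high) would suggest a
# difference larger than binomial sampling noise alone.
```

This treats every response as independent. Responses for the same brand or prompt are likely correlated, so the true uncertainty is probably wider; a bootstrap that resamples whole brands would be a more conservative alternative.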

Original abstract

People increasingly get answers straight from AI search engines like ChatGPT, Claude, Perplexity, and Gemini rather than scrolling search results. Brands that once focused on search engine optimization (SEO) must now optimize for how these engines represent, cite, and recommend them -- a shift variously called Generative Engine Optimization (GEO), Answer Engine Optimization (AEO), and AI Search Visibility. We treat AEO and AI Visibility as part of GEO, and study how to measure brand visibility across AI engines: what they value when they cite a brand, which sources they rely on, and what content large language models surface. The hard case is everyone outside the already-authoritative top brands -- SMEs, D2C brands, creators, and early-stage startups. We analyze 100K+ prompt responses across 100+ brands tracked on Ranqo between March and May 2026. First visibility runs form a clear three-tier brand-stature ladder: global household names (e.g., Stripe, Nike) appear in 73% of relevant AI answers on their first run; established mid-market and regional brands (e.g., Olipop, Klaviyo) in 44%; niche and small brands in just 11% -- about 30 percentage points per step. When engines cite sources, about 78% go to corporate websites; among non-corporate sources YouTube leads, ahead of Reddit, editorial media, and Wikipedia. The highest-leverage page is the ranked "best-of" listicle, the most-cited content format at about 21% of all citations. Sentiment is the unstable signal: whether a brand is framed positively or negatively flips about 6.7 times more often than whether it is mentioned at all. These findings provide a first large-scale baseline for measuring GEO: AI brand visibility can be measured, differs by platform, and varies strongly by brand maturity. We close by proposing seven v1.1 protocols to test whether specific recommendations can causally improve AI visibility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

First large-scale counts on AI brand visibility, but the Ranqo sample likely confounds the 73/44/11 tier gaps.

The paper's core contribution is a set of descriptive baselines from 100k+ AI engine responses across 100+ brands: global names show up in 73% of relevant answers, mid-tier in 44%, and small ones in 11%, with corporate sites dominating citations and listicles as the top format. That scale is new and useful as a starting point for anyone tracking how AI surfaces brands.

It does a clean job on the observational side—direct counts on source types, sentiment flip rates, and platform differences—without overclaiming causality. The seven proposed protocols at the end are a reasonable next step for testing interventions.

The soft spot is the sampling. All brands are "tracked on Ranqo," with no visible account of how they were recruited, how tiers were assigned independently of visibility, or whether the prompts match ordinary user distributions. If participation on the platform correlates with brands already pushing for visibility, the 30-point steps become hard to interpret as pure stature effects. The abstract gives no error bars or robustness checks on prompt selection either.

This is for marketing researchers and platform teams who need empirical anchors on AI search behavior. It is not yet tight enough for strong causal claims, but the raw measurement effort is worth referee time to see if the methods can be tightened. I would send it to review rather than desk reject.

Referee Report

2 major / 0 minor

Summary. The paper presents an observational analysis of brand visibility across AI search engines (ChatGPT, Claude, Perplexity, Gemini) based on 100K+ prompt responses from 100+ brands tracked on the Ranqo platform between March and May 2026. It reports a three-tier visibility ladder on first runs (global household names at 73%, mid-market/regional at 44%, niche/small at 11%), with 78% of citations to corporate websites, listicles as the top-cited format (21%), and sentiment as an unstable signal (flipping 6.7 times more often than mention). The work positions these as a baseline for Generative Engine Optimization (GEO) and proposes seven v1.1 protocols for causal testing.

Significance. If the sampling and tiering are representative, the study supplies a valuable first large-scale empirical baseline for measuring AI brand visibility, quantifying stature-based gaps and highlighting citation patterns that could guide both research and SME strategies. The scale (100K+ responses) and forward-looking protocols add utility beyond pure description.

major comments (2)

[Abstract / Methods] Abstract and (presumed) Methods: The reported 73/44/11 visibility ladder is computed from brands 'tracked on Ranqo' with no disclosed recruitment process, independent tier-assignment criteria, or verification that tier labels are exogenous to visibility outcomes. This selection mechanism is load-bearing for the central claim of a stature-driven gap.
[Abstract / Methods] Abstract and (presumed) Methods: The 100K+ prompts lack any description of sampling frame, stratification, or validation against real user query distributions; if prompts are disproportionately brand-specific or visibility-seeking, the tier differences are confounded by query construction rather than engine behavior.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and for identifying key areas where methodological transparency can be strengthened. We address each major comment below and will revise the manuscript accordingly.

Referee: [Abstract / Methods] Abstract and (presumed) Methods: The reported 73/44/11 visibility ladder is computed from brands 'tracked on Ranqo' with no disclosed recruitment process, independent tier-assignment criteria, or verification that tier labels are exogenous to visibility outcomes. This selection mechanism is load-bearing for the central claim of a stature-driven gap.

Authors: We agree that the manuscript should have provided explicit details on these points. The tier labels were assigned using observable, pre-existing brand characteristics (global recognition, market presence, and revenue scale) drawn from public sources and intended to be independent of the AI visibility measurements. However, the current text does not document the exact assignment rules or recruitment process for the Ranqo-tracked brands. In revision we will add a Methods subsection that (a) states the tier criteria with examples of the public metrics used, (b) describes the platform recruitment process to the extent it is known, and (c) discusses the assumption of exogeneity together with any limitations. We view this as a necessary clarification rather than a change to the underlying data.

revision: yes

Referee: [Abstract / Methods] Abstract and (presumed) Methods: The 100K+ prompts lack any description of sampling frame, stratification, or validation against real user query distributions; if prompts are disproportionately brand-specific or visibility-seeking, the tier differences are confounded by query construction rather than engine behavior.

Authors: This concern is valid. The manuscript does not currently describe the prompt-generation process, sampling frame, or any validation against external query distributions. The prompts were constructed to be brand-relevant and representative of typical user questions, but without documented stratification or external benchmarking, confounding from query design cannot be ruled out. In the revision we will expand the Methods section to detail how prompts were generated, any steps taken to diversify them, and a limitations paragraph addressing potential selection effects. We will also note that the observed tier gaps are conditional on the prompt set used.

revision: yes

Circularity Check

0 steps flagged

No circularity: purely observational measurement of visibility counts

The paper performs direct empirical counting of brand mentions across AI responses to 100K+ prompts. No equations, fitted parameters, predictions, or derivations are present that could reduce to self-defined quantities or self-citations. The reported 73/44/11 tier ladder is computed from observed frequencies on the tracked brands; tier labels and visibility rates are independent of any internal model or ansatz. Selection-bias concerns (Ranqo sample) affect external validity but do not create circularity in the reported measurements themselves.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the representativeness of the sampled prompts and brands as a domain assumption. No free parameters or invented entities.

axioms (1)

Domain assumption: The selected prompts and brands represent real-world AI search behavior. The analysis relies on this to generalize the three-tier ladder.

pith-pipeline@v0.9.1-grok · 5904 in / 1310 out tokens · 31707 ms · 2026-06-26T15:43:04.124758+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

How Large Language Models Source Brand Reputation Across Languages and Markets
cs.IR · 2026-06 · unverdicted · novelty 5.0

LLMs cite third-party domains for 85.7% of brand attributions, with Wikipedia dominant in most languages, a long-tailed domain distribution, and market-specific shifts such as YouTube and HR sites in Poland.

Reference graph

Works this paper leans on

15 extracted references · 1 linked inside Pith · cited by 1 Pith paper

[1] P. Aggarwal, V. Murahari, T. Rajpurohit, A. Kalyan, K. Narasimhan, and A. Deshpande. GEO: Generative Engine Optimization. KDD 2024. arXiv:2311.09735.

[2] H. Puerto, M. Gubri, T. Green, S. J. Oh, and S. Yun. C-SEO Bench: Does Conversational SEO Work? NeurIPS Datasets and Benchmarks 2025. arXiv:2506.11097.

[3] A. Algaba, V. Holst, F. Tori, M. Mobini, B. Verbeken, S. Wenmackers, and V. Ginis. How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices? arXiv:2504.02767, April 2025.

[4] E. Kirsten, J. Grosse Perdekamp, M. Upadhyay, K. P. Gummadi, and M. B. Zafar. Characterizing Web Search in the Age of Generative AI. arXiv:2510.11560, October 2025. [Pith/arXiv]

[5] K.-C. Yang. News Source Citing Patterns in AI Search Systems. arXiv:2507.05301, July 2025.

[6] Ranqo. GEO vs AEO vs SEO: Three Measurement Views of the Same Work. April 2026. https://ranqo.ai/blog/geo-vs-aeo-vs-seo

[7] Ranqo. What AI Platforms Really Recommend When You Ask About CRM Software. February 2026. https://ranqo.ai/blog/ai-platforms-crm-recommendations-study

[8] Ranqo. What is Generative Engine Optimization (GEO)? The Complete 2026 Guide. April 2026. https://ranqo.ai/blog/what-is-generative-engine-optimization-geo-guide

[9] Ranqo. The 5 Factors That Determine Whether AI Cites Your Brand. April 2026. https://ranqo.ai/blog/5-factors-ai-cites-your-brand

[10] Ranqo. How to Get Cited by Perplexity: The Citation-Engine Playbook. April 2026. https://ranqo.ai/blog/how-to-get-cited-by-perplexity

[11] Ranqo. AI Visibility for SaaS: The Complete B2B Playbook. April 2026. https://ranqo.ai/blog/ai-visibility-for-saas-b2b-playbook

[12] Ranqo. AI Visibility for E-commerce & DTC Brands: It's Research-and-Handoff, Not Search. April 2026. https://ranqo.ai/blog/ai-visibility-for-ecommerce-dtc

[13] Ranqo. The E-E-A-T Playbook for AI Citations: Visible Authority Beats Markup Theatre. May 2026. https://ranqo.ai/blog/eeat-playbook-ai-citations

[14] Ranqo. Schema Markup for AI Citations: A Complete Guide. April 2026. https://ranqo.ai/blog/schema-markup-for-ai-citations

[15] Ranqo. How to Measure AI Share of Voice: The Three Decisions That Change the Number. June 2026. https://ranqo.ai/blog/how-to-measure-ai-share-of-voice
