The Taxonomy

93 measurable factors. Nine categories.

The 93-factor AEO taxonomy is AIVZ's comprehensive measurement framework for AI visibility. No competitor has published or operationalized anything comparable. Every factor carries a confidence label. Every score is reproducible from the same factor evidence.

9 categories · 93 measurable factors
Each factor: deterministic where possible, calibrated by confidence label, validated against real citation outcomes before shipping.
The Count Is Not Arbitrary

Why 93?

The 93-factor taxonomy didn't start at 93. It started at 27 — the original "Core" factor set ported from AIVZ's WordPress plugin foundation in early 2025. That Core set covered the highest-impact AEO signals available at the time, and it was sufficient to produce a meaningful initial score.

The taxonomy expanded through five implementation phases as the AEO discipline matured. Each phase added factors only after they cleared three tests.

The current count is 93. It will be more next year and the year after — AEO is a young discipline; the measurable surface keeps expanding. The number is not the point. The point is that every factor we measure is calibrated, documented, validated, and inspectable.

Three tests for every factor
01 · Measurable

Detectable through deterministic rules, validated LLM judgments, or platform APIs we have stable access to.

02 · Calibrated

Carries a confidence label that honestly reflects how proven its citation impact is.

03 · Validated

Correlates with observed AI citation outcomes against real platforms before shipping.

Organized By Dimension

What the categories cover.

Categories map onto the three-layer AI Visibility Stack but are not isomorphic with it — some categories span multiple layers.

CATEGORY 01

Crawlability & Access

Layer 1 — Access

What it measures: Whether AI bots can physically reach and render your content. The foundational category — every other category presupposes this one works.

Example signals
  • robots.txt AI bot permissions (GPTBot, ClaudeBot, PerplexityBot, GoogleOther, Bingbot)
  • WAF/CDN bot blocking configuration
  • Server-side rendering vs JavaScript-only rendering
  • Time-to-first-byte (TTFB) and full-page-load timing
  • XML sitemap accuracy and freshness
  • llms.txt declaration
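Most factors in this category are deterministic. As an illustration of the robots.txt check (the domain and bot list below are placeholders, and a full audit would also test WAF behavior and rendering), Python's standard library can verify AI crawler permissions directly:

```python
from urllib import robotparser

SITE = "https://example.com"  # placeholder domain
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "GoogleOther", "Bingbot"]

rp = robotparser.RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()  # fetch and parse the live robots.txt

for bot in AI_BOTS:
    verdict = "allowed" if rp.can_fetch(bot, f"{SITE}/") else "BLOCKED"
    print(f"{bot:<16} {verdict} at site root")
```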
Common failures

Unintentional robots.txt blocks that pre-date AI crawler awareness; aggressive WAF rules that challenge AI user agents; JS-rendered content with no SSR fallback.

Confidence label spread: Mostly Established (web-standard signals); some Emerging (specifically llms.txt, AI-bot-specific user agent identification).
CATEGORY 02

Structured Data & Machine Readability

Layer 2 — Understanding

What it measures: Whether AI systems can parse the structured metadata you publish — JSON-LD, Schema.org types, machine-readable feeds, semantic HTML.

Example signals
  • Presence of JSON-LD structured data
  • Organization schema and sameAs to authoritative profiles
  • Person schema for authors
  • Article, FAQPage, HowTo, Product, and other Schema.org type usage
  • Schema validation (no errors, no warnings)
  • Schema graph completeness — mainEntity, nested entity references, cross-page consistency
  • RSS/Atom feed availability
  • Speakable schema for voice extraction
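Many of these checks are deterministic as well. A minimal sketch of the Organization/sameAs check (the function name is ours, and a production scanner would use a real HTML parser rather than a regex):

```python
import json
import re

JSONLD_RE = re.compile(
    r'<script[^>]*application/ld\+json[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

def organization_schema_issues(html: str) -> list[str]:
    """Flag common Organization-schema gaps in a page's JSON-LD blocks."""
    blocks = JSONLD_RE.findall(html)
    if not blocks:
        return ["no JSON-LD blocks found"]
    issues, orgs = [], []
    for raw in blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            issues.append("JSON-LD present but fails to parse")
            continue
        nodes = data if isinstance(data, list) else [data]
        orgs += [n for n in nodes if isinstance(n, dict) and n.get("@type") == "Organization"]
    if not orgs:
        issues.append("no Organization node")
    elif not any(org.get("sameAs") for org in orgs):
        issues.append("Organization present but no sameAs links to authoritative profiles")
    return issues
```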
Common failures

No JSON-LD at all; JSON-LD with errors that pass display tests but fail semantic parsing; Organization schema present but no sameAs linking; schema-content drift where the structured data describes content that isn't actually on the page.

Confidence label spread: Mostly Established (Schema.org and JSON-LD are formally documented); Speakable schema is Emerging.
CATEGORY 03

Content Structure & Extractability

Layer 3 — Extractability

What it measures: Whether AI systems can extract clean, citable answer blocks from your prose. The "structural quality of writing" category — most factors here are about how content is organized, not what it says.

Example signals
  • Front-loaded direct answers
  • Concise answer blocks (40–60 word target)
  • Question-based headings
  • Heading hierarchy correctness
  • FAQ structure (HTML + schema)
  • Definition and summary density
  • Bullet and numbered lists
  • HTML tables for comparison content
  • Statistics with sources
  • Citation formatting quality
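Two of the highest-leverage factors here, front-loaded answers and concise answer blocks, reduce to simple measurements. A rough sketch (function names are ours; real scoring also weighs heading context and sentence structure):

```python
import re

def first_block_word_count(text: str) -> int:
    """Word count of the first paragraph: a proxy for the lead answer block."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p]
    return len(paragraphs[0].split()) if paragraphs else 0

def is_concise_answer_block(text: str, lo: int = 40, hi: int = 60) -> bool:
    # 40-60 words is the target the taxonomy describes for answer blocks.
    return lo <= first_block_word_count(text) <= hi
```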
Common failures

Answers buried in paragraph 4 or later; headings that are decorative rather than question-aligned; comparison content in prose rather than tables; statistics without source attribution; no FAQ structure on pages where Q&A format would extract better.

Confidence label spread: Strongly Inferred for most factors (extensive industry observation; not always formally documented). Some are Established via Google's structured-data documentation.
CATEGORY 04

Entity & Knowledge Graph Signals

Layer 2 — Understanding

What it measures: Whether AI systems recognize the entities (people, organizations, places, products, concepts) on your pages — and whether those entities are grounded in authoritative knowledge graphs.

Example signals
  • Entity density (named entities per page)
  • Wikidata entity presence for organizations and key people
  • Knowledge graph alignment
  • Disambiguation signals (clear identification of which "John Smith" the page references)
  • Cross-page entity consistency (same person, same name spelling, same identity grounding)
  • Entity relationship mapping (this person is CEO of this company, headquartered in this city)
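Entity grounding is checkable against public knowledge graphs. A sketch using Wikidata's public search API (the entity name is a placeholder, and real grounding work must also confirm the match is the right entity, not just any entity with that name):

```python
import json
import urllib.parse
import urllib.request

def wikidata_candidates(name: str) -> list[dict]:
    """Search Wikidata for entities matching a name."""
    params = urllib.parse.urlencode({
        "action": "wbsearchentities",
        "search": name,
        "language": "en",
        "format": "json",
    })
    with urllib.request.urlopen(f"https://www.wikidata.org/w/api.php?{params}") as resp:
        return json.load(resp).get("search", [])

# An empty result suggests the organization has no knowledge-graph grounding.
if not wikidata_candidates("Acme Example Corp"):
    print("no Wikidata entity found: grounding signal fails")
```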
Common failures

No Wikidata entity for the company; inconsistent entity naming across the site; ambiguous entity references that AI can't disambiguate confidently; missing cross-page entity links.

Confidence label spread: Established for major-platform entity recognition; Strongly Inferred for cross-platform consistency benefits; some factors are still Emerging as knowledge-graph practices evolve.
CATEGORY 05

E-E-A-T & Trust Signals

Layers 2 & 3 — Understanding & Extractability

What it measures: Whether AI systems trust the source — author credentials, publication history, freshness, original research, factual accuracy, YMYL handling.

Example signals
  • Named author presence
  • Author bio and credentials
  • Author publication history
  • Editor and reviewer attribution
  • Content freshness (last-updated dates, fresh data)
  • Original research presence
  • YMYL category handling (medical, legal, financial — elevated trust thresholds apply)
  • Factual accuracy signals
  • Editorial and corrections policy presence
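Freshness is one of the directly computable factors in this category. A sketch that measures staleness from a page's declared dateModified (the 365-day cutoff is illustrative, not a documented threshold):

```python
from datetime import datetime, timezone

def staleness_days(date_modified_iso: str) -> int:
    """Days since the dateModified declared in a page's Article JSON-LD."""
    modified = datetime.fromisoformat(date_modified_iso)
    if modified.tzinfo is None:
        modified = modified.replace(tzinfo=timezone.utc)
    return (datetime.now(timezone.utc) - modified).days

print("stale" if staleness_days("2024-03-01T09:00:00+00:00") > 365 else "fresh")
```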
Common failures

Anonymous content (no named author); author bios that don't establish credentials; stale content with last-updated dates from years ago; YMYL content without reviewer credentials; no editorial policy or corrections policy.

Confidence label spread: Established for Google-derived AI surfaces (Google AI Overviews, Gemini); Strongly Inferred for cross-platform impact; YMYL elevation is Established.
CATEGORY 06

Off-Site Authority

Cross-layer

What it measures: Whether external sources validate your authority — citations, references, mentions, links from authoritative domains. The AuthorityGraph engine surface.

Example signals
  • Citation count from authoritative sources
  • Mention frequency in domain-relevant publications
  • Backlink quality from high-authority domains
  • Author personal-brand authority (separate from organizational authority)
  • Topical authority concentration
  • Cross-mention patterns across credible sources
  • Reputation signal aggregation
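To make the aggregation idea concrete, here is one plausible shape for a blended off-site authority sub-score. The weights are invented for illustration; they are not AIVZ's actual model:

```python
# Hypothetical weights for illustration only; not the production model.
AUTHORITY_WEIGHTS = {
    "citation_count": 0.40,
    "mention_frequency": 0.25,
    "backlink_quality": 0.25,
    "author_brand": 0.10,
}

def authority_subscore(signals: dict[str, float]) -> float:
    """Weighted blend of off-site authority signals, each normalized to 0-1."""
    return sum(w * signals.get(name, 0.0) for name, w in AUTHORITY_WEIGHTS.items())
```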
Common failures

Strong site, weak external authority — domain hasn't been cited or referenced enough by authoritative sources; author authority disconnected from organizational authority; over-concentration in low-authority sources.

Confidence label spread: Mix of Established (link-based signals are well-studied) and Strongly Inferred (newer authority-aggregation signals not yet formally documented by AI providers).
CATEGORY 07

Semantic Matching

Layer 3 — Extractability

What it measures: Whether your content matches the way users actually ask questions — conversational alignment, topical depth, intent matching, query-to-answer correspondence.

Example signals
  • Conversational phrasing alignment with natural-language query patterns
  • Topical depth on pages claiming topical authority
  • Intent matching (informational, transactional, navigational, investigational)
  • Long-tail query coverage
  • Answer-format correspondence (a "how" question gets a procedural answer, not a definitional answer)
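Production systems evaluate semantic matching with embedding models; as a dependency-light stand-in, lexical cosine similarity shows the shape of the query-to-answer check (assumes scikit-learn is installed):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def query_answer_match(query: str, answer: str) -> float:
    """TF-IDF cosine similarity as a rough proxy for semantic correspondence."""
    tfidf = TfidfVectorizer().fit_transform([query, answer])
    return float(cosine_similarity(tfidf[0:1], tfidf[1:2])[0][0])

print(query_answer_match(
    "how do I add structured data to a page",
    "To add structured data, place a JSON-LD script tag in the page head.",
))
```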
Common failures

Content written in marketing language that doesn't match how users phrase their questions; thin coverage on pages claiming topical depth; answer format mismatched to the question.

Confidence label spread: Mostly Strongly Inferred. Semantic matching factors are observable but rarely formally documented by AI providers.
CATEGORY 08

Platform-Specific Signals

Cross-layer

What it measures: Per-platform readiness for the major AI answer surfaces — ChatGPT, Google AI Overviews, Perplexity, Gemini, Microsoft Copilot, voice assistants. Captures readiness that doesn't generalize across platforms.

Example signals
  • Per-platform readiness scores (one per major platform)
  • Platform-specific freshness weighting
  • Platform-specific citation style preferences
  • Voice-readiness (speakable schema, conversational answer length, audio-extractable content)
  • IndexNow support (Microsoft / Bing surfaces)
  • Platform-specific structured-data preferences
  • NavBoost / user-satisfaction signal correlations
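IndexNow is one of the few signals in this category with a published protocol. A minimal sketch of a ping (assumes the key file is already hosted at your site root, as the protocol requires):

```python
import urllib.parse
import urllib.request

def submit_indexnow(page_url: str, key: str) -> int:
    """Notify IndexNow-participating engines (Bing/Copilot) of a changed URL."""
    query = urllib.parse.urlencode({"url": page_url, "key": key})
    with urllib.request.urlopen(f"https://api.indexnow.org/indexnow?{query}") as resp:
        return resp.status  # 200 or 202 means the submission was accepted
```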
Common failures

Composite score in the AI Extractable tier but Invisible to AI on Voice (no speakable schema); strong on Google-derived surfaces but weak on Perplexity (live-crawl issues); IndexNow not adopted, slowing Bing/Copilot indexing.

Confidence label spread: Mix of Strongly Inferred (most platform-specific behaviors observable but not documented) and Emerging / Experimental (newer signals like NavBoost-class correlations and platform-specific freshness models).
CATEGORY 09

Observability & Diagnostics

Cross-layer (operationalizes the framework)

What it measures: Whether you can track AEO outcomes over time — AI crawler analytics, citation simulation, score history, alert systems, change detection. This category isn't about being visible; it's about measuring visibility over time.

Example signals
  • AI crawler log analytics (which AI bots visit, which pages they fetch, how often)
  • Citation simulation (synthetic queries against AI platforms, attribution tracking)
  • Score history and trend analysis
  • Change detection (factors regressing, factors improving, schema drift over time)
  • Alert systems (drops in critical factors, new bot blocking events)
  • Dashboard reporting infrastructure
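The crawler-analytics factor starts with something as simple as counting AI user agents in your access logs. A sketch (the log path is a placeholder, and naive user-agent matching does not verify the bot's published IP ranges):

```python
import re
from collections import Counter

AI_BOT_RE = re.compile(r"(GPTBot|ClaudeBot|PerplexityBot|GoogleOther|Bingbot)")

def ai_bot_hits(log_path: str) -> Counter:
    """Count requests per AI crawler in a standard web access log."""
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = AI_BOT_RE.search(line)
            if match:
                hits[match.group(1)] += 1
    return hits

print(ai_bot_hits("/var/log/nginx/access.log"))  # all zeros = bots never arrive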
Common failures

No crawler analytics — you can't tell whether AI bots are even reaching your content; no citation tracking — you can't tell whether score improvements correlate with citation improvements; no historical record.

Confidence label spread: Established for the measurement infrastructure itself; Strongly Inferred for the operational impact of having (or lacking) this instrumentation.
Honesty About What We Know

Every factor carries a confidence label.

AEO is a young discipline. We don't pretend everything is equally proven, and we don't bury uncertainty in marketing copy.

Established · Well-supported by web standards, platform documentation, or broadly accepted technical practice. Example factors: JSON-LD presence, robots.txt configuration, HTTPS, mobile usability, Schema.org core types.

Strongly Inferred · Not always formally documented, but strongly supported by research or repeated industry observation. Example factors: front-loaded answers, concise answer blocks, citation-formatting quality, entity density.

Indirect / Correlated · Likely influences AI visibility indirectly through search prominence, authority, or trust. Example factors: off-site authority signals, social presence, brand mention frequency, backlink profile.

Emerging / Experimental · New or evolving factors not yet stable or universally adopted. Example factors: Speakable schema, IndexNow support, platform-specific freshness weighting, NavBoost-class signals.

Confidence labels move both ways. A factor classified as Emerging can be promoted to Strongly Inferred as evidence accumulates. A factor classified as Strongly Inferred can be demoted to Indirect / Correlated if the evidence base weakens. The label is a current assessment, not a permanent assignment.

Public changelog
How We Built This

Five phases of implementation.

The 93-factor taxonomy didn't ship at 93. It expanded through five phases as we validated factors against real citation outcomes.

Phase 1 · Foundation · +27 factors (27 total). Highest-impact factors ported from the AIVZ WordPress plugin — the first measurable AEO signal set with validated citation correlation.
Phase 2 · On-Page Expansion · +28 factors (55 total). Comprehensive on-page coverage: schema completeness, content structure, entity grounding, internal authority signals.
Phase 3 · Authority + Platform · +15 factors (70 total). Off-site authority integration; per-platform readiness scoring for the six major AI surfaces.
Phase 4 · LLM + Advanced NLP · +15 factors (85 total). LLM-judged factors — semantic matching, topical depth, conversational alignment — with the 0.18 weight cap on LLM-derived sub-scores (see the sketch after this table).
Phase 5 · Remaining Experimental · +8 factors (93 total). Emerging factors: IndexNow, machine-readable feed architecture, cross-page entity consistency, YMYL sensitivity, platform-specific freshness, NavBoost-class signals.
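The Phase 4 weight cap is simple to state in code. A sketch of the capping idea only (the blend function is ours; the actual aggregation is documented in the AI Visibility Stack methodology):

```python
LLM_WEIGHT_CAP = 0.18  # ceiling on LLM-derived sub-score weight (Phase 4)

def blend_subscores(deterministic: float, llm_judged: float, llm_weight: float) -> float:
    """Blend deterministic and LLM-judged sub-scores with the LLM weight capped."""
    w = min(llm_weight, LLM_WEIGHT_CAP)
    return (1 - w) * deterministic + w * llm_judged
```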

Phase 5 includes factors classified as Emerging / Experimental. We measure them and surface them with the appropriate confidence label rather than excluding them — the alternative (waiting until they're "proven") means missing the leading edge of where AEO is heading.

Future phases will likely focus on multimodal content readiness as AI answer surfaces increasingly cite images, video, and audio alongside text.

What Each Tier Sees

Factor coverage scales with subscription tier.

The taxonomy is operationalized across the AIVZ subscription tiers. Free-tier scans cover the most-impactful subset. Paid tiers extend coverage in proportion to subscription scope.

Free · Highest-impact subset — enough to produce a meaningful score and the top three fix recommendations. Sufficient for ad-hoc scans.
Pro · Comprehensive on-page coverage — all factors that can be evaluated from a single-page or single-domain crawl without external authority data.
Agency · On-page coverage plus off-site authority signals plus per-platform readiness. The full set needed for client-facing portfolio work.
Enterprise · Complete 93-factor coverage, including emerging factors and custom-extension factors negotiated per engagement.

Coverage at every tier is fully documented in the product — every score is paired with the factors that produced it. There's no "premium-tier-only" hidden methodology.

Tier scope, capabilities, and pricing
The Disproportionate-Impact Subset

Eleven factors do most of the work.

Within the 93-factor taxonomy, eleven specific factors carry disproportionate impact on whether AI systems select a page for citation. The Core 11 spans Categories 2, 3, 4, and 5.

01 · JSON-LD Structured Data
02 · Front-Loaded Answers
03 · Concise Answer Blocks
04 · Heading Hierarchy
05 · Definition & Summary Density
06 · Statistics with Sources
07 · Bullet & Numbered Lists
08 · HTML Tables for Comparisons
09 · Citation Formatting Quality
10 · Entity Density
11 · Named Author Presence

The Core 11 is the highest-leverage starting point. The other 82 factors matter — that's why they're in the taxonomy — but the Core 11 is where the leverage concentrates.

Per-factor depth on the Core 11
Your Next Step

Three paths from here.

See your factor breakdown

Get the composite score, layer breakdown, and category-by-category detail of which factors are passing and failing on your real content.

Run a free scan

The Citation Core 11

Per-factor depth on the eleven highest-leverage factors. The starting point if you're prioritizing AEO work.

Read the Citation Core 11

The dependency model

The three-layer Stack — Access, Understanding, Extractability — and how factors aggregate into the composite score.

Read the AI Visibility Stack

Or — return to the methodology hub: Read the AEO methodology →

Ready When You Are

See which factors you're passing.

Run a free scan. Get your AI Visibility Score and per-category factor breakdown across the highest-impact subset — in under 60 seconds.

Enter your domain
Free · No signup · Results in 60 seconds