Entry #10 · Apr 8, 2026

Three citation bots. Ten articles. Here’s what we learned about an AI Site.

We now have all three major citation bots: Perplexity-User appeared on April 5. It’s a single data point, but it completes the set. ChatGPT-User, Claude-User, Perplexity-User: three AI platforms are now retrieving content from the same AI site during live user sessions. 9,250 citation requests across 90 days. Three months ago there were zero.

This is the tenth article in a weekly series documenting what happens when you build an AI site for a real product. The product is Genymotion, an Android emulator used by developers worldwide. The AI site is rozz.genymotion.com, built by Rozz. Every data point in this series comes from CloudFront logs on that site.

Ten articles in is a good moment to step back and lay out what we actually learned. Every week we measured traffic, made changes to address issues, and learned from the results. This article summarizes those learnings. For those of you working on GEO, here are actual data points and observations on what works.

The numbers

90-day totals (Jan 8 – Apr 8, 2026)

Category | Requests | What it means
Citation bots | 9,250 | Real users getting answers from the AI site
Training bots | 11,590 | Content collected for model training
Search index bots | 1,839 | Building retrieval indexes
Total LLM bot requests | 22,679 |
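The data-source note at the end of this series says bot classification is based on User-Agent strings. A minimal sketch of that classification, assuming simple substring matching against the bot names used throughout this series (real log lines carry version suffixes and extra tokens, and the actual pipeline is not published):

```python
# Map User-Agent substrings to the three bot categories used in the
# table above. Substring matching, because real UA strings include
# version numbers and browser-compatibility tokens.
CATEGORIES = {
    # Citation bots: fetch pages during live user sessions
    "ChatGPT-User": "citation",
    "Claude-User": "citation",
    "Perplexity-User": "citation",
    # Training bots: collect content for model training
    "GPTBot": "training",
    "ClaudeBot": "training",
    # Search index bots: build retrieval indexes
    "PerplexityBot": "search-index",
}

def classify_bot(user_agent: str) -> str:
    """Return the bot category for a raw User-Agent string, or 'other'."""
    for needle, category in CATEGORIES.items():
        if needle in user_agent:
            return category
    return "other"
```

Note the ordering: "Perplexity-User" is checked before "PerplexityBot" so the citation bot is never misfiled as a search-index bot.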

Three citation pipelines, all active

Platform | Crawl bot | Citation bot | Total citations | First citation | Current weekly
OpenAI | GPTBot (2,324) | ChatGPT-User (9,225) | 9,225 | Late January | ~1,100
Anthropic | ClaudeBot (3,320) | Claude-User (24) | 24 | March 25 | ~10
Perplexity | PerplexityBot (721) | Perplexity-User (1) | 1 | April 5 | Just started

ChatGPT citation rate

Using the same measurement tool, the citation rate went from 14% before the AI site to 83% after. We don’t fully trust these numbers, though: chat history, location, and multiple other factors all influence citation rates. We prefer to focus on the actual bot visits to the AI site, which we can measure precisely.

1. Our GEO was a feedback loop, not a launch.

As is often the case in tech, building an AI site was an iterative cycle: observe crawler behavior in the logs, diagnose what’s blocking deeper engagement, deploy a structural fix, measure the crawl response, repeat.

Every major breakthrough in this series came from that loop:

What the logs showed | What we diagnosed | What we fixed | What happened
28% of ChatGPT sessions dead-ended on the index page (article 6) | Index was an infrastructure page with no product context | Redesigned index with product description and topic directory | PerplexityBot activated within 24 hours (article 7)
ClaudeBot checked the sitemap 15x/week without crawling (articles 7–8) | Monolithic sitemap gave no topic structure or freshness signal | Deployed per-topic sitemapindex with per-cluster lastmod | ClaudeBot mass crawl within 6 hours (article 8)
93% of Q&As in one mega-topic; ChatGPT-User fetching 4 near-identical pricing pages per session; PerplexityBot stuck on the same 4 pages for 6 weeks (articles 6–7) | Brand keyword “Genymotion” on 56% of pages collapsed clustering | Filtered high-prevalence brand keywords, dynamic topic count | PerplexityBot went from 42 to 511 requests; GPTBot re-indexed 148 pages after topic renaming
Claude Code fetched pricing but had no implementation path (article 9) | No CLI content linked from Q&A pages | Built runbooks, linked them from Q&A pages | Claude-User: pricing → CLI runbook in 10 seconds

None of these fixes were planned on day one. Each one came from reading the logs, seeing a problem, and fixing it. The AI site we have today is the result of dozens of iterations (376 commits, to be precise), not the original architecture.

Those learnings are now part of the product so you don’t have to go through them again.

2. Structural signals trigger crawl behavior.

We consistently observed that each major crawl event was preceded by a structural change to the AI site.

Date | What we changed | What happened
Mar 9 | Index page redesigned with product description and topic directory | PerplexityBot activated within 24 hours (42 → 511 requests)
Mar 20 (15:57 UTC) | Monolithic sitemap replaced with per-topic sitemapindex | ClaudeBot mass crawl within 6 hours (123 → 577 requests)
Apr 3 | Topic names changed from feature labels to user-intent phrases | GPTBot re-indexed 148 pages within 2 days

Different bots respond to different structural cues. PerplexityBot responded to a richer index page. ClaudeBot responded to a sitemapindex organized by topic. GPTBot responded to updated sitemap content after topic renaming.

The structural layer of an AI site determines whether crawlers commit to indexing your content or keep monitoring without acting.

3. Q&A pages drive the majority of citations.

66–75% of ChatGPT-User retrievals hit Q&A pages rather than the GEO pages adapted from existing website content. The Q&A pages, generated from real chatbot conversations, are what AI platforms fetch most often during live user sessions.

This matches common industry knowledge. Users ask AI platforms questions; Q&A pages are structured as questions with answers, so the format matches the query pattern. Schema.org QAPage markup makes extraction clean: one question, one answer, extractable in a single fetch.
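As a rough illustration, QAPage markup for one of these pages might look like the following JSON-LD snippet. The question and answer text are placeholders, not content from the actual site:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "QAPage",
  "mainEntity": {
    "@type": "Question",
    "name": "How much does Genymotion cost?",
    "answerCount": 1,
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Placeholder: one self-contained answer, complete in itself."
    }
  }
}
</script>
```

One question, one accepted answer, everything an agent needs in a single fetch.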

The Genymotion AI site is automatically fed and updated with actual questions from users interacting with the chatbot on Genymotion’s website: literally thousands of questions every month, filtered, deduplicated, clustered, and augmented for the AI site.

4. Content designed for AI coding tools creates a new sales channel.

Article 9 documented a session where a developer asked Claude Code about Genymotion’s pricing and 10 seconds later was reading the CLI implementation runbook—inside their terminal, without opening a browser.

This happened because the AI site includes CLI runbooks linked from Q&A pages. The runbooks aren’t pages from Genymotion’s main website. They were built specifically for developer tooling sessions: step-by-step command references that an AI coding tool can fetch and present inside the developer’s working environment.

The Q&A page answers “how much does it cost?” The linked runbook answers “how do I set it up?” An AI agent like Claude Code navigates from one to the other in one session. The developer goes from evaluation to implementation without leaving their terminal.

This channel is still nascent, but we believe it will be huge.

For software companies selling to developers, this is a shift. The product evaluation and the implementation are collapsing into a single AI-mediated session within a coding environment. Your content isn’t marketing fluff read by users on a webpage. It’s specifications and instructions consumed by AI tools inside workflows.

5. The structural details matter more than we expected.

Over 90 days, we made dozens of infrastructure changes. Here are some that turned out to have direct, observable effects on crawler behavior.

Sitemaps

A monolithic sitemap listing 700 URLs gave ClaudeBot no way to assess which content was fresh or which topics to prioritize. It read the sitemap 15 times per week for months without committing to a content crawl. We used that signal to replace it with per-topic child sitemaps, each with its own lastmod date and a topic name in the URL. ClaudeBot ran a 577-request content crawl within 6 hours.
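A per-topic sitemapindex along these lines did the job. The topic slugs and dates below are illustrative, not the site’s actual files:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One child sitemap per topic cluster, each with its own lastmod -->
  <sitemap>
    <loc>https://rozz.genymotion.com/sitemaps/pricing.xml</loc>
    <lastmod>2026-03-20</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://rozz.genymotion.com/sitemaps/android-testing.xml</loc>
    <lastmod>2026-03-18</lastmod>
  </sitemap>
</sitemapindex>
```

The topic name in the child URL and the per-cluster lastmod are exactly the two signals the monolithic file lacked: what each cluster is about, and which cluster changed.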

Robots.txt

Our initial robots.txt had ~140 lines with 15 individual bot sections, all saying Allow: /. We found that some crawlers stop parsing after their first matching User-agent block, which meant they never reached the Sitemap: directives at the bottom. Collapsing to 12 lines with a single User-agent: * rule fixed this.
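The collapsed file looks roughly like this (the sitemap URL is illustrative):

```
User-agent: *
Allow: /

Sitemap: https://rozz.genymotion.com/sitemap.xml
```

With a single wildcard block, every crawler matches the same rules and reads on to the Sitemap directive instead of stopping at its own section.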

llms.txt

AI crawlers don’t reliably follow cross-domain links. When llms.txt pointed to content hosted elsewhere, crawlers didn’t follow. Inlining the key content directly in the file produced better results.
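A hedged sketch of the difference, with placeholder content and an illustrative on-site path:

```markdown
# Genymotion

> Genymotion is an Android emulator used by developers worldwide.

## Pricing

Instead of a cross-domain link that crawlers may not follow, the key
answer text is inlined here directly, with an on-site link for depth:
[Full pricing Q&A](/qa/pricing)
```

The principle: anything you need the crawler to see must be in the file itself or one same-domain hop away.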

Page size

Inline CSS was adding ~9KB per page that provided no value to AI agents consuming the content. Externalizing stylesheets moved actual content earlier in the page, which matters for agents that process HTML with token budgets.

Featured content ranking

The AI site’s index page originally showed featured Q&As in arbitrary database order. Reranking by actual retrieval count (which Q&As ChatGPT-User fetches most often) aligned the index with what users actually ask about.
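A minimal sketch of that reranking, with made-up paths and hit lists standing in for the real log data:

```python
from collections import Counter

def rank_featured(pages, retrieved_paths, top_n=5):
    """Order featured pages by descending retrieval count from access logs.

    pages: candidate page paths (previously shown in database order).
    retrieved_paths: one entry per citation-bot fetch, from the logs.
    """
    counts = Counter(retrieved_paths)
    # Pages the bots never fetched sort last (count 0).
    return sorted(pages, key=lambda p: counts.get(p, 0), reverse=True)[:top_n]
```

The index then surfaces what users actually ask about, rather than whatever the database returned first.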

Topic taxonomy

The brand keyword “Genymotion” appeared on 56% of all pages, causing the clustering algorithm to put 93% of Q&As into a single mega-topic. Filtering high-prevalence brand keywords and switching to a dynamic topic count broke this into 25 specific topics. When we later renamed those topics from feature labels (“Android Testing”) to user-intent phrases (“Test Android Apps at Scale”), GPTBot re-indexed 148 pages within two days.
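The keyword-filtering step can be sketched as follows. The data structures are assumptions for illustration; the series doesn’t publish the actual pipeline:

```python
def filter_high_prevalence(page_keywords, max_share=0.5):
    """Drop keywords that appear on more than max_share of pages.

    page_keywords: one set of keywords per page.
    Returns (filtered page keyword sets, the banned keywords).
    A brand term present on most pages ("Genymotion" was on 56% here)
    would otherwise pull everything into one mega-cluster.
    """
    n_pages = len(page_keywords)
    prevalence = {}
    for kws in page_keywords:
        for kw in kws:
            prevalence[kw] = prevalence.get(kw, 0) + 1
    banned = {kw for kw, c in prevalence.items() if c / n_pages > max_share}
    return [kws - banned for kws in page_keywords], banned
```

With the ubiquitous term gone, the remaining keywords are what actually differentiate pages, which is what lets a dynamic topic count find 25 specific topics instead of one.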

None of these changes involved writing new content. They were structural fixes: how the content is organized, presented, and discovered. Each one had a measurable effect on how crawlers behaved.

6. AI coding tools are a distinct retrieval channel.

Of the 24 Claude-User requests we recorded, 22 came from Claude Code (user-agent: claude-code/2.1.83-84). Two came from claude.ai web search (Claude-User/1.0). These are distinct channels with distinct behavior.

Claude Code sessions are developer-specific: pricing questions followed by CLI runbooks, 70-minute product evaluations, index-to-pricing-to-cloud-marketplace navigation. We’re focusing on growing these sessions because they are particularly valuable for software companies: giving a developer a seamless, high-quality experience right in their terminal can considerably influence a purchase decision.

7. Human visitors can be routed back to the source with attribution.

The AI site serves structured content to bots. But humans occasionally land on it too by following a link from an AI response or clicking through from a search result. In April, we deployed a routing system that redirects human visitors to the corresponding page on Genymotion’s main website, with UTM source attribution indicating which AI platform sent them (chatgpt, claude, perplexity).

Bots continue to receive the GEO-optimized content. Humans get sent to the product’s actual website where they can sign up, purchase, or contact sales. The AI site becomes a measurable attribution channel: you can track how many website visitors arrived via AI-mediated discovery.
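A rough sketch of the routing decision. The referrer-to-platform mapping and the main-site URL pattern are assumptions for illustration; the series doesn’t publish the actual implementation:

```python
from urllib.parse import urlencode

# Hosts we might see in the Referer header of an AI-mediated click-through.
AI_REFERRERS = {
    "chatgpt.com": "chatgpt",
    "claude.ai": "claude",
    "perplexity.ai": "perplexity",
}

# Any of these in the User-Agent means the request is a bot, not a human.
BOT_MARKERS = ("ChatGPT-User", "Claude-User", "Perplexity-User",
               "GPTBot", "ClaudeBot", "PerplexityBot")

def route(user_agent: str, referrer: str, path: str):
    """Return ('serve', path) for bots, ('redirect', url) for humans."""
    if any(marker in user_agent for marker in BOT_MARKERS):
        return ("serve", path)          # bots keep the GEO-optimized page
    platform = next((p for host, p in AI_REFERRERS.items() if host in referrer),
                    "unknown")
    query = urlencode({"utm_source": platform, "utm_medium": "ai-referral"})
    return ("redirect", f"https://www.genymotion.com{path}?{query}")
```

The utm_source parameter is what makes AI-mediated discovery show up as a distinct, countable channel in ordinary web analytics.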

Where each platform stands

Platform | Crawl status | Citation status | Weekly citation volume | Trend
ChatGPT | Mature | 83% citation rate | ~1,100/week | Steady state
Claude | Indexed, monitoring | Live retrieval (Claude Code) | ~10/week | Early, growing
Perplexity | Indexed, maintenance | First retrieval observed | 1 request | Just started
Gemini | Not crawling AI site | Answers from training data + Google Search | — | Separate strategy

What comes next

Three things we’re watching:

Perplexity-User growth. PerplexityBot indexed 511 pages in March. Perplexity-User just appeared. If the ChatGPT pattern holds (deep indexing → exponential citation growth over 3–4 weeks), Perplexity citation volume should increase through April.

Claude-User trajectory. 24 requests in two weeks. ChatGPT-User went from 42 in January to 1,200/week by March. Will Claude-User follow the same curve, or does the Claude Code developer audience stay smaller but higher intent?

Human redirect attribution. The browser redirect system went live in April. The first data on which AI platforms drive actual website visits—and which pages they land on—will tell us whether AI-mediated discovery converts to product signups.

Three AI platforms went from zero to active citation on the same content, the same markup, the same topic taxonomy. We’re iterating. Stay tuned.

Get this for your site

These are the learnings from one client, over 90 days, across dozens of infrastructure iterations. Rozz builds this for every client.

Structured Q&A pages from your chatbot. Per-topic sitemaps. Schema.org markup. CLI runbooks for developer tools. Featured content ranked by actual retrieval data. The infrastructure that turns AI crawlers into citation channels.

$997/month | ChatGPT at 83%. Three platforms citing. The data is in the articles.

Book a call  |  See how it works  |  rozz@rozz.site


Data source: CloudFront access logs for rozz.genymotion.com, January 8 – April 8, 2026 (90 days). Bot classification based on User-Agent strings. Cumulative totals across all 10 weekly entries in The Crawler Logs series.