How to track whether AI engines are citing your page
Updated June 12, 2026 · 9 min read
To track whether AI engines cite your page, combine four free methods: run a fixed list of prompts in ChatGPT, Perplexity, and Google AI Mode and log when your URL appears; check analytics for referral traffic from those domains; scan server logs for AI crawler visits; and confirm those bots can actually read the answer on your page.
Why "track AI citations" is different from tracking rankings
Traditional rank tracking answers one question: where does my page sit on a results page for a keyword? AI citation tracking answers a harder one: when an answer engine writes a paragraph in response to a question, does it quote or link my page as a source? The two are related but not the same. You can rank on page one of Google and still never appear in an AI Overview, and you can be cited in Perplexity for a question you don't rank for at all.
There is no single dashboard that authoritatively reports every AI citation, because the major engines do not publish citation data the way Google Search Console publishes clicks and impressions. Answers are also personalized and probabilistic — the same prompt can return different sources on different days, for different users, in different regions. That means tracking is about building a repeatable sampling process, not pulling one perfect number.
The good news: you do not need a paid monitoring tool to get a useful signal. Four free methods, used together, tell you whether your page is being seen, fetched, and quoted. Each method covers a blind spot in the others.
Method 1: Build a prompt panel and test it on a schedule
The most direct way to know if an engine cites you is to ask it the questions your page is meant to answer, then check whether your URL shows up as a linked source. This is sometimes called prompt testing or building a prompt panel.
Start by writing 10–20 prompts a real prospect would type — not your brand name, but the problems and questions your page targets. For a page about, say, returns policies, prompts might include "how long do I have to return an online order" or "best practices for an ecommerce return policy." Phrase them as natural questions, since that is how people use answer engines.
Run each prompt in the engines that matter to you. As of 2026 the main ones to check are ChatGPT (which surfaces web links with browsing/search enabled), Perplexity (which lists numbered sources under most answers), and Google's AI Overviews / AI Mode (which show linked source cards). For each run, record the date, the engine, the exact prompt, whether your domain appeared, and which competitor domains were cited instead.
- Keep prompts fixed so results are comparable from one run to the next; changing the wording resets your baseline
- Test in an incognito/logged-out window where possible to reduce personalization skew
- Run the panel on a fixed cadence (for example, the first business day of each month) rather than ad hoc
- Save the full source list, not just yes/no; knowing who gets cited instead tells you what to fix
- Note that answers vary by run; treat a single "not cited" result as a sample, not a verdict
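One way to keep the panel log consistent is to record every run in the same shape and compute a per-engine citation rate from it. The sketch below is only an illustration: the `PanelRun` fields, the engine labels, and the example URLs are assumptions, and the source lists still have to be copied in by hand after each manual test.

```python
import csv
import io
from collections import defaultdict
from dataclasses import asdict, dataclass, fields
from urllib.parse import urlparse


@dataclass
class PanelRun:
    run_date: str    # ISO date of the test run
    engine: str      # hypothetical label, e.g. "perplexity"
    prompt: str      # exact prompt text, kept fixed between runs
    cited_urls: str  # every source URL shown, separated by spaces


def domain_cited(run: PanelRun, domain: str) -> bool:
    """True if any cited URL is on `domain` or one of its subdomains."""
    for url in run.cited_urls.split():
        host = (urlparse(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def citation_rate(runs, domain):
    """Share of runs, per engine, in which `domain` was cited."""
    totals, hits = defaultdict(int), defaultdict(int)
    for run in runs:
        totals[run.engine] += 1
        hits[run.engine] += domain_cited(run, domain)
    return {engine: hits[engine] / totals[engine] for engine in totals}


def to_csv(runs) -> str:
    """Serialize runs so they can be pasted into a spreadsheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(PanelRun)])
    writer.writeheader()
    for run in runs:
        writer.writerow(asdict(run))
    return buf.getvalue()


# Made-up sample data for illustration only.
runs = [
    PanelRun("2026-06-01", "perplexity",
             "how long do I have to return an online order",
             "https://example.com/returns https://rival.example.org/policy"),
    PanelRun("2026-06-01", "chatgpt",
             "how long do I have to return an online order",
             "https://rival.example.org/policy"),
]
print(citation_rate(runs, "example.com"))
```

Because the full source list is stored rather than a yes/no flag, the same log can later be re-scored for a competitor's domain without re-running any prompts.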
Method 2: Watch your analytics for AI referral traffic
When someone clicks a source link inside an AI answer, that visit usually arrives at your site with a referrer from the engine's domain. Your existing analytics already captures this — you just have to know where to look.
In your analytics tool, open the referral or traffic-source report and filter for the AI engines' domains. Common referrers to look for include chatgpt.com and chat.openai.com, perplexity.ai, gemini.google.com, and copilot.microsoft.com. Some Google AI Overview clicks are attributed to ordinary google.com organic traffic rather than a distinct source, so this method undercounts Google specifically — which is exactly why you also run the prompt panel.
Referral traffic is a lagging, behavioral signal: it confirms not only that you were cited but that the citation was visible and compelling enough to earn a click. Track it as a trend. A page that starts receiving steady visits from perplexity.ai is being cited and read, even if you never catch it in a manual prompt test.
- Build a saved segment or filter for AI-engine referrers so you can check it in seconds
- Look at landing pages within that segment; they tell you which of your URLs are being cited
- Expect low absolute numbers today; direction over months matters more than raw volume
- Cross-reference spikes with content you recently published or updated
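If your analytics tool can export raw referrer and landing-page pairs, the saved segment can also be approximated in a few lines of code. This is a rough sketch: the domain list comes from the referrers named above, while the engine labels, function names, and sample visits are assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Referrer domains named in this article, mapped to a display label.
AI_REFERRER_DOMAINS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Copilot",
}


def classify_referrer(referrer: str):
    """Return the engine label for a referrer URL, or None if it is not an AI engine."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, engine in AI_REFERRER_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return engine
    return None


def landing_pages_by_engine(visits):
    """Count visits per (engine, landing page) from (referrer, landing_page) pairs."""
    counts = Counter()
    for referrer, landing_page in visits:
        engine = classify_referrer(referrer)
        if engine:
            counts[(engine, landing_page)] += 1
    return counts


# Made-up export rows for illustration only.
visits = [
    ("https://www.perplexity.ai/search?q=return+policy", "/returns"),
    ("https://chatgpt.com/", "/returns"),
    ("https://www.google.com/", "/returns"),
    ("", "/pricing"),
]
print(landing_pages_by_engine(visits))
```

The `google.com` visit is left unclassified on purpose: as noted above, AI Overview clicks often arrive as ordinary Google traffic, so a referrer filter cannot separate them out.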
Method 3: Read your server logs for AI crawler hits
Before an engine can cite your page, its crawler usually has to fetch it. Your raw server access logs (or your CDN's log export) record every one of these requests by user-agent, giving you a leading signal that runs ahead of citations and referral clicks.
Several AI companies publish the user-agent strings their bots use. Well-documented examples include OpenAI's GPTBot (training crawl) and OAI-SearchBot (its search crawler), ChatGPT-User (the agent used when ChatGPT fetches a page live in a session), and PerplexityBot. Grep your logs for these strings to see which AI bots are visiting, how often, and which URLs they request. Google-Extended is different: it is a robots.txt control token that Google honors for AI training, not a separate crawler, so it will not show up as a user-agent in your logs.
If you see GPTBot or PerplexityBot fetching a page, you know it is at least discoverable to that engine. If a page you care about never appears in crawler logs, that is a concrete, fixable problem — the engine cannot cite what it has not fetched.
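The log check can be scripted once you know your log format. The sketch below assumes the common "combined" access log layout (request line, status, size, referrer, user-agent, each in the usual position); the sample lines and their simplified user-agent strings are invented for illustration and are not the exact strings any vendor sends.

```python
import re
from collections import Counter

# Bot tokens named in this article; extend the tuple as vendors publish more.
AI_BOT_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User", "PerplexityBot")

# Assumes combined log format: "METHOD path proto" status size "referrer" "user-agent"
LOG_LINE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def ai_bot_hits(lines):
    """Count requests per (bot token, URL path) for known AI crawler tokens."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        user_agent = match.group("ua").lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in user_agent:
                hits[(token, match.group("path"))] += 1
                break
    return hits


# Invented sample lines for illustration only.
SAMPLE_LINES = [
    '203.0.113.5 - - [01/Jun/2026:10:00:00 +0000] "GET /returns HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.9 - - [01/Jun/2026:10:05:00 +0000] "GET /returns HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [01/Jun/2026:10:06:00 +0000] "GET /pricing HTTP/1.1" 200 812 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0) Firefox/126.0"',
]
print(ai_bot_hits(SAMPLE_LINES))
```

User-agent strings can be spoofed, so treat these counts as a signal rather than proof; a target URL that never shows up in the result is the gap worth investigating.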
- Filter access logs by user-agent for GPTBot, OAI-SearchBot, ChatGPT-User, PerplexityBot, and similar strings
- Check which specific URLs get crawled; gaps reveal pages AI engines aren't reaching
- Confirm your robots.txt isn't blocking the bots you want (an accidental disallow is common)
- No log access? Many hosts and CDNs offer a log export or a bot-traffic report you can filter
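The robots.txt check in the list above can be approximated with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the robots.txt content and the URL are made up, and Python's parser applies its own simple matching rules, so a real crawler may interpret an unusual file differently.

```python
from urllib.robotparser import RobotFileParser


def blocked_bots(robots_txt: str, url: str, bots):
    """Return the bots from `bots` that this robots.txt disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt with an accidental site-wide block for one bot.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_bots(
    SAMPLE_ROBOTS,
    "https://example.com/returns",
    ["GPTBot", "OAI-SearchBot", "PerplexityBot"],
))
```

To test a live site, the same parser can load the real file with `set_url(...)` followed by `read()` instead of `parse(...)`.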
Method 4: Confirm the bot can actually render and quote your page
Crawl access is necessary but not sufficient. An engine also has to be able to read the actual answer on your page. If your key content only appears after client-side JavaScript runs, or it is buried in an image, a slow-loading widget, or a region the crawler skips, you can be crawled and still never be the source of a quotable sentence.
You can sanity-check this yourself for free. Fetch your page the way a simple bot would — view the raw HTML (in most browsers, "View Source" rather than the rendered DOM) and confirm your headline answer, key facts, and structured data are present in that raw markup. If the answer is missing from the source, a citation-hungry engine may never find the exact phrasing it needs to quote.
This is the most common reason a technically "indexed" page never gets cited: the quotable answer is not in a clean, server-rendered, plainly worded block near the top. Pages that lead with a direct, self-contained answer — the same structure that earns featured snippets — give engines an easy passage to lift.
- View raw page source and confirm your core answer text is present without JavaScript
- Make sure one self-contained paragraph fully answers the page's main question near the top
- Check that any FAQ, definitions, or stats you want quoted are in text, not images
- A free AI-readiness grader can flag render and structure gaps faster than manual inspection
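The "view source" check can be automated in a rough way: strip the tags from the raw HTML, ignore anything inside scripts and styles, and look for the sentence you want quoted. The sketch below uses only the standard library; the class name, function name, and sample pages are assumptions, and it does not run JavaScript, which is the point of the test.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text from raw HTML, skipping script, style, and template content."""

    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def answer_in_raw_html(html: str, answer: str) -> bool:
    """True if `answer` appears in the page text without executing JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse whitespace and compare case-insensitively.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return " ".join(answer.split()).lower() in text


# Two made-up pages: one server-rendered, one that builds the answer in JavaScript.
SERVER_RENDERED = (
    "<html><body><h1>Returns</h1>"
    "<p>You have 30 days to return an online order.</p></body></html>"
)
JS_ONLY = (
    '<html><body><div id="app"></div>'
    '<script>render("You have 30 days to return an online order.")</script>'
    "</body></html>"
)

print(answer_in_raw_html(SERVER_RENDERED, "You have 30 days to return"))
print(answer_in_raw_html(JS_ONLY, "You have 30 days to return"))
```

For a live page, the HTML string could come from `urllib.request.urlopen(url).read().decode()`. Matching works best on a short, distinctive phrase, since text split across inline tags may pick up extra spaces.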
Turn the four signals into one simple tracking sheet
Individually each method has a blind spot. The prompt panel is manual and varies run to run. Referral traffic undercounts Google. Crawler logs prove fetching, not quoting. Render checks prove quotability, not actual citation. Together they form a funnel you can monitor: can the bot fetch the page (logs) → can it read the answer (render check) → does it cite the page (prompt panel) → did a human click through (referral traffic).
Keep it in a single spreadsheet with one row per target page and columns for each signal, refreshed on a fixed monthly cadence. Over a few months the pattern becomes obvious: a page that is crawled, renders cleanly, and starts showing up in prompt tests and referral data is winning; a page stuck at "crawled but never cited" usually has a structure or clarity problem you can fix.
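The funnel logic behind each spreadsheet row can be written down explicitly, which helps keep the status labels consistent from month to month. This is a minimal sketch: the `PageSignals` fields and the stage labels are assumptions, and it treats any AI referral visit as evidence of a citation even when the prompt panel missed it.

```python
from dataclasses import dataclass


@dataclass
class PageSignals:
    url: str
    crawled: bool          # seen in AI crawler logs
    answer_in_html: bool   # core answer present in raw HTML
    cited_in_panel: bool   # appeared in at least one prompt test this month
    ai_referrals: int      # visits from AI-engine referrers this month


def funnel_stage(page: PageSignals) -> str:
    """Return the furthest funnel stage this page has reached."""
    if not page.crawled:
        return "not crawled"
    if not page.answer_in_html:
        return "crawled, answer not readable"
    if page.ai_referrals > 0:
        return "cited and clicked"
    if page.cited_in_panel:
        return "cited, no clicks yet"
    return "crawled but never cited"


# Made-up rows for illustration only.
pages = [
    PageSignals("/returns", True, True, True, 14),
    PageSignals("/shipping", True, True, False, 0),
    PageSignals("/sizing", False, False, False, 0),
]
for page in pages:
    print(page.url, "->", funnel_stage(page))
```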
This stays free because every input is something you already own — your analytics, your server logs, your browser, and a few minutes in each answer engine. The only ongoing cost is the discipline to run the same checks the same way each month so the trend is trustworthy.
What to fix when a page is crawled but never cited
The most actionable state in your tracking sheet is "crawled, renders, but never cited." The engine can reach and read the page, yet keeps quoting someone else. That almost always points to the answer itself rather than to a technical defect.
Common fixes: lead with one self-contained paragraph that fully answers the page's core question in plain language; break the page into clear question-style headings an engine can map to specific sub-questions; add genuine specifics (definitions, steps, comparisons) rather than vague marketing copy; and add structured data such as FAQ or article markup so the meaning is machine-explicit. Then re-run your prompt panel the following month and watch whether the citation status changes.
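For the structured-data fix, FAQ markup is usually expressed as schema.org `FAQPage` JSON-LD. The sketch below generates that JSON from question and answer pairs; the function name and the sample question are assumptions, and the output still needs to be placed inside a `<script type="application/ld+json">` tag in the page's HTML.

```python
import json


def faq_jsonld(pairs) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # Escape "</" so answer text can never close the surrounding script tag early.
    return json.dumps(data, indent=2).replace("</", "<\\/")


# Made-up FAQ entry for illustration only.
print(faq_jsonld([
    ("How long do I have to return an online order?",
     "You have 30 days from delivery to start a return."),
]))
```

The marked-up questions and answers should match text that is visible on the page, so the markup describes content an engine can also read directly.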
Treat each cited competitor as a free brief. If Perplexity keeps citing a rival page for your target prompt, open it and note what makes it quotable — a crisp definition, a numbered list, a direct answer up top — and make your page match or beat it on clarity.
Frequently asked questions
- Is there a free tool that tells me exactly when AI cites my page?
- No single free tool reports every AI citation, because ChatGPT, Perplexity, and Google AI Overviews don't publish citation data like Search Console publishes clicks. The reliable free approach is to combine manual prompt testing, AI-referral traffic in your analytics, server-log checks for AI crawlers, and a raw-HTML render check into one monthly tracking sheet.
- How do I check AI referral traffic in my analytics?
- Open your traffic-source or referral report and filter for AI-engine domains such as chatgpt.com, chat.openai.com, perplexity.ai, gemini.google.com, and copilot.microsoft.com. Save it as a segment and review which landing pages get those visits. Note that many Google AI Overview clicks are attributed to regular google.com traffic, so this method undercounts Google.
- How can I tell if AI crawlers are visiting my site?
- Search your server access logs or CDN log export for AI bot user-agents like GPTBot, OAI-SearchBot, ChatGPT-User, and PerplexityBot. Seeing these strings confirms the engine is fetching specific URLs. If a page you care about never appears, check that robots.txt isn't blocking the crawler — an engine can't cite a page it never fetched.
- Why does the same prompt cite my page one day and not the next?
- AI answers are probabilistic and often personalized by account, location, and time, so sources vary between runs. That's why you treat a single result as a sample, not a verdict, and run a fixed prompt panel on a schedule. The trend across repeated runs is far more reliable than any one answer.
- My page is crawled but never cited. What's wrong?
- Usually the quotable answer isn't easy to lift. Check that your core answer appears in raw HTML (not only after JavaScript), lead with one self-contained paragraph answering the main question, use clear question-style headings, and add FAQ or article schema. Then re-run your prompt panel the next month to see if citation status changes.
- How often should I run these checks?
- A fixed monthly cadence works for most sites: run your prompt panel, review AI-referral traffic, and scan crawler logs on the same day each month so results stay comparable. Run a render and structure check whenever you publish or substantially update a page you want cited, since that's the input you fully control.