Hallucinated URLs: Frustrating AI bug, or free content gap research?
Hallucinated URLs are a nuisance. They point users to your site and dump them on a 404.
But they aren’t always useless. Enough 404s directly from LLMs can signal user intent. That’s keyword research hiding in plain sight, a trend discovered by our Head of SEO, Anastasia Kotsiubynska.
Here’s what she found while analyzing traffic from ChatGPT:
- ChatGPT can drive traffic to fake URLs that look real.
- Some of those pages get repeated visits or even backlinks.
- You can either redirect them to relevant pages or treat them as ideas for new content.
Her second point is the most revealing. If some fake pages are getting repeated visits or backlinks, they’re no longer just noise. They’re a map of what users expect to find.
Why AI invents plausible but hallucinated URLs
A recent OpenAI research paper explains why models hallucinate.
The short take? They’re incentivized to make educated guesses.
Standard training and evaluation procedures reward guessing over admitting uncertainty. As OpenAI puts it: “Strategically guessing when uncertain improves accuracy but increases errors and hallucinations.”
Think of it like a multiple choice test. Guessing can boost your score, while leaving blanks guarantees zero. That’s why ‘careful’ models look worse in traditional benchmarks, even if they make fewer confident mistakes.
OpenAI’s own scoreboard shows the trade-off. GPT-5-thinking-mini abstains over half the time, but its error rate is dramatically lower than the older o4-mini’s. Its accuracy score, though, looks worse.
But that’s the point. Smarter models hold back more, but when they do make something up, it tends to look plausible. So when ChatGPT invents a URL, it’s probably not random but a content gap opportunity, especially if it’s getting repeat visits and backlinks.
How big of an issue are hallucinated links exactly?
Honestly? Pretty small, but they’re still worth paying attention to.
Ahrefs found that most AI assistants send clicks to broken links more often than Google does:
| Source | % of Clicked URLs → 404 | How It Compares |
| --- | --- | --- |
| ChatGPT | 1.01% | Higher than others, but still tiny. |
| Claude | 0.58% | Just over half a percent. |
| Copilot | 0.34% | About 1 in 300 clicks. |
| Perplexity | 0.31% | Similar to Copilot. |
| Gemini | 0.21% | One-fifth of a percent. |
| Mistral | 0.12% | Negligible. |
| Google (baseline) | 0.15% | Lower than every assistant except Mistral. |
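As a quick sanity check on the table, here is a minimal sketch that averages the six assistants’ rates and compares each one to the Google baseline. The unweighted mean is an assumption on our part; a click-weighted average would need traffic volumes that the table doesn’t include.

```python
# 404 rates (percent of clicked URLs) copied from the table above.
ai_rates = {
    "ChatGPT": 1.01,
    "Claude": 0.58,
    "Copilot": 0.34,
    "Perplexity": 0.31,
    "Gemini": 0.21,
    "Mistral": 0.12,
}
google_rate = 0.15

# Unweighted mean across assistants (assumption: every assistant counts equally).
mean_rate = sum(ai_rates.values()) / len(ai_rates)

# How many times more often each assistant hits a 404 than Google does.
ratios = {name: rate / google_rate for name, rate in ai_rates.items()}

print(f"Mean AI 404 rate: {mean_rate:.2f}%")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the Google baseline")
```

Anything below 1.0x in the output is an assistant that beats the Google baseline.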
ChatGPT is the biggest offender. That 1% 404 rate looks tiny until you scale it across OpenAI’s 180M users. That’s potentially millions of phantom page visits every month. To be clear, that’s spread across the web, not concentrated on any one site.
But even if only 10% of those links are plausible, that’s a non-trivial demand signal.
ChatGPT also got a major update to its search functionality. It’s more factual, detects shopping intent better, and formats answers for quicker understanding without sacrificing depth or quality.
That could mean fewer hallucinated URLs over time, but when they do surface, they’re even more valuable.
The bottom line? Think of this as early prep for where search is headed. Sure, all models together average a minuscule 0.43% 404 rate, and that will probably shrink as they get better. But the fewer hallucinations there are, the more those guesses matter. Pay attention to them. Tracking model quirks now is how you stay ahead later.
That’s exactly what our Head of SEO is doing.
Turning hallucinated URLs into a playbook
In GA4, Anastasia found 70 fake URLs over three months.
Twenty had backlinks. Most got 1–3 visits, but a handful drew 20+ sessions. Her takeaway was pragmatic: redirect anything with repeat traffic, because that’s user intent showing through.
Here’s her playbook:
- 1–2 sessions, no backlinks → Ignore (probably noise, but not always).
- 3+ sessions OR repeat referrers → Redirect to a relevant live page.
- Backlinks present → Redirect + monitor for repeat links.
- Clear content gap → Create a lightweight page, track GSC impressions, and test demand.
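If you’d rather apply the playbook to an export than by eye, the rules above can be sketched as a small triage function. This is only an illustration: the `Phantom404` record, its field names, and the order in which the rules are checked are all assumptions, since the playbook doesn’t say which rule wins when several apply. The `content_gap` flag is an editorial call you set by hand.

```python
from dataclasses import dataclass


@dataclass
class Phantom404:
    """One hallucinated URL and its signals (hypothetical record shape)."""
    path: str
    sessions: int
    repeat_referrers: bool = False
    backlinks: int = 0
    content_gap: bool = False  # editorial judgment, not something GA4 reports


def triage(url: Phantom404) -> str:
    """Map a hallucinated URL to a playbook action.

    Assumed priority: content gap, then backlinks, then repeat traffic.
    """
    if url.content_gap:
        return "create lightweight page and track GSC impressions"
    if url.backlinks > 0:
        return "redirect and monitor for repeat links"
    if url.sessions >= 3 or url.repeat_referrers:
        return "redirect to a relevant live page"
    return "ignore"


# Made-up example rows, for illustration only.
examples = [
    Phantom404("/blog/example-a", sessions=1),
    Phantom404("/blog/example-b", sessions=4),
    Phantom404("/blog/example-c", sessions=2, backlinks=1),
    Phantom404("/blog/example-d", sessions=2, content_gap=True),
]
for row in examples:
    print(row.path, "->", triage(row))
```

Reorder the checks if your priorities differ, for example if you’d rather redirect a backlinked URL before deciding whether it deserves its own page.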
Don’t bank on our data alone. ChatGPT’s tiny 404 rate means the difference between one visit and three visits isn’t statistically solid. A one-off hit might be noise, but it might not. What you uncover may look very different from what we did, both in volume and in value.
At the very least, use low-frequency hallucinations as page ideas. Don’t dismiss them outright unless they’re completely irrelevant.
The better approach is to treat every interesting hallucination as a hypothesis. Log it, group it with similar URLs, and watch for repetition or clustering around certain structures.
If the pattern repeats, that’s intent worth chasing. If it doesn’t, archive and move on.
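One rough way to do the grouping is to bucket logged 404 paths by their first path segment and keep only the buckets where more than one distinct URL shows up. The sketch below assumes your log is a plain list of URL strings. The function names, the `min_size` threshold, and the sample paths are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Drop the query string and trailing slash, and lowercase the path."""
    return urlsplit(url).path.lower().rstrip("/")


def section_of(url: str) -> str:
    """Return the first path segment, e.g. '/blog' for '/blog/some-post'."""
    segments = [s for s in normalize(url).split("/") if s]
    return "/" + segments[0] if segments else "/"


def cluster(urls: list[str], min_size: int = 2) -> dict[str, list[str]]:
    """Group distinct 404 paths by section and keep groups of at least min_size."""
    groups: dict[str, set[str]] = defaultdict(set)
    for url in urls:
        groups[section_of(url)].add(normalize(url))
    return {
        section: sorted(paths)
        for section, paths in groups.items()
        if len(paths) >= min_size
    }


# Made-up log entries, for illustration only.
logged = [
    "/blog/ai-seo-checklist",
    "/blog/ai-seo-checklist/",
    "/blog/llm-keyword-research?utm_source=chatgpt.com",
    "/tools/free-audit",
]
print(cluster(logged))
```

Here the two distinct `/blog` paths form a cluster, while the lone `/tools` path is filtered out until something similar appears.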
So is it really worth it?
The numbers are small, so the payoff only makes sense if the fixes are low-effort:
- Redirects
- Quick test pages
- Routine monitoring in GA4 or GSC
But referral traffic from AI, whether through hallucinated or real links, is growing. If fake links keep showing up, if they earn backlinks, or if they generate repeat sessions, that’s worth a response. Treat them as an early warning system for gaps in your content.
So, not a scalable new traffic channel, but a side stream that can feed into your existing SEO work.