LLM Hallucinations in the Wild: Large-Scale Evidence from Non-Existent Citations

Source: arXiv:2605.07723
Authors: (CC BY 4.0 licensed)
Subjects: Digital Libraries (cs.DL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Physics and Society (physics.soc-ph)

TL;DR

This paper systematically studies LLM citation hallucinations: instances where language models fabricate references to academic papers that do not exist. By examining large-scale real-world outputs, the authors document how prevalent non-existent citations are, what patterns they follow, and how they are distributed across domains. The work bridges AI reliability research, bibliometrics, and computational social science.

Background

Citation hallucination is a well-known failure mode of LLMs: when asked to provide supporting references, models may generate plausible-looking but entirely fabricated citations, complete with author names, journal titles, and DOIs that point nowhere. It is one instance of the broader hallucination problem, but it is especially damaging in academic and professional settings, where verifiable sources are essential.

Research Questions

The study investigates:

- How frequently do LLMs generate non-existent citations in real-world use?
- What patterns do these fabricated citations follow?
- Are certain domains or citation formats more prone to hallucination?
- Can the scale of the problem be systematically measured?
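
To make the measurement question concrete, here is a minimal sketch of one way an unmatched-citation rate could be estimated: normalize each cited title and look it up in an index of known works. This is an illustration only, not the authors' pipeline, which this summary does not describe. The function names (`normalize_title`, `unmatched_rate`) and all titles are hypothetical, and exact matching on normalized titles is a simplifying assumption.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^a-z0-9]+", " ", ascii_only.lower())
    return cleaned.strip()


def unmatched_rate(cited_titles, known_titles):
    """Fraction of cited titles with no exact normalized match in the index."""
    if not cited_titles:
        return 0.0
    index = {normalize_title(t) for t in known_titles}
    misses = [t for t in cited_titles if normalize_title(t) not in index]
    return len(misses) / len(cited_titles)


# Hypothetical toy data; none of these titles come from the paper.
known = ["Example Study of Graph Methods", "A Survey of Placeholder Models"]
cited = [
    "Example study of graph methods.",   # matches after normalization
    "A Survey of Placeholder Models",    # exact match
    "Imaginary Results on Toy Data",     # not in the index
    "Unverifiable Claims: A Review",     # not in the index
]

rate = unmatched_rate(cited, known)
print(f"unmatched citation rate: {rate:.2f}")
```

An unmatched title is only a candidate fabrication: typos, translated titles, or gaps in the index also produce misses, so a realistic study would likely add fuzzy matching and manual checks on top of a lookup like this.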

Significance

This work sits at the intersection of:

- AI safety/reliability: Quantifying a concrete failure mode of deployed LLMs
- Digital libraries: Understanding the impact of AI-generated content on the scholarly record
- Computational social science: Using large-scale text analysis to study model behaviour
- Science of science: The potential contamination of citation networks by AI-generated fabrications

Key Takeaways

- Citation hallucination is measurable at scale: this paper provides one of the largest empirical datasets on the phenomenon
- The problem spans domains: it is not limited to niche or technical fields
- Implications for scholarly integrity: as LLM-generated content proliferates, fabricated citations risk contaminating the academic record
- Open data and methods (CC BY 4.0) enable further research and mitigation efforts