LLM Hallucinations in the Wild: Large-Scale Evidence from Non-Existent Citations

Source: arXiv:2605.07723
License: CC BY 4.0
Subjects: Digital Libraries (cs.DL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Physics and Society (physics.soc-ph)


TL;DR

This paper systematically studies citation hallucinations in LLMs: instances where a language model fabricates references to academic papers that do not exist. Drawing on large-scale real-world outputs, the authors document the prevalence, patterns, and distribution of non-existent citations across domains. The work bridges AI reliability research, bibliometrics, and computational social science.


Background

Citation hallucination is a well-known failure mode of LLMs: when asked to provide supporting references, a model may generate plausible-looking but entirely fabricated citations, complete with author names, journal titles, and DOIs that do not resolve. It is one instance of the broader hallucination problem, but it is especially damaging in academic and professional settings, where verifiable sources are essential.
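
As a rough illustration of how such a fabricated reference might be flagged, the sketch below checks a citation's DOI and title against a bibliographic index. It is not the authors' method: `KNOWN_WORKS`, `classify_citation`, and the sample entries are hypothetical, the in-memory dictionary stands in for a real lookup service, and the DOI pattern is only a common syntactic heuristic.

```python
import re

# Common syntactic heuristic for DOIs; a match does not prove the DOI resolves.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Hypothetical in-memory index (lowercase DOI -> title). The entry is made up.
KNOWN_WORKS = {
    "10.1234/demo.0001": "A Placeholder Study of Widgets",
}


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def classify_citation(doi, title, index=KNOWN_WORKS):
    """Return 'verified', 'mismatch', 'unverified', or 'malformed_doi'."""
    indexed_titles = {normalize_title(t) for t in index.values()}
    title_known = normalize_title(title) in indexed_titles

    if doi:
        if not DOI_PATTERN.match(doi):
            return "malformed_doi"
        known_title = index.get(doi.lower())
        if known_title is None:
            # Unknown DOI: a known title with a wrong DOI is still suspicious.
            return "mismatch" if title_known else "unverified"
        same = normalize_title(known_title) == normalize_title(title)
        return "verified" if same else "mismatch"

    # No DOI supplied: fall back to the title alone.
    return "verified" if title_known else "unverified"


if __name__ == "__main__":
    print(classify_citation("10.1234/demo.0001", "A placeholder study of widgets."))
    print(classify_citation("10.9999/fake.42", "Imaginary Results in Nowhere"))
```

An "unverified" label only means the citation was not found in the index used; with an incomplete index, real but obscure works would be flagged too.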

Research Questions

The study investigates:

  1. How frequently do LLMs generate non-existent citations in real-world use?
  2. What patterns do these fabricated citations follow?
  3. Are certain domains or citation formats more prone to hallucination?
  4. Can the scale of the problem be systematically measured?
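
Questions 1 and 3 come down to computing a rate, overall and per domain. The following is a minimal sketch, assuming each extracted citation has already been labeled as existing or not; the function name, the record layout, and the toy data are hypothetical and are not results from the paper.

```python
from collections import Counter


def nonexistent_rate_by_domain(records):
    """Map each domain to the fraction of its citations labeled non-existent.

    `records` is an iterable of (domain, exists) pairs, where `exists` is
    True when the cited work was found in a bibliographic index.
    """
    totals = Counter()
    missing = Counter()
    for domain, exists in records:
        totals[domain] += 1
        if not exists:
            missing[domain] += 1
    return {domain: missing[domain] / totals[domain] for domain in totals}


if __name__ == "__main__":
    # Invented toy labels, for illustration only.
    toy_records = [
        ("physics", True),
        ("physics", False),
        ("law", False),
        ("law", True),
        ("law", True),
        ("law", True),
    ]
    for domain, rate in nonexistent_rate_by_domain(toy_records).items():
        print(f"{domain}: {rate:.2f}")
```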

Significance

This work sits at the intersection of:

  - AI safety/reliability: Quantifying a concrete failure mode of deployed LLMs
  - Digital libraries: Understanding the impact of AI-generated content on the scholarly record
  - Computational social science: Using large-scale text analysis to study model behaviour
  - Science of science: The potential contamination of citation networks by AI-generated fabrications


Key Takeaways

  1. Citation hallucination is measurable at scale: the paper provides one of the largest empirical datasets on the phenomenon.
  2. The problem spans domains: it is not limited to niche or technical fields.
  3. Scholarly integrity is at stake: as LLM-generated content proliferates, fabricated citations risk contaminating the academic record.
  4. Open licensing (CC BY 4.0) enables further research and mitigation efforts.