AI conference’s papers contaminated by AI hallucinations
Published: 2026-01-22T21:52:37+00:00

Summary

GPTZero’s Hallucination Check flagged 100 fabricated citations across 51 papers accepted to NeurIPS. The company says this follows earlier discoveries of fabricated references in ICLR submissions and points to a wider problem: the rapid adoption of generative AI has coincided with rising submission volumes and a growing number of substantive errors in machine-learning papers. The issues reported include invented authors, made-up sources and passages likely produced by AI.

Key Points

  • GPTZero detected 100 hallucinated citations in 51 NeurIPS papers.
  • Earlier checks found 50 fabricated citations in ICLR submissions.
  • NeurIPS submissions rose more than 220% from 2020 to 2025, stretching reviewer capacity.
  • A December 2025 preprint found objective mistakes per paper increased (NeurIPS average rose from 3.8 in 2021 to 5.9 in 2025).
  • Hallucinations include invented citations, fictitious authors and AI-authored text.
  • Anti-forensic tools (for example Humanizer) can make AI writing harder to detect.
  • Publishers and organisations like STM are urging updated policies to protect research integrity.

Content summary

GPTZero published results showing scores of fabricated citations in papers accepted to a major ML conference. The story links this to a surge in submissions and the widespread use of generative AI tools, which can invent plausible but non-existent references. Researchers cite a preprint analysis showing a steady increase in objective mistakes across top venues. The piece also notes parallels in legal filings, highlights anti-detection techniques, and points to industry efforts — including reports from STM and commentary from Retraction Watch — urging publishers to adapt peer review and editorial policies. NeurIPS had not responded at the time of reporting.

Context and relevance

This is important for anyone who reads, cites or builds on ML research. The combination of skyrocketing submission volumes and easy access to generative AI risks degrading the reliability of the scholarly record, harming reproducibility and damaging reputations. It sits at the intersection of trends in AI tooling, peer-review capacity, and evolving publisher policy.

Author style

Punchy: Consider this a red flag. Researchers, editors and funders need to take the details seriously — sloppy AI use can invalidate work and spread bad science.

Why should I read this?

Short and informal: AI is making up sources and some researchers are using that output. We read the mess so you don’t have to — but if you rely on ML papers for work, policy or research, dig into the full story.

Source

Source: https://go.theregister.com/feed/www.theregister.com/2026/01/22/neurips_papers_contaiminated_ai_hallucinations/