Can You Prove a Source Doesn’t Exist? Librarians Are Trying

‘Wild Goose Chase’: When AI Cites Books That Were Never Written

The420 Web Desk
6 Min Read

As artificial intelligence systems flood academia with plausible-sounding but nonexistent sources, librarians and researchers are spending growing amounts of time chasing records that were never real, an unintended consequence of automation reshaping how knowledge is produced and verified.

A Rising Tide of Plausible Falsehoods

In reading rooms and at reference desks across universities, a new kind of research query has become familiar: a request for a journal article, archival document, or historical record that sounds authoritative, is formatted correctly, and does not exist.

Librarians describe these requests as increasingly routine, often arriving with confidence and detailed citations. The sources are usually attributed to respected publishers or archives, complete with publication years and catalogue numbers. Yet when searched, they vanish. No database yields them. No archive holds them. Researchers say the problem is not that the information is obscure. It is that it is fabricated.

The volume of such requests, librarians say, has grown alongside the rapid adoption of large language models in academic work. These systems can generate fluent prose and convincing references at speed, but they lack a core research skill: the ability to confirm that a record exists at all.

“Because of the amount of slop being produced,” one researcher wrote on the social media platform Bluesky, “finding records that you know exist but can’t necessarily easily find without searching has made finding real records that much harder.”

When Hallucinations Enter the Scholarly Record

Artificial intelligence “hallucinations,” the confident generation of false information, are now a well-documented phenomenon. What is newer is how deeply they are seeping into academic pipelines.

Researchers across disciplines have been caught submitting papers with AI-generated citations. Some of these references point to journals that sound credible but were never published; others cite articles with plausible titles and authors that were never written. In some cases, entire bibliographies have been quietly invented.

The irony is particularly stark within the field of artificial intelligence itself. As scholars race to publish on machine learning, some are producing dozens, or even more than a hundred, papers a year. Critics say this pace encourages reliance on automated tools that can draft text and references quickly, but not accurately.

The result, librarians argue, is a feedback loop: fabricated sources circulate, are repeated, and then become harder to distinguish from authentic but obscure material. Genuine scholarship risks being buried beneath what one librarian described as “statistically convincing nonsense.”

Librarians on the Front Lines

For librarians, the shift has transformed routine reference work into a process of disproving ghosts.

“At least 15 percent of the emailed reference questions we receive are now clearly AI-generated,” said Sarah Falls, chief of researcher engagement at the Library of Virginia.

Many of them, she said, include hallucinated primary source documents and published works.

“We’re being asked to hunt down records that never existed.”

The difficulty lies not only in failing to find a source, but in proving its absence.

“For our staff, it is much harder to prove that a unique record doesn’t exist,” Ms. Falls said.

In traditional research, uncertainty often signals the need for deeper digging. With AI-generated citations, uncertainty may simply mean the source was invented. Other librarians report similar experiences. One scholarly communications librarian wrote online that after searching unsuccessfully for several cited works for a student, the student eventually admitted the list came from Google’s AI-generated summary.

“As a librarian who works with researchers,” another added, “I can confirm this is true.”

Warnings From Archivists and Aid Groups

The risks of fabricated references extend beyond academia. The International Committee of the Red Cross, which maintains extensive archives used by historians, lawyers, and humanitarian workers, has issued warnings about the use of AI chatbots for archival research.

“These systems cannot indicate that no information exists,” the organization said in a statement. “Instead, they will invent details that appear plausible but have no basis in the archival record.”

Because such systems generate responses by identifying patterns in data rather than verifying sources, they may fabricate catalogue numbers, descriptions of documents, or references to platforms that were never real. In high-stakes contexts, such as legal proceedings, humanitarian investigations, or historical accountability, such errors can have serious consequences.

Meanwhile, technology companies continue to market increasingly powerful “reasoning” models aimed at researchers. OpenAI, for example, has promoted tools designed to conduct “deep research” at the level of a professional analyst. While the company has said these systems hallucinate less frequently than earlier versions, it has also acknowledged persistent difficulty distinguishing authoritative information from rumor and signaling uncertainty clearly.

For librarians and scholars, the concern is less about intent than about scale. As AI-generated content multiplies, authentic human-written sources, already competing for attention in an environment shaped by speed and volume, risk being drowned out. And the labor of sorting truth from fiction, once assumed to be automated away, is increasingly falling back onto human experts tasked with defending the boundaries of the record itself.
