AI tools are now allowing so-called paper mills to trick journals with fake articles on an industrial scale. The issue threatens scientific integrity and is beginning to attract high-level political attention.
US lawmakers have warned that fake research papers risk compromising trust in the entire scientific system, as artificial intelligence (AI) makes it ever easier for so-called paper mills to fool journals into accepting made-up articles.
Some estimates suggest hundreds of thousands of fake papers could exist in the human genomics literature alone. Paper mills have also managed to impersonate guest journal editors to wave through hundreds of their own fraudulent articles.
“The automation arms race is upon us,” warned Democrat congressman Bill Foster in a hearing of the US House of Representatives’ science, space and technology committee last week.
To prove his point, Foster, along with another representative, used a text generator to create a fake nuclear physics paper that easily evaded plagiarism detectors.
“The creation of hundreds of papers – complete with figures and citations – becomes the work of an afternoon, much to the disgust of real scientists who might spend months on a single paper,” said Foster.
Scientific fraud will undermine the work of honest academics, and could have “disastrous” effects if it ends up influencing policy or public behaviour, warned congresswoman Eddie Bernice Johnson, who chairs the committee.
Concern around paper mills has existed for at least a decade, but improvements in AI image and text generation have made fake paper production possible on an industrial scale. Now, with last week’s congressional hearing, there is high-level political attention focused on what this could do to the scientific system.
“This is the first time the US Congress has been interested,” Chris Graf, research integrity director at scientific publisher Springer Nature and one of those who testified at the hearing, told Science|Business.
Paper mills use AI tools to create realistic-seeming papers that are submitted to journals, normally many at a time. They then sell authorship slots on those papers to academics willing to pay. This might then earn the academic a bonus or promotion.
One Latvia-based paper mill uncovered by a publishers’ report last month claims to have sold close to 13,000 articles into real journals. But Graf said publishers often didn’t know where the mills were based because they conceal their locations.
As for the mills’ customers – academics willing to pay for a fake authorship credit – they tend to be based in Asia, Graf said. “The majority of the authors whose papers I know about […] are in China,” he said. “A proportion of them are also in India.”
Graf, who is one of the leaders of a publishers’ effort to better detect paper mills, said he is angry with the mills, not the academics themselves.
These academics – often busy clinicians – may be trapped in a system that demands they publish internationally, yet lack the time, resources or English language skills to do so honestly, he said.
In some fields, research is so specialised that journal peer reviewers simply cannot spot when a paper is fake. In human genetics, sometimes there are no expert communities for particular genes, said Jennifer Byrne, an oncology professor at the University of Sydney, another expert who testified last week.
“If the manuscript looks plausible, for example by including the expected types of images and nucleotide sequence reagents, they may accept the manuscript, even though they don’t know whether the results described represent a genuine advance in knowledge,” she said.
Byrne and colleagues recently screened around 12,000 human genomics papers, and found over 700 contained errors indicating they could originate from a paper mill. “The threat of paper mills to scientific publishing and integrity has no parallel over my 30-year scientific career,” she said in her testimony.
Another paper mill technique relies on full-scale identity fraud.
Journals often use outside guest editors to oversee niche editions, said Graf. In one case, however, a journal was contacted by a paper mill posing as a specialist guest editor, using an email address very similar to that of the academic it was impersonating.
This fake guest editor then waved through hundreds of articles from the paper mill before the journal realised what was happening and retracted the articles, Graf said. “We realised the journal had been hacked, effectively.”
Paper mills have also targeted pre-print servers. These publish articles online prior to peer review, and became increasingly important during the pandemic, when scientists couldn’t wait for publication to get their results out.
ArXiv, a popular pre-print server, has received fake papers for some time, said Steinn Sigurdsson, its scientific director. “It is a bit of a ‘Red Queen Race’ as computer technology and algorithms improve and the fakes become ‘deeper’,” he said, referring to a situation where more and more work is required just to stay in the same place.
“arXiv uses a combination of human moderators who do mostly quick screens of submissions, and a hierarchy of software tools that run both synchronously with submission and asynchronously offline to detect submissions with errors or problems, and outliers of various sorts,” he explained.
Exactly how many fake papers lurk in the scientific literature is unclear.
In the publishers’ report last month, called Paper Mills, estimates varied widely. At most journals, just 2% of submissions are from paper mills, it found. But when a paper mill finds a journal that takes its papers, it appears to bombard it with manuscripts. At some targeted journals, nearly half of submissions are from paper mills, it found.
One unnamed publisher cited in the report discovered that between 2019 and 2021, it had published more than 450 paper mill articles.
Publishers have faced criticism for inaction when problems are flagged, whether these stem from paper mill fakery or other forms of misconduct.
“Journal and institutional responses remain, on average, very ineffective and inefficient, even when evident problems have been made public,” US lawmakers were told by Brandon Stell, co-founder of the PubPeer site, which allows anonymous commentators to pick apart problems in papers.
It’s also unclear to what extent fake papers in the literature will actually steer science down blind alleys. Graf thinks that researchers are on the whole able to spot them. “They know what’s rubbish, and they know what’s trustworthy,” he said.
Still, the existence of fake paper mill articles has rattled publishers, said Graf. “Publishers care a lot about this,” he said. “It goes straight to the heart of the thing we call our value proposition.”
Last December, the International Association of Scientific, Technical and Medical Publishers set up an Integrity Hub, which publishers will use to share data on papers submitted simultaneously to several journals, a possible sign that a paper mill is behind them.
Graf predicted a spike in retractions due to increasing paper mill fraud. “Maybe we’re at the start of that spike for retractions for paper mill type work,” he said.