Back in 1998, Australian oncologist Jennifer Byrne was among the first to clone a cancer gene that is associated with breast cancer and the type of leukemia most common in children. Two years ago, when Byrne came across mentions of the gene, called TPD52L2, in five papers from separate authors, something didn’t seem quite right.
In one paper, for example, researchers said they had inactivated the gene to observe the effects on cancer cells. Because she knew the gene very well, Byrne, now at the University of Sydney, soon realized the researchers had used the wrong sequence. The consequences of such papers could be dire, Byrne told US blog Retraction Watch in January, because cancer studies like these are often the start of expensive research pipelines that aim to find better treatments for patients. Two of those five papers have since been retracted, while, according to Nature, another two were expected to be retracted in November. Digging further, Byrne and computer scientist Cyril Labbé found 48 very similar papers, all from Chinese researchers, and 63% of them had used the wrong DNA sequences.
Now, Byrne and Labbé, of France’s University of Grenoble Alpes, have invented a tool to help do the kind of sleuthing they did manually, and weed out problematic cancer research. So far, Byrne told Quartz, the tool has helped them find 60 papers with incorrect sequences, prompting them to make it available for public use in October. It’s still in a testing phase.
Researchers can upload their papers to the platform, called Seek&Blastn, which works by comparing the gene sequences in a paper to those recorded in a widely used scientific database to spot errors. The platform gets its name from that database, known as the Nucleotide Basic Local Alignment Search Tool, or Blastn for short, in use since 1990 for biological sequence matching. “If the true identity of the sequence doesn’t match its stated identity, it can’t have been used correctly,” said Byrne via email.
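The check Byrne describes can be illustrated with a minimal Python sketch. It is not Seek&Blastn's actual code: the function names, the `TOY_REFERENCE` table, and the gene labels are all hypothetical, and a small dictionary stands in for a real Blastn query. The sketch pulls nucleotide-like strings out of a passage of text, looks up what each one actually matches, and compares that with the gene the paper claims to target.

```python
import re

# Assumption: a "sequence" is any run of 15 or more uppercase nucleotide letters.
SEQ_PATTERN = re.compile(r"[ACGTU]{15,}")

# Hypothetical stand-in for a Blastn lookup: sequence -> gene it truly matches.
TOY_REFERENCE = {
    "GATTACAGATTACAGATTACA": "GENE_A",
    "CCGGAATTCCGGAATTCCGGA": "GENE_B",
}


def extract_sequences(text):
    """Return every nucleotide-like string found in the text."""
    return SEQ_PATTERN.findall(text)


def flag_mismatches(text, claimed_gene, lookup=TOY_REFERENCE):
    """Compare each sequence's true identity with the gene the paper claims.

    Returns a list of (sequence, true_identity, status) tuples, where status
    is "consistent", "mismatch", or "no match" (sequence not in the lookup).
    """
    report = []
    for seq in extract_sequences(text):
        actual = lookup.get(seq)
        if actual is None:
            status = "no match"
        elif actual == claimed_gene:
            status = "consistent"
        else:
            status = "mismatch"
        report.append((seq, actual, status))
    return report


if __name__ == "__main__":
    passage = (
        "Cells were transfected with shRNA targeting GENE_A "
        "(5'-CCGGAATTCCGGAATTCCGGA-3')."
    )
    for row in flag_mismatches(passage, claimed_gene="GENE_A"):
        print(row)
```

In this toy passage the paper claims to target GENE_A, but the sequence it reports belongs to GENE_B, so the row is flagged as a mismatch. A real check would still need a person to confirm which gene the paper actually claims, which is the manual step Labbé mentions.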
But the platform is no quick fix. It can only be applied to human gene sequences. It also requires a certain amount of biomedical expertise and further manual checking, especially if the descriptions of the gene sequences in the papers being analyzed aren’t clear, Labbé said.
Labbé told Quartz that in the public phase the platform has so far examined about 300 papers, mostly cancer-related. The researchers say they hope that the tool can be used to detect incorrect studies in other research areas, but extending the analysis to non-cancer papers is difficult right now due to limited resources, such as access to experts in related fields, Byrne said.
Scientific publications are important foundations for subsequent treatments, but researchers in medicine often face pressure to publish even as peer review remains a long process. It’s not unusual for scientists to spend years on research before publishing. And with more studies being published every year, it’s often challenging for publishing groups to catch every mistake. In 2015, Labbé led a team that developed software called SciDetect to spot fraudulent papers for publishing group Springer, after the publisher found it had accepted 18 articles generated by an automated tool that had been built to test publishing practices.
In a high-profile retraction case this April, over 100 studies in Tumor Biology, published by Springer Nature from 2012 to 2016, were found to have problems with peer review, an important step in scientific publishing to determine the credibility and significance of work.