Monday, February 25, 2008


eTBlast is the name of the software tool that helped in nailing Chiranjeevi, the most recent plagiariser / fraud to be outed. Here's a quote from the C&EN story:

ONE TOOL that Dasgupta has used to find reviewers—and that might be useful in discovering plagiarism—is a Web-based tool called eTBlast. Developed by computational biologists at the University of Texas Southwestern Medical School, the free service does a similarity search of text that someone inputs with papers in Medline or other online databases. Dasgupta and others say it could be a powerful tool for weeding out plagiarism in journal manuscript submissions.

The developers of eTBlast have now developed a duplicate submission database called Deja vu. Both are available for free.
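
For a rough sense of what a text similarity search involves, here is a minimal Python sketch. It is not eTBlast's actual algorithm, which the quote does not describe. It is a generic technique (Jaccard overlap of word shingles) that I am swapping in purely for illustration, and the function names, the tiny corpus, and the document IDs are all made up.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Text too short for a full shingle: treat it as one short shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_similarity(query, corpus, k=3):
    """Score every document against the query, most similar first."""
    q = shingles(query, k)
    scored = [(jaccard(q, shingles(text, k)), doc_id)
              for doc_id, text in corpus.items()]
    return sorted(scored, reverse=True)


# Hypothetical corpus: one near-copy of the query and one unrelated abstract.
corpus = {
    "near_copy": "This free service does a similarity search of text "
                 "against papers in Medline.",
    "unrelated": "Grain boundary diffusion in nickel alloys at high "
                 "temperature.",
}
query = ("The free service does a similarity search of text "
         "against papers in Medline.")

ranked = rank_by_similarity(query, corpus)
for score, doc_id in ranked:
    print(f"{doc_id}: {score:.2f}")
```

A real service would have to do far more than this (indexing millions of abstracts, weighting rare words, and so on), but the idea of ranking stored papers by their overlap with the submitted text is the same.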

The first time I heard about eTBlast was when Sunil wrote a detailed post about it some ten weeks ago. It looked very impressive, but I had no way of test-driving it in my field (materials science and engineering, in case you are wondering), so I never got around to writing about it. As of now, it seems to cover both biological and bio-medical fields very well. Correct me if I am wrong here, but it appears to me that its coverage of physics and related fields is not at a stage where it can be used routinely and reliably.

But what it has achieved so far has been tremendous. A study conducted using eTBlast has been published in Nature under the title "A tale of two citations." It probably requires a subscription, but Scientific American carries a summary:

A new computerized scan of the biomedical research literature has turned up tens of thousands of articles in which entire passages appear to have been lifted from other papers. Based on the study, researchers estimate that there may be as many as 200,000 duplicates among some 17 million papers in leading research database Medline.

The finding has already led one publication to retract a paper for being too similar to a prior article by another author.