You write:
“figures were further examined for evidence of image duplication or manipulation by using the Adjust Color tool in Preview software on an Apple iMac computer. No additional special imaging software was used.”
This is a bias built in from the start, and it has consequences for the study as a whole. You are making comparisons by judging images as they appear on a screen. What were the settings of that screen? An experiment is supposed to state the reference conditions under which such judgements are made, yet here duplications are determined entirely by eye.
Please use other software, such as GIMP (free) or the OpenCV library (also free and open source). By the same token, you admit that you did not even zoom in to check the images!
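For illustration, here is a minimal sketch of the kind of check OpenCV makes possible. The file names and the match threshold are placeholders, and this is not the screening procedure used in the paper; it only shows that a reproducible, zoom-independent comparison is within easy reach.

# Minimal sketch: flag candidate duplication between two figure panels
# with OpenCV's ORB keypoints, which survive the rescaling, rotation and
# flipping that an unzoomed, on-screen inspection can miss.
# "panel_a.png" and "panel_b.png" are placeholder file names.
import cv2

img_a = cv2.imread("panel_a.png", cv2.IMREAD_GRAYSCALE)
img_b = cv2.imread("panel_b.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)              # keypoint detector + descriptor
kp_a, des_a = orb.detectAndCompute(img_a, None)
kp_b, des_b = orb.detectAndCompute(img_b, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des_a, des_b)
strong = [m for m in matches if m.distance < 40]  # arbitrary cut-off for this sketch

print(f"{len(strong)} strong matches out of {len(matches)} keypoint pairs")
# A dense cluster of strong matches between supposedly independent panels
# is a cue to zoom in and inspect, not proof of manipulation.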
3.8%:
This percentage seems to me entirely consistent with ordinary human error. You accuse others of making mistakes while assuming that you yourselves made none across perhaps more than 40,000 images (20,000 articles at 2 images per article, and even that is a low estimate).
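To make the comparison concrete, here is a rough back-of-envelope using the commenter's own assumptions (20,000 articles, 2 images per article, and the reported 3.8% of articles flagged); these figures are illustrative, not a re-analysis of the dataset:

\[
20{,}000 \;\text{articles} \times 2 \;\tfrac{\text{images}}{\text{article}} = 40{,}000 \;\text{images},
\qquad
0.038 \times 20{,}000 \approx 760 \;\text{flagged articles},
\]
\[
\frac{760}{40{,}000} \approx 1.9\% \;\text{of images, if each flagged article contained exactly one problematic image.}
\]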
More at this URL: https://chroniquecolibri.wordpress.com/2024/01/12/critique-du-papier-delisabeth-bik/
Apart from whether the errors in the figures have been correctly identified, the article’s conclusions rest on a highly flawed and non-random sampling design that undermines its statistical validity.
The authors claim to examine the “prevalence” of problematic images, a term that carries quantitative weight, yet their journal and article selection lacks even the most basic elements of randomization or stratification. Instead, they rely on a haphazard mixture of keyword searches (“Western blot”), convenience sampling (screening only the first 40–50 articles returned), and a massive overrepresentation of a single journal (PLOS ONE, contributing approximately 40% of the dataset). This introduces severe selection bias and renders the reported prevalence figures essentially anecdotal.
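By way of contrast, a replicable, stratified random draw is neither complicated nor costly. The sketch below uses invented journal names and article counts purely to illustrate the idea; it is not the paper's sampling frame.

# Minimal sketch of a stratified random sample of articles, as a contrast
# to keyword searches plus convenience sampling. The journals and article
# IDs below are invented placeholders.
import random

random.seed(42)  # fixed seed so the draw can be replicated

frame = {  # hypothetical sampling frame: journal -> candidate article IDs
    "Journal A": [f"A-{i}" for i in range(1, 1201)],
    "Journal B": [f"B-{i}" for i in range(1, 801)],
    "Journal C": [f"C-{i}" for i in range(1, 2001)],
}

total = sum(len(ids) for ids in frame.values())
target = 400  # overall sample size

sample = []
for journal, ids in frame.items():
    k = round(target * len(ids) / total)   # proportional allocation per journal
    sample.extend(random.sample(ids, k))   # simple random draw within the stratum
    print(f"{journal}: {k} of {len(ids)} articles")

print(f"total sampled: {len(sample)}")  # rounding can shift this by one or two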
More troubling is that this methodological limitation is not framed as a weakness in the abstract or title, but rather buried in the methods section, potentially misleading readers who assume this is a statistically rigorous study. If this were a systematic review or meta-analysis, such non-random sampling would disqualify it from publication in any reputable journal.
The absence of a clear, replicable selection framework makes the findings non-generalizable, and the use of the term “prevalence” is scientifically inappropriate.
At best, this paper serves as a qualitative call for further investigation. At worst, it spreads inflated estimates of scientific misconduct based on a methodologically compromised dataset.