ScienceGuardians

Can AI replace humans? Comparing the capabilities of AI tools and human performance in a business management education scenario

Authors: Dinuka B. Herath, Egena Ode, Gayanga B. Herath
Journal: British Educational Research Journal
Publisher: Wiley
Publish date: 2025-1-2
ISSN: 0141-1926
DOI: 10.1002/berj.4111

You state that five human essays were selected using a “simple random technique” from an existing population of student work. Given that these essays were likely written for an actual assessment, to what extent did prior instructor feedback, grading, or iterative drafts influence their final form and quality, potentially creating an unfair comparison against newly generated, single-attempt AI essays?

Table 3 shows a stark discrepancy between the First Marker’s significant ANOVA results (p=0.006) and the Second Marker’s non-significant results (p=0.272). Why was a measure of inter-rater reliability (e.g., Cohen’s Kappa, ICC) not reported and addressed prior to aggregating scores or drawing conclusions? Does this large discrepancy invalidate the reliability of your grading rubric?
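To illustrate the kind of agreement check being asked for, here is a minimal sketch of Cohen's kappa for two markers who assign categorical grade bands to the same essays. The function name `cohens_kappa` and the grade bands in the usage lines are hypothetical and are not taken from the paper's data. If the markers awarded continuous scores, an ICC would likely be the more appropriate statistic.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (unweighted)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")

    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions per label.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label throughout; kappa is
        # undefined (0/0), so agreement is reported as perfect by convention.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical grade bands, for illustration only (not the study's marks).
first_marker = ["A", "A", "B", "B"]
second_marker = ["A", "A", "B", "A"]
print(cohens_kappa(first_marker, second_marker))
```

A kappa near 0 would indicate agreement no better than chance, which is the scenario the differing ANOVA results raise as a concern.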

Your study used ChatGPT 3.5 and Bard. Given the rapid iteration of AI models (e.g., GPT-4, Claude), how do you justify the generalizability of your findings to “current AI tools”? Furthermore, for the human-written essays, was any equivalent “prompt engineering” or iterative drafting process documented and controlled for?

Your results show that rewriting tools reduce AI detection scores. Did you control for the possibility that these tools also significantly alter text quality, complexity, or coherence? Is the reduced detection a function of evading AI patterns, or simply of degrading the text to a point where it becomes less identifiable as well-written content from any source?

You conclude with a strong recommendation for “veracity-based AI teaching methods.” Since your study’s design did not test this or any pedagogical intervention, on what empirical basis from your results can this specific prescription be justified, rather than being presented as a speculative, literature-based suggestion?
