A recent study at the University of Reading shows that AI-generated exam answers can evade detection by human markers and achieve higher grades than real students' work, raising concerns about academic integrity in higher education.
The study, conducted by academics at the University of Reading and published in the journal PLOS ONE, demonstrated that exam answers generated by artificial intelligence (AI) using GPT-4 can evade detection and earn higher grades than those submitted by students. The researchers generated answers to exam questions and submitted them on behalf of 33 fictitious students in the School of Psychology and Clinical Language Sciences.
Exam markers, who were unaware of the experiment, failed to detect the AI-generated answers in 94% of cases. On average, these submissions received higher grades than students' work, particularly in the first and second years of study, though they fared less well in the final year.
The discovery comes as Russell Group universities, including the University of Oxford, have agreed to allow the ethical use of AI in teaching and assessment. The findings have sparked discussion about academic integrity in higher education and the challenges posed by advanced AI tools such as ChatGPT.
Associate Professor Peter Scarfe and Professor Etienne Roesch of the University of Reading emphasized the need for the education sector to adapt and to refine its guidance as AI technology evolves. Rather than reverting to traditional in-person exams, they suggested, universities should find ways to incorporate and manage AI responsibly so as to maintain the integrity of their assessments.