Closed Loops
Mark Papers
One of the seminars I taught last term was on the theme of ‘Mindless Modernism’. We read Gertrude Stein’s Tender Buttons, along with B.F. Skinner’s essay ‘Has Gertrude Stein a Secret?’, in which he suggests that Stein didn’t write her poems at all but let them emerge mechanically from her pen, like an automaton, and Alan Turing’s ‘Computing Machinery and Intelligence’, in which he sets out the rules of the Imitation Game.
I had the feeling, sitting down to my marking after the Christmas break, that I was an unwilling participant in a version of Turing’s game. Was the work I was reading really written by students? Or had a machine had a hand in its construction?
There’s nothing new about suspicions of students cheating. Though straightforward plagiarism has been easy to detect for at least the last decade, it’s harder to catch students who submit work written or heavily edited by others – parents, peers or, at least for wealthier students, essay mills.
In 2022 the government passed a law making the use of ‘contract cheating’ services illegal, but that move now feels laughably redundant. A recent survey of students in the UK found that nearly 60 per cent have used ChatGPT to ‘help with their assignments’ – brainstorming ideas, correcting grammar or ‘assisting with essay structure’ – and 5 per cent of those surveyed admitted to submitting papers containing material generated by AI (this seems low to me).
Detecting the use of generative AI in students’ essays isn’t so much a question of identifying particular turns of phrase as of distinguishing a style. Usually, when students don’t do the reading or the thinking, there’s an associated sloppiness at the sentence level. Now I occasionally come across writing that is superficially slick and grammatically watertight, but has a weird, glassy absence behind it. There’s not much that can be done about the suspicions that such sentences provoke: despite the claims of some tech start-ups targeting universities, there’s no reliable way of proving that text is AI-generated, and it seems unlikely that there ever will be.
This uncertainty has become an ongoing discussion in department meetings. Some of my colleagues say we should embrace the bots: what if we were to ask students to write prompts to generate responses to questions using AI, and then get them to ‘fact check’ the answers to demonstrate just how readily the technology ‘hallucinates’ fake facts and faulty citations? Others call for a return to older tech: in-person exams, written by hand, or the use of vivas for undergraduates, or requiring students to submit drafts and plans of their work along the way.
For now, I take comfort in the thought that, while AI can be used to generate plausibly fluent boilerplate, nothing I’ve read written by a bot would achieve a mark much higher than a low 2:1. But ChatGPT is good at the inert language found in university mission statements, rushed-off references, funding proposals and, yes, marking feedback. With many universities keen to embrace AI to ‘automate operations and improve efficiency’ (save on teaching costs?) and ‘enhance student learning outcomes’, it isn’t hard to imagine a perfect closed loop in which AI-generated syllabuses are assigned to students who submit AI-generated work that is given AI-generated feedback.
Mark Papers is a pseudonym.
Comments
Something like this: the purpose of the work is to educate and train, so if it isn’t done, the resulting degree will give a false impression of ability. Hiring companies may then start to request evidence that the work was actually done, or at least actually tested.
I suspect that would be optimistic: the longer the feedback loop, the weaker the control, in general.
I'm not an academic, but during my Humanities PhD I helped several fellow students with their written work (including non-native English speakers, students with dyslexia, and the double whammy, a non-native English speaker with dyslexia who was an outstanding climate researcher but who couldn't get to grips with English grammar). This was at a former poly which often picked up local students who had scraped through their A-levels and wouldn't get into most universities, at least not based on their marks (as well as excellent overseas students who didn't know what "former poly" meant). Many were nonetheless intelligent and passionate about their studies, but struggled with written communication. Staff often lamented students who were stellar performers in seminars but wrote incoherent essays.
I certainly think that introducing vivas for undergrad work would be a good (though time-consuming) idea - they're widespread in the rest of Europe. But I also wonder whether it would be possible to implement the controlled use of AI by students, making it much easier for such students to focus on developing their ideas without being blocked by an inability to express themselves clearly. Of course, this will be a recursive process, because having clear ideas and being able to express ideas clearly are two different skills, but they are not completely independent. For committed students who (for whatever reason) have difficulty expressing themselves clearly, these tools could be invaluable: they can work with the AI to help them present and refine their ideas. I don't think that's problematic. They will almost certainly be required to use these tools professionally when they leave university.
The difficulty remains for essay-markers who are presented with tons of superficially coherent essays written with little or no input from the student. It seems inevitable that the essay will become less important in university assessment, but it surely can't disappear altogether. And distinguishing a clear but superficial student essay from a clear but superficial AI essay probably requires that markers are not rushing through papers at 11pm because they haven't prepared their morning lecture yet and they spent all day in meetings working out bids for research funding.
In English and Welsh universities in their current state, the closed loop described in the article sounds the most likely solution.