The Economics Network


Designing a Group Essay Assignment in the Age of AI

Context

I teach an undergraduate module in economic history with a large group essay component (60% of the grade). This became a challenge with the rapid adoption of Generative AI (from here on referred to as “AI”). It was a third-year undergraduate course with over 70 students. For the group essay, students formed groups of 3-4 of their own choosing and wrote 3,000 words by the end of the semester. They could choose any question in economic history, allowing me to cater to a diverse set of interests. Since students wrote their essays at home, there was a clear concern that the use of AI would undermine the credibility of the assessment. When I first taught this in the Spring of 2025, it had been less than three years since the release of ChatGPT in November 2022. I was caught in a moment of transition, with inconsistent and fragmented guidance within Manchester and the profession (Chaudhury et al., 2025; Cortinhas et al., 2025). My colleagues and students were therefore still in the process of learning this technology, yet the course lacked any guidance on how to incorporate it.

There were clear challenges to continuing the group essay. In my first year of teaching the course, I found that the average essay had implausibly good writing but lacked engagement with specific evidence related to the students' chosen topic. I also found hallucinated references in some essays, which proved the use of AI. Through my own experimentation, I found that a lack of intellectual depth on specific topics is a common feature of AI output. Fortunately, it remains difficult both to plagiarize undetected with AI and to use it to write anything beyond a poor essay, which gave me room for maneuver.

Nonetheless, I was keen to teach a course that required research-based active learning, given the benefits of learning through the use and invention of concepts rather than memorization (Bain, 2021). This is especially important in economics departments, where research-based long-form essays constitute a shrinking share of assignments (Watts & Schaur, 2011; Harter et al., 2021). A further benefit was teaching students AI skills that are now in demand among highly skilled professionals (Chatterji et al., 2025), including economists (Nassehi et al., 2025). However, I received some pushback from colleagues who believed take-home assignments were redundant due to AI.

[T]he most powerful teaching moments came from a tutorial where we analyzed the answer given by AI to a question based on the assigned readings.

Designing the Guidance

The new central concept behind the essay assignment was to embrace AI. I would adapt the guidance to help students use AI to enhance learning while safeguarding them against misuse. Since surveys show students continue to use AI even when it is prohibited, this seemed the most prudent approach (Bharadwaj et al., 2023). Ideally, students would learn the most essential employability skill, the “meta-competence of learning to learn with GenAI” identified by a recent study of economists (Nassehi et al., 2025). Instead of remaining passive about AI usage, I strongly encouraged students to use it. The expectation was that the average essay quality would exceed that of previous cohorts.

In the first lecture, I introduced the group essay and framed AI as a tool that must be used to enhance learning. To give guidance on ethical usage, I introduced the FEAL principles, following the recommendation by Slade et al. (2025): AI should be used only if it is Faster than working without AI, Ethical (i.e. not plagiaristic), Accurate (i.e. the AI's work is checked), and improves Learning. More intuitively, I explained that students need to “be in control” when using AI, and that the final essay must reflect their own critical thinking, not that of AI. If they were not in control, it was plagiarism.

I used these principles rather than the University of Manchester's policies in the AI hub (Transparency, Accountability, Competence, Responsible Use, and Respect), because words such as “competence” and “respect” lack operational clarity, a problem seen across a wide range of universities (Chaudhury et al., 2025). Nonetheless, I made sure that my guidance was in line with that of the university.

I followed up with a class discussion in which students shared recommendations on prompts with each other. I strongly encouraged two specific uses of AI. First, students should use AI for essay feedback in addition to peer-to-peer feedback, as recommended by Sperber et al. (2025). Second, they should use AI to check spelling and grammar. This was especially suitable for those with relatively low proficiency in English. Since many students were international students, some of whom were vocal about their concerns over essay writing, this also made the course more welcoming and inclusive.

Finally, the most powerful teaching moments came from a tutorial where we analyzed an AI-generated answer to a question based on the assigned readings. As a group, we identified errors and superficial elements in the essay. Armed with specialist knowledge from their readings, the students quickly realized that while the prose sounded competent, it lacked depth and historical evidence. I then asked the students to grade the essay. Most gave it a grade well below 60, which often translated into a “satisfactory” or “fail” grade. This had a profound effect on students, who discovered the limits of AI.

Redesigning the Assignment

Beyond the guidance, I also changed the structure of the assignment to better guide and monitor students. The new structure was based on four checkpoints within the semester at which I gave group-level feedback. Since the checkpoints were time intensive, I reduced the course's student cap after consulting the head of teaching.

The first checkpoint was in the third week of lectures, when each group booked a 20-minute meeting with the instructor to discuss potential group essay topics. This forced students to communicate their ideas verbally, which is difficult for those who rely solely on AI. It also helped me flag potential concerns about AI misuse. Further, it allowed me to steer students towards specific historical topics suitable for a 3,000-word essay. The specificity of the question also made it more difficult for AI alone to provide a good answer.

The second checkpoint was the research proposal, submitted in the fifth week of the semester. This was a low-stakes, pass/fail component (10% of the grade) that served two purposes. First, as a summative assessment, it forced students to begin collaborating well before the final deadline. This reduced the temptation to turn to AI at the last minute, once it was too late to begin writing the essay. Second, it was an opportunity to guide students towards high-quality literature, supplementing knowledge from AI, which is general but not specific.

The third checkpoint was a required “response letter” to the instructor's feedback on the research proposal, submitted with the final essay. The feedback was generally designed to be about specific historical evidence, which AI is not currently equipped to respond to effectively. The response letter therefore ensured that students engaged with the feedback and conducted their own research.

The fourth checkpoint was a short AI use statement submitted with the final essay. The purpose was not to catch plagiarism, since students rarely incriminate themselves. Rather, it served two purposes. First, it provided a moment of reflection for students to consider how they interacted with AI and the ethics of their actions. Second, it encouraged students to be transparent about their use of AI, a habit that will be important in the workplace.

The group essay format had one major risk: plagiarism by one group member could lead to a failing grade for the entire group. I therefore added two safeguards. First, students had an internal deadline (two weeks before the final deadline) to circulate their individual contributions to the group essay. This allowed students to practice a key principle of AI use: the human check of all work, including that done by others. Second, after the assignment was completed, group members anonymously evaluated each other's contributions using an online tool called Buddycheck. If a group member was judged to have done more or less than others, this was reflected in a higher or lower grade. Although this was initially designed to discourage free riding in general, it could also be applied to detect and punish plagiaristic use of AI among peers.

Conclusions

Compared to the previous cohort, the essays from this cohort were generally of higher quality in both content and writing, fulfilling one aim of integrating AI into education. Further, AI alone could not have written these essays due to the specificity of the topics. Nonetheless, a future concern remains that AI will improve to the point that another redesign of the assessment is needed.

References

Bain, K. (2021). Super Courses: The Future of Teaching and Learning. Princeton University Press. https://doi.org/10.1353/book.127038

Bharadwaj, P., Shaw, C., NeJame, L., Martin, S., Janson, N., & Fox, K. (2023). Time for class 2023: Bridging student and faculty perspectives on digital learning. Tyton Partners. https://tytonpartners.com/app/uploads/2023/06/Time-for-Class-2023-Report_Final.pdf

Chaudhury, P., Mele, A., Cortinhas, C., Hawkes, D., Jenkins, C., Nassehi, R., Dal Bianco, S., & Paredes Fuentes, S. (2025). Academic integrity in a GenAI world: A comparative review of university policies in the UK (CTaLE Working Paper 3). Centre for Teaching and Learning Economics (CTaLE). https://ctale.org/working-paper-3-academic-integrity-in-a-genai-world/

Chatterji, A., Cunningham, T., Deming, D. J., Hitzig, Z., Ong, C., Shan, C. Y., & Wadman, K. (2025). How people use ChatGPT (NBER Working Paper No. 34255). National Bureau of Economic Research. https://doi.org/10.3386/w34255

Cortinhas, C., Chaudhury, P., Hawkes, D., Jenkins, C., Nassehi, R., Paredes Fuentes, S., Dal Bianco, S., & Mele, A. (2025). How are economists adapting assessments for a GenAI world? (CTaLE Working Paper 2). Centre for Teaching and Learning Economics (CTaLE). https://ctale.org/working-paper-2-how-are-economists-adapting-assessments-in-a-genai-world/

Harter, C., Chambers, R. G., & Asarta, C. J. (2021). Assessing learning in college economics: A sixth national quinquennial survey. Eastern Economic Journal, 48(2), 251. https://doi.org/10.1057/s41302-021-00205-8

Nassehi, R., Jenkins, C., Dal Bianco, S., Mele, A., Hawkes, D., Paredes Fuentes, S., Cortinhas, C., & Chaudhury, P. (2025). How economists are using GenAI at work: The new employability skills (CTaLE Working Paper 1). Centre for Teaching and Learning Economics (CTaLE). https://ctale.org/working-paper-1-how-economists-are-using-ai-at-work/

Slade, J. J., Byers, S. M., Becker-Blease, K. A., & Gurung, R. A. (2025). Navigating the new frontier: Recommendations to address the crisis and potential of AI in the classroom. Teaching of Psychology, 52(3), 254-261. https://doi.org/10.1177/00986283241276098

Sperber, L., MacArthur, M., Minnillo, S., Stillman, N., & Whithaus, C. (2025). Peer and AI Review+ Reflection (PAIRR): A human-centered approach to formative assessment. Computers and Composition, 76, 102921. http://doi.org/10.2139/ssrn.5066838

Watts, M., & Schaur, G. (2011). Teaching and assessment methods in undergraduate economics: A fourth national quinquennial survey. The Journal of Economic Education, 42(3), 294-309. https://doi.org/10.1080/00220485.2011.581956
