Risk Assessment & AI Detection

Understanding assessment vulnerabilities and the limitations of detection-based approaches.

Risk Assessment

Assessments in which the creation of an artefact is the main or only task typically pose a higher level of risk, as do assessments with little oversight, such as unsupervised online assessments. At the opposite end of the scale, fully supervised or proctored exams carry a lower level of risk; however, depending on context and use, this approach can constrain authentic assessment design and contribute to an environment of mistrust rather than a culture of integrity.

Risk Assessment Table

Essays and Written Assignments (HIGH risk)
Risk posed by GenAI: AI can generate high-quality written content that may not be easily detected as non-original.
Mitigation steps:
  • Require multiple drafts and incorporate peer reviews.
  • Employ oral defences or follow-up questions to verify understanding.

Unsupervised Open-Book or Remote Exams (HIGH risk)
Risk posed by GenAI: Students might use AI to complete their exams, leading to misrepresentation of their own knowledge.
Mitigation steps:
  • Implement time constraints that limit the ability to use AI.
  • Use a combination of unsupervised open-book and in-person assessments.
  • Randomise questions and personalise them to individual students.

Online Quizzes (MEDIUM risk)
Risk posed by GenAI: AI can assist in answering questions, especially multiple-choice ones, if they are available online.
Mitigation steps:
  • Use question banks to randomise questions.
  • Employ proctoring software that monitors students.
  • Incorporate questions that require critical thinking and personalised responses.

Research Papers (MEDIUM risk)
Risk posed by GenAI: AI can generate or heavily assist in creating research papers, making it hard to detect authentic student work.
Mitigation steps:
  • Require detailed methodology sections and data analysis.
  • Conduct oral presentations of research findings.
  • Require pre-final drafts and incorporate peer reviews.

Lab Reports (MEDIUM risk)
Risk posed by GenAI: AI can help generate content for lab reports, including data interpretation and discussion sections.
Mitigation steps:
  • Require students to submit raw data and detailed lab notes.
  • Incorporate in-lab assessments and practical exams.
  • Conduct regular checks and comparisons with past student work.

Creative Work (MEDIUM risk)
Risk posed by GenAI: AI can help produce content for many creative disciplines, including music, graphic design, visual art, and poetry.
Mitigation steps:
  • Require submission of notes and drafts or sketches.
  • Use oral presentations and Q&A sessions to verify an individualised approach.
  • Conduct comparisons with past student work.

Problem Sets (LOW risk)
Risk posed by GenAI: While AI can solve problems, students still need to understand the process and concepts.
Mitigation steps:
  • Include a mix of automated and hand-written problem-solving.
  • Regularly update problem sets.
  • Use oral exams to verify understanding.

Group Projects (LOW risk)
Risk posed by GenAI: AI can assist in parts of the project, but collaboration and presentation skills are difficult to fake.
Mitigation steps:
  • Assess individual contributions through peer evaluations.
  • Incorporate regular check-ins and progress reports.
  • Require live presentations and Q&A sessions.

Oral Presentations (LOW risk)
Risk posed by GenAI: AI cannot assist directly during live presentations, but it can aid in preparation.
Mitigation steps:
  • Focus assessment on delivery, understanding, and ability to answer questions.
  • Use varied formats such as impromptu topics or interactive Q&A.
  • Require submission of notes and drafts.

In addition to the mitigation steps above, a variety of assessment media, such as journals, e-portfolios, vlogs, or blogs, can be used to accompany larger pieces of work or as stand-alone assessments. Activities involving critical thinking, decision-making, and reflection are more difficult for GenAI to simulate.

AI Detection: Limits, Risks, and Appropriate Use

A range of AI detection tools have been developed in response to the increasing availability of GenAI. However, current evidence indicates that these tools produce both false positives and false negatives and cannot reliably determine authorship or intent. As a result, they are not suitable for use as primary or definitive evidence of academic misconduct.

There are also significant equity and fairness concerns associated with AI detection. Research suggests that such tools may disproportionately affect multilingual writers and students whose writing does not align with dominant linguistic norms.

Appropriate Use of Detection Tools

Within this framework, AI detection tools are understood as having, at most, a limited and contextual role.

They may be used as:

  • Conversation starters to prompt discussion with students
  • One indicator among many when reviewing assessment submissions
  • A means of supporting educational dialogue rather than enforcing punitive measures

They should not be used as:

  • Sole or decisive evidence in academic misconduct cases
  • A substitute for robust assessment design
  • A proxy for evaluating learning or understanding

Effective academic integrity practice in an AI-enhanced environment depends primarily on structural assessment redesign. Assessments that make learning processes visible — through staged tasks, reflective elements, oral components, and evidence of decision-making — reduce reliance on detection technologies and strengthen the validity of assessment judgements.

Postgraduate Programmes

While postgraduate work is often more closely supervised, the approaches above may be adapted and applied where appropriate.

  • Oral examinations and presentations with Q&A sessions can be scheduled at pinch points throughout the research journey
  • The requirement of a reflective journal documenting the thought process is valuable both from a learning perspective and to support integrity
  • Active learning such as collaboration with peers through group project work can lower the risk of AI misuse
  • Working with industry for problem solving and co-creating is a robust approach
  • Facilitating a workshop on GenAI and research at an early stage could contribute towards creating a culture of transparency and integrity
  • These approaches require a greater time investment, particularly for large cohorts, and this should be factored into resourcing decisions