## Overview
Recent developments in GenAI point towards increasingly agentic systems — AI tools that can plan, act, iterate, and make decisions across multiple steps with limited human input. Unlike earlier generative tools that respond to discrete prompts, agentic AI systems can autonomously break tasks into sub-goals, retrieve and synthesise information, generate outputs, revise those outputs, and execute workflows on a user's behalf.
These developments raise new considerations for assessment design. As AI systems become more capable of managing extended cognitive processes, the boundary between assisted work and outsourced thinking becomes increasingly blurred.
## Emerging Challenges
Agentic AI introduces a number of challenges that extend beyond those associated with earlier generative tools:
- **Diminished visibility of student thinking** — where planning, decision-making, and iteration are performed by the AI rather than the learner
- **Further erosion of traditional authorship assumptions** — particularly in assessments that involve extended projects, research, or problem-solving workflows
- **Increased difficulty distinguishing contribution** — where students may oversee or curate AI-driven processes without engaging meaningfully with the underlying learning
- **Acceleration of task completion** — potentially undermining assessment designs that rely on time, effort, or procedural complexity as proxies for learning
These challenges reinforce the limitations of detection-based approaches and highlight the need for assessment designs that prioritise human judgement, intentionality, and reflection.
## Assessment Design Responses
Rather than attempting to prohibit or police agentic AI use, effective mitigation lies in structural assessment redesign that clarifies where human learning must be demonstrated.
### Emphasising Human Decision-Making and Justification
Assessment tasks should require students to explain, justify, and critique decisions made throughout the learning process, including decisions about whether and how AI tools were used. This shifts assessment from output generation to evaluation, judgement, and accountability.
### Designing for Process Transparency
Multi-stage, process-oriented assessments — including planning documents, annotated drafts, reflective commentaries, and oral explanations — make it more difficult for agentic systems to fully substitute for student engagement.
### Integrating Critical Reflection on AI Agency
Where AI tools are permitted, assessments can explicitly ask students to reflect on:

- What tasks were delegated to AI
- What limitations or errors were identified
- Where human judgement overrode AI-generated suggestions
### Prioritising Tasks Requiring Situated Human Context
Assessments that draw on lived experience, disciplinary interpretation, ethical reasoning, professional judgement, or contextual constraints remain less amenable to full automation.
### Maintaining Deliberate AI-Restricted Assessment Spaces
As agentic capabilities increase, it becomes increasingly important to retain some assessment contexts where independent human performance is required — to verify foundational knowledge, disciplinary understanding, and professional competence without AI mediation.
## Implications for AI Literacy
The emergence of agentic AI reinforces the need to broaden definitions of AI literacy. Beyond evaluating outputs, students must also learn to:
- Recognise when AI systems are assuming decision-making roles
- Understand the risks of cognitive offloading and over-reliance
- Reflect on responsibility, agency, and authorship in AI-mediated work
- Appreciate the continued value of human judgement, creativity, and ethical reasoning
Assessment therefore plays a crucial role in helping students develop discernment, not just technical proficiency.
## GenAI vs. Agentic AI
The following table compares earlier GenAI with emerging agentic AI and their implications for assessment design.
| Earlier GenAI | Emerging Agentic AI | Assessment Design Implication |
|---|---|---|
| Responds to single prompts | Plans, sequences, and executes multi-step tasks | Assessment must reveal decision-making, not just outputs |
| Produces discrete artefacts (text, code, images) | Manages extended workflows and iterations | Greater emphasis on process transparency |
| Supports idea generation or drafting | Delegates planning, synthesis, and revision | Risk of outsourced thinking increases |
| Learner initiates and directs each step | AI can act semi-autonomously | Need to foreground human agency and judgement |
| AI use visible in isolated moments | AI involvement may be embedded and opaque | Require explicit reflection on AI role and use |
| Authorship already blurred | Authorship further destabilised | Clarify expectations around responsibility and accountability |
| Time/effort still loosely correlated with learning | Tasks completed rapidly with minimal engagement | Avoid using effort or complexity as proxies for learning |
| Detection tools already unreliable | Detection becomes increasingly ineffective | Shift fully toward structural redesign |
| AI literacy focused on evaluating outputs | AI literacy must include recognising AI agency | Assessment should develop discernment, not dependence |
## Looking Ahead
Agentic AI represents a further shift in the educational landscape, but it does not invalidate the principles of good assessment. On the contrary, it reinforces the importance of validity, transparency, process visibility, and alignment with learning outcomes.
By designing assessments that make human thinking explicit, require reflection on AI use, and prioritise judgement over generation, educators can respond proactively to emerging AI capabilities while maintaining academic standards and educational purpose.