
Every AI Agent Audit Your Teams Run Is Missing the Same Thing

ARC Framework governs agentic AI through a capability lens that maps 46 risks to 88 technical controls using structured implementation guidance.

May 10, 2026 · 9 min read

Source Paper

With Great Capabilities Come Great Responsibilities: Introducing the Agentic Risk & Capability Framework for Governing Agentic AI Systems

Shaun Khoo, Jessica Foo, Roy Ka-Wei Lee · GovTech Singapore, Singapore University of Technology and Design


The Governance Gap That Opens the Moment Your Agent Can Act

An AI coding agent at Replit wiped a production database. The company called it a "catastrophic failure." Nobody disputes what happened technically. What is harder to answer is the governance question: was there a structured process before deployment that would have identified that specific risk, rated its likelihood, and required a control? In most organisations running agentic systems today, the honest answer is no.

The gap is not a lack of awareness. Executives have read the EU AI Act. Their teams have reviewed the NIST AI Risk Management Framework. The problem is that those documents articulate principles without specifying mechanics. Telling a developer that their agentic system must be "safe, transparent, and accountable" does not tell them what to do when their agent has file management capabilities, internet access, and the ability to execute code against a live database. Conducting a bespoke, in-depth risk assessment for each agentic system is possible as an interim measure, but as the authors of a new framework from GovTech Singapore and the Singapore University of Technology and Design put it directly: it is unsustainable in the long run.

Shaun Khoo, Jessica Foo, and Roy Ka-Wei Lee published the Agentic Risk & Capability (ARC) Framework to address exactly this operational gap. The framework is not another principles document. It is a structured governance system with 46 named risks, 88 controls, three control tiers, and an implementation guide with two worked examples. It is also, notably, open-sourced. There are no empirical benchmark results here, and the authors are transparent about that: the framework is conceptual with worked examples, and they explicitly name empirical validation as the required next step. What it offers instead is the first governance architecture that scales with agent capability proliferation rather than collapsing under it.

Why "What Tools Does It Use?" Is the Wrong Question

Every major governance approach before ARC organises itself around one of two anchors: the components an agent is built from, or the high-level principles an organisation should follow. Both have a scaling problem.

Component-based approaches break down because the same risk can manifest through dozens of different tools. A web search tool, a browser automation tool, and a news aggregator API all enable internet access. Auditing each tool separately is not a governance system; it is an unbounded checklist. Principle-based approaches break down because they cannot be operationalised by a developer trying to decide whether their specific system needs a human-in-the-loop before executing file deletions.

ARC reorients the analysis around a third anchor: capabilities, meaning the actions an agent can autonomously execute. This shift is more significant than it appears. When a governance team asks "does this system have Internet & Search Access?" rather than "which web tools does it use?", the answer is binary, stable, and auditable regardless of how many underlying tools implement that capability. A GitHub MCP server enables dozens of distinct actions, but as an ARC capability it maps to Other Programmatic Interfaces, a single classification that carries a defined risk set.

The framework draws on Gaver's affordance theory to formalise this distinction: components and design are the affordances that make capabilities possible; the capabilities themselves are what get governed. That separation between what enables action and what action is taken is what makes the framework scalable as agentic systems evolve.
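The capability lens can be illustrated with a minimal Python sketch. The tool names and the `TOOL_TO_CAPABILITY` mapping are hypothetical and exist only for illustration; the two capability labels are the ones named in this article. The point is that many tools collapse into a small, stable set of capabilities, and the audit question becomes a yes-or-no lookup.

```python
# Hypothetical tool inventory. Tool names are invented for illustration;
# the capability labels are the ones mentioned in the article.
TOOL_TO_CAPABILITY = {
    "web_search": "Internet & Search Access",
    "browser_automation": "Internet & Search Access",
    "news_aggregator_api": "Internet & Search Access",
    "github_mcp_server": "Other Programmatic Interfaces",
}


def capabilities_of(tools):
    """Collapse a list of tools into the set of capabilities they enable."""
    unknown = [t for t in tools if t not in TOOL_TO_CAPABILITY]
    if unknown:
        # An unclassified tool is a governance gap, so fail loudly.
        raise KeyError(f"unclassified tools: {unknown}")
    return {TOOL_TO_CAPABILITY[t] for t in tools}


def has_capability(tools, capability):
    """Answer the binary audit question for one capability."""
    return capability in capabilities_of(tools)


agent_tools = ["web_search", "browser_automation", "github_mcp_server"]
print(sorted(capabilities_of(agent_tools)))
print(has_capability(agent_tools, "Internet & Search Access"))
```

Adding a fourth web tool to `agent_tools` would not change the answer to the audit question, which is the scaling property the capability lens is meant to provide.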

The authors identify three reasons this lens outperforms alternatives:

It avoids the tool proliferation trap, because many tools can implement one capability and one classification handles all of them
It enables proportionate governance, because high-capability systems receive more stringent scrutiny while low-capability systems get lighter treatment without arbitrary cutoffs
It is legible to non-technical personnel, which matters when governance teams include legal, compliance, and procurement roles who will never read an MCP server specification

The Architecture That Connects Risk to Control Without Leaving Gaps

ARC organises every agentic system through three nested layers. Understanding the structure matters because the framework's practical value depends on all three functioning together.

Layer 1: Elements. Every agentic system has components (the LLM, tools, instructions, memory), design (how agents are interconnected, what roles and access controls exist, what monitoring is in place), and capabilities (what the assembled system can do). The capability taxonomy covers thirteen types across cognitive, interaction, and operational categories. Official Communication, Business Transactions, Code Execution, and System Management carry the highest inherent risk concentrations.

Layer 2: Risks. The draft Risk Register contains 46 named risks, each of which must originate from a specific element, satisfy one of three failure modes (agent failure, external manipulation, or tool/resource malfunction), and result in at least one of nine hazard categories. Risks that cannot pass all three tests are excluded. The authors are explicit that not all element-failure mode-hazard combinations are sensible, and organisations are instructed to retain only risks supported by academic research or real-world case studies.
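The admission tests described above can be sketched as a simple validation routine. This is an illustration under stated assumptions, not the authors' tooling: the class and function names are hypothetical, the hazard label is a placeholder because the nine hazard categories are not enumerated here, and the element and failure mode assigned to RISK-034 are an assumed reading of the article's description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Element(Enum):
    COMPONENT = "component"
    DESIGN = "design"
    CAPABILITY = "capability"


class FailureMode(Enum):
    AGENT_FAILURE = "agent failure"
    EXTERNAL_MANIPULATION = "external manipulation"
    TOOL_OR_RESOURCE_MALFUNCTION = "tool/resource malfunction"


@dataclass(frozen=True)
class CandidateRisk:
    risk_id: str
    description: str
    element: Optional[Element]
    failure_mode: Optional[FailureMode]
    hazards: Tuple[str, ...]   # placeholder labels; categories not listed here
    evidence: Tuple[str, ...]  # supporting research or case studies


def admission_failures(risk):
    """Return the reasons a candidate risk would be excluded (empty if none)."""
    problems = []
    if risk.element is None:
        problems.append("no originating element")
    if risk.failure_mode is None:
        problems.append("no failure mode")
    if not risk.hazards:
        problems.append("no resulting hazard")
    if not risk.evidence:
        problems.append("no supporting research or case study")
    return problems


# Assumed classification of a risk the article names.
prompt_injection = CandidateRisk(
    "RISK-034", "prompt injection via malicious websites",
    Element.CAPABILITY, FailureMode.EXTERNAL_MANIPULATION,
    hazards=("placeholder-hazard",),
    evidence=("demonstrated real-world attacks",),
)
# Hypothetical candidate that should be rejected.
speculative = CandidateRisk(
    "CANDIDATE-X", "speculative combination with no evidence",
    Element.DESIGN, FailureMode.AGENT_FAILURE, hazards=(), evidence=(),
)

print(admission_failures(prompt_injection))
print(admission_failures(speculative))
```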

Layer 3: Controls. 88 controls are organised across three tiers. Level 0 cardinal controls are non-negotiable requirements. Level 1 standard controls should be adopted or meaningfully adapted. Level 2 best-practice controls are reserved for high-risk systems. After controls are applied, residual risk must be evaluated. If the residual is unacceptable, additional technical and non-technical measures are required before deployment. The framework is explicit that some residual risks are inherent, such as prompt injection guardrails trained on historical jailbreaks that may not generalise to novel attack vectors.
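One way to picture the three tiers is as a gap check over a system's control decisions. The sketch below is a hedged illustration: the control IDs, the `status` vocabulary, and the rule that a Level 0 control must be adopted as written are all assumptions layered on the article's description, not details taken from the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlDecision:
    control_id: str  # hypothetical identifiers
    level: int       # 0 = cardinal, 1 = standard, 2 = best practice
    status: str      # "adopted", "adapted", or "omitted"


def control_gaps(decisions, high_risk):
    """Return the IDs of controls whose status violates their tier's rule."""
    gaps = []
    for d in decisions:
        if d.level == 0 and d.status != "adopted":
            # Assumption: cardinal controls must be adopted as written.
            gaps.append(d.control_id)
        elif d.level == 1 and d.status not in ("adopted", "adapted"):
            # Standard controls may be adopted or meaningfully adapted.
            gaps.append(d.control_id)
        elif d.level == 2 and high_risk and d.status == "omitted":
            # Best-practice controls only bind high-risk systems.
            gaps.append(d.control_id)
    return gaps


decisions = [
    ControlDecision("CTRL-A", 0, "adopted"),
    ControlDecision("CTRL-B", 1, "adapted"),
    ControlDecision("CTRL-C", 2, "omitted"),
]

print(control_gaps(decisions, high_risk=False))
print(control_gaps(decisions, high_risk=True))
```

An empty gap list does not close the review. Residual risk still has to be judged after controls are applied, which is a human decision this sketch deliberately leaves out.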

ARC Layer	What It Defines	What Breaks Without It
Elements (three parts)	The auditable surface of the system: components, design, and capabilities	Risk identification has no stable anchor; different reviewers flag different things with no consistency
Risk Register (46 risks)	A validated set of materialised risks tied to specific elements and failure modes	Each deployment triggers a blank-page risk assessment that scales with team capacity, not system risk
Control tiers (three levels)	Proportionate requirements based on system risk level	Every system gets either maximal controls (unsustainable) or none (unsafe)

Where the Framework Has Not Been Tested Yet

The authors do not claim more than they have demonstrated, and that discipline is worth noting. The risk impact and likelihood ratings in the Risk Register are grounded in published research and industry case studies, not in live production data. The two worked examples, a deep research agent and a vibe coding deployment tool, illustrate how the framework applies, but they are constructed illustrations, not retrospective audits of deployed systems.

The comparison against other frameworks in Appendix A is structural rather than empirical. ARC is assessed against MAESTRO, OWASP Agentic AI Risks, Google SAIF 2.0, and Dimensional Governance on dimensions like scope, practical guidance, and control specificity. ARC compares favourably on most dimensions, but this is a design comparison, not a controlled head-to-head evaluation.

The future work statement in the paper's conclusion names two priorities: empirical validation of the Risk Register's risk and control mappings, and automated tooling to support implementation and ongoing updates. Both are significant gaps for any organisation considering ARC as a primary governance instrument. The framework provides the structure; it does not yet provide the evidence base that would let a governance team say with confidence that a given control reliably reduces a specific risk by a measurable amount.

How the Risk Register Works in Practice: Two Calibration Points

The two worked examples reveal something important about how the framework actually behaves under operational conditions.

The Researcher agent, a deep research system with three capabilities (Planning & Goal Management, Natural Language Communication, and Internet & Search Access), starts with 38 applicable risks drawn from the full register. After applying a relevance threshold of impact 3 or higher and likelihood 4 or higher on five-point scales, 10 risks remain. Those 10 risks require 17 controls. The risks that make the cut are instructive: RISK-028 (hallucinated content) scores 4/5 on impact and 5/5 on likelihood because multiple studies confirm LLMs hallucinate frequently on specialised topics. RISK-034 (prompt injection via malicious websites) also scores 4/5 and 5/5 because real-world attacks have been demonstrated without requiring system access. RISK-035 (unreliable information from websites) scores identically because LLMs have been documented presenting satirical content as factual.

The VibeCoder agent, a no-code web app deployment tool with seven capabilities including Code Execution, File and Data Management, and System Management, starts with 48 applicable risks and retains 25 after a more conservative threshold of impact 3 or higher and likelihood 3 or higher. The Replit database incident is cited explicitly in the likelihood rationale for RISK-041, the risk of overwriting or deleting database tables. The rating is 3/5 impact and 4/5 likelihood, relevant even though the staging environment limits consequences, because the failure mode has already materialised in a real product.
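The two thresholds can be expressed as a small filter. The sketch below uses the scores quoted above for RISK-028, RISK-034, RISK-035, and RISK-041; the fifth entry is a hypothetical risk added only to show where the two thresholds diverge. The class names are illustrative, and pooling risks from both examples into one list is a simplification for the demonstration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredRisk:
    risk_id: str
    impact: int      # 1 to 5
    likelihood: int  # 1 to 5


@dataclass(frozen=True)
class Threshold:
    min_impact: int
    min_likelihood: int

    def retains(self, risk):
        """A risk is retained only if it meets both minimums."""
        return (risk.impact >= self.min_impact
                and risk.likelihood >= self.min_likelihood)


RESEARCHER = Threshold(min_impact=3, min_likelihood=4)
VIBECODER = Threshold(min_impact=3, min_likelihood=3)

risks = [
    ScoredRisk("RISK-028", 4, 5),
    ScoredRisk("RISK-034", 4, 5),
    ScoredRisk("RISK-035", 4, 5),
    ScoredRisk("RISK-041", 3, 4),
    ScoredRisk("EXAMPLE-LOW-LIKELIHOOD", 4, 3),  # hypothetical, not in the register
]


def shortlist(risks, threshold):
    """Return the IDs of risks that clear the given threshold."""
    return [r.risk_id for r in risks if threshold.retains(r)]


print(shortlist(risks, RESEARCHER))
print(shortlist(risks, VIBECODER))
```

The hypothetical entry is dropped under the Researcher threshold and retained under the VibeCoder threshold, which is what "more conservative" means in practice: a lower likelihood bar pulls more risks into scope.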

The core practical lesson from both examples is that the framework's value is not exhaustiveness. It is structured triage: starting with 38 or 48 risks and arriving at a defensible, auditable shortlist before a line of governance documentation is written.

This changes what a governance team is responsible for. Instead of asking "what could possibly go wrong?", the team asks "which risks from the validated register apply to this system's specific capabilities, and do any score above our risk appetite threshold?" That is an answerable operational question. The blank-page version is not.
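The first half of that question, which risks apply to a system's capabilities, amounts to a lookup against the register. The sketch below is illustrative only. The link between each risk ID and a capability is an assumption inferred from the risk descriptions in this article, and `None` marks a risk assumed to originate from a component (the LLM) rather than a capability, so it applies to every system.

```python
# Assumed register excerpt: risk ID -> capability that triggers it.
# None means the risk is assumed to apply regardless of capabilities.
REGISTER = {
    "RISK-028": None,                         # hallucinated content
    "RISK-034": "Internet & Search Access",   # prompt injection via websites
    "RISK-035": "Internet & Search Access",   # unreliable website information
    "RISK-041": "File and Data Management",   # overwriting or deleting tables
}


def applicable_risks(capabilities):
    """Return the register entries that apply to a set of capabilities."""
    return sorted(
        risk_id
        for risk_id, capability in REGISTER.items()
        if capability is None or capability in capabilities
    )


researcher = {
    "Planning & Goal Management",
    "Natural Language Communication",
    "Internet & Search Access",
}
# Only the three VibeCoder capabilities named in the article are listed.
vibecoder_named = {
    "Code Execution",
    "File and Data Management",
    "System Management",
}

print(applicable_risks(researcher))
print(applicable_risks(vibecoder_named))
```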

What an Agentic Governance Programme Built on ARC Actually Looks Like

The framework proposes a three-step implementation structure. For an organisation moving from ad-hoc per-system reviews to a scalable programme, each step has a distinct function.

Step one is contextualising risks. Developers and system owners score each applicable risk on five-point impact and likelihood scales, with guidance on how to weight domain sensitivity, data classification, system criticality, and deployment context. The Researcher and VibeCoder examples provide direct calibration reference points. This step is where institutional knowledge enters the register: a healthcare deployment of a Researcher-type agent would almost certainly score RISK-025 (unqualified advice in specialised domains) higher than the worked example does.

Step two is establishing relevance thresholds. The organisation sets a minimum impact score and a minimum likelihood score; a risk must meet or exceed both before it requires explicit mitigation. This threshold is a direct expression of institutional risk appetite. A financial services firm will set it lower than a consumer app studio. What matters is that the threshold is explicit, documented, and consistent across systems, so that no developer can sidestep controls simply by scoring risks leniently.

Step three is scaling the programme. Individual developers work from checklists and declaration forms. A centralised governance team validates and audits those declarations. The Risk Register is updated on a defined cadence as new capabilities emerge, new attacks are documented, and existing controls are found to under-perform against novel threats. The governance team gains a portfolio view of where risk concentrations exist across the organisation's full agentic estate.
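The portfolio view can be sketched as an aggregation over developer declarations. Everything in this example is hypothetical: the system names, the shape of a declaration, and the assignment of risk IDs to systems are invented to show the mechanics, not taken from the framework.

```python
from collections import Counter

# Hypothetical declarations: system name -> risk IDs retained after triage.
declarations = {
    "system-a": ["RISK-028", "RISK-034", "RISK-035"],
    "system-b": ["RISK-028", "RISK-041"],
    "system-c": ["RISK-034"],
}


def risk_concentrations(declarations):
    """Count how many systems retain each risk."""
    counts = Counter()
    for retained in declarations.values():
        counts.update(set(retained))  # count each risk once per system
    return counts


def systems_exposed_to(declarations, risk_id):
    """List the systems that declared a given risk as relevant."""
    return sorted(
        name for name, retained in declarations.items() if risk_id in retained
    )


counts = risk_concentrations(declarations)
print(counts.most_common())
print(systems_exposed_to(declarations, "RISK-034"))
```

A view like this lets the central team see which register entries matter most across the estate when a control is found to under-perform, rather than re-reading each declaration.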

The honest constraint is that this programme requires a governance function with enough technical literacy to validate developer declarations, and enough authority to require remediation before deployment. Organisations that have not built that capability for traditional software will not build it automatically by adopting ARC. The framework is the structure; the organisational investment is separate.

Four years ago, the question executives were asking about AI was how to deploy it at scale. The question now is how to govern what is already deployed and what continues to be deployed faster than governance teams can respond. ARC does not slow that deployment down. It gives the governance function a mechanism to stay in the same conversation.
