Assessing AI for Criminal Justice
A User Decision Framework
March 2026
Introduction
Criminal justice agencies face urgent questions about the adoption of artificial intelligence (AI), especially concerning the usefulness and safety of existing and forthcoming tools. This framework addresses those challenges by extending AI governance principles into specific operational and ethical contexts of criminal justice practice, translating broad guidance into the detailed actions agencies and practitioners should take to navigate AI adoption responsibly.
This framework is anchored in the Principles for the Use of AI in Criminal Justice, produced by the Council on Criminal Justice Task Force on Artificial Intelligence in October 2025, and builds on those principles as follows:
- Recognizing that systems should be safe and reliable, agencies should require rigorous, independent validation rather than relying solely on vendor claims, particularly for substantial-risk systems where errors could result in wrongful detention or public safety failures.
- Procurement serves as a critical safety net: Before any system is acquired, contracts should establish enforceable performance standards, data rights, fairness requirements, auditability provisions, and termination rights, and should ensure confidentiality and security in the handling of sensitive criminal justice data.
- To make AI effective and helpful, multidisciplinary assessment teams—including legal, operational, technical, and community representatives—should evaluate whether systems demonstrably outperform alternatives, with ongoing monitoring and formal reassessments at least annually.
- Because AI should be fair and just, regular assessment of impacts across demographic groups is essential, as is mandatory user training that addresses automation bias and ensures operators understand system limitations.
- Upholding democratic and accountable deployment requires substantial human oversight. Operators should retain clear authority to override AI-generated recommendations, and community input should be integrated from the outset to ensure that those most affected by these systems help shape their adoption and governance.
While this framework offers guidance detailed enough to serve as an action plan, it is not intended to be rigid. Many stakeholders have unique needs and circumstances that warrant nuanced consideration of the recommendations. Users should adapt application of the framework to the capacity and limitations of their jurisdiction or organization.
Later in 2026, the Task Force plans to present a series of practical case studies that demonstrate this framework in action across different AI applications and agency contexts. These case studies will serve as implementation playbooks that agencies and communities can use to see how the framework may apply to specific tool categories.
Overview & User Guide
Assessment Workflow
Assessment Tools
- A: AI Readiness Assessment
- B: Protocol for Prohibited Systems
- C: System Complexity and Interpretability Assessment
- D: Sector Context Guidance
- E: Classification Memorandum Template
- F: Procurement Guide
- G: Implementation Planning and Memorandum Template
- H: Ongoing Monitoring and Assessment
- I: Guidance for Deployed AI Systems
- J: General-Purpose AI Tools in Criminal Justice Settings
The Council on Criminal Justice Task Force on Artificial Intelligence is a national, nonpartisan initiative to develop standards and evidence-based recommendations to guide the safe, ethical, and effective use of AI in the criminal justice system.
Spanning the four major sectors of the criminal justice system—law enforcement, courts, corrections, and community organizations—the group is producing credible analysis and guidance to help policymakers and practitioners navigate a complex and rapidly evolving landscape in ways that maximize benefits, minimize harms, and improve justice.
Chaired by former Texas Supreme Court Chief Justice Nathan Hecht, the Task Force includes 14 other leaders representing AI technology developers and researchers, police executives and other criminal justice practitioners, civil rights advocates, community leaders, and formerly incarcerated people.
Glossary
AI (Artificial Intelligence): Machine-based systems that operate with varying levels of autonomy, may exhibit adaptiveness after deployment, and infer from inputs how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.
Algorithm: A set of rules or instructions to perform a task or solve a problem; in AI, algorithms process data to produce outputs.
Automation Bias: A tendency to over-rely on algorithmic outputs without sufficient critical evaluation.
Bias (Discriminatory): Unfair discrimination against people based on legally prohibited grounds such as race, gender, national origin, religion, or disability. Such discrimination can occur through disparate treatment or unjustified disparate impact.
Bias (Statistical): Systematic error that causes a model to consistently deviate from accuracy in a particular direction.
Black Box: An AI system whose internal workings are not visible or understandable to users or developers; decisions cannot be traced to specific rules or factors.
Classification Memo: The official document produced through Phase 2 that records an AI system’s risk and opportunity assessment and recommended path forward.
Data Governance: Policies and procedures for managing data quality, security, privacy, and appropriate use throughout a system’s lifecycle.
Demographic Performance: How an AI system performs across different population groups defined by characteristics such as race, gender, age, or socioeconomic status.
Disparate Impact/Discriminatory Effects: Discrimination that occurs when a facially neutral practice disproportionately harms people with a shared identity characteristic, such as race, gender, national origin, religion, or disability, without justification.
Disparate Treatment: Discrimination that occurs by intentionally treating people differently based on legally prohibited grounds such as race, gender, national origin, religion, or disability. (Contrast with disparate impact discrimination, which can be unintentional.)
Due Process: Constitutional requirement for fair legal procedures; AI must not undermine these protections.
Explainability: The degree to which an AI system’s outputs can be explained in terms humans can understand.
Fairness Metric: A quantitative measure of whether an AI system treats different groups equitably; multiple definitions exist and may conflict.
Independent Validation: Testing of an AI system by experts not affiliated with the vendor or implementing agency.
Interpretability: The degree to which a human can understand the cause of an action taken or recommended by an AI system.
Level 1 Requirements: Baseline protections required for all AI systems (see Phase 4, Level 1 Implementation Requirements).
Level 2 Requirements: Enhanced protections required for substantial-risk systems (see Phase 4, Level 2 Enhanced Requirements).
Meaningful Human Oversight: Human review that features information access, sufficient time, training, override authority, documentation, and accountability.
Model Drift: Changes in AI performance over time due to shifts in data patterns.
Training Data: The historical data used to develop an AI system’s predictive model; biases in training data can produce biased outputs.
Transparency: The availability of information about how an AI system works, what data it uses, and how decisions are made.
Vendor: A company or organization that sells or provides an AI system.
Overview
This framework walks stakeholders through sequential phases:
- Phase 1: Defining the problem to be solved and assessing organizational readiness
- Phase 2: Classifying the system’s risk and opportunity levels
- Phase 3: Establishing procurement protections
- Phase 4: Implementing with appropriate safeguards
- Phase 5: Conducting ongoing monitoring and reassessment
At the end of each phase, you’ll reach a checkpoint that encourages documented approval before advancing, helping ensure that agencies make deliberate choices at every step.
The classification process at the framework’s core first screens for prohibited uses, then categorizes remaining systems by risk level (low or substantial) and opportunity level (substantial or low). These classifications determine whether agencies should proceed with standard deployment, conduct careful implementation with enhanced safeguards, perform further evaluation, or avoid the system entirely.
Ten appendices provide the following tools to support implementation: readiness assessments, prohibited systems protocols, system complexity evaluations, sector-specific guidance, classification memo templates, procurement checklists, implementation planning guides, ongoing monitoring support, and guidance for deployed technology and general-purpose AI tools.
Taken together, these resources translate the Task Force’s principles into:
- Constitutional and due process considerations tailored specifically to criminal justice applications;
- Concrete procurement and implementation steps that address the practical realities of agency operations; and
- Checklists and templates that can be adapted to jurisdictions’ needs.
A Call for Critical Thinking
This framework provides a structured pathway for critical engagement with the evaluation, oversight, and use of AI in the criminal justice system. The questions, tables, and examples are meant to be illustrative guides, not inflexible decision trees. To enhance the value and validity of insights drawn from this framework, users should:
- Engage thoughtfully with difficult questions about fairness, bias, and constitutional compliance
- Challenge assumptions about what technology can and should do in justice settings
- Be prepared to say no if safeguards cannot be adequately implemented
- Leverage domain expertise, legal obligations, and contextual nuance throughout the process
User Guide
This framework is designed specifically for AI systems, as opposed to other common forms of technology or software. The boundaries between AI and other technologies can be blurry. For this framework, “AI” refers to automated systems that generate predictions, recommendations, classifications, decisions, actions, or content that influence human actions and decisions. It does not cover basic procedural technologies (e.g., spreadsheets, databases, standard word processing).
This framework assumes that you already have clarity on whether the tool in question should be classified as AI that could substantially influence decisions, or that you have the knowledge or external support necessary to make such a determination.
How to use this framework depends on your current engagement with potential or actual AI solutions:
- If you are considering a specific AI system, start at Phase 1 and proceed sequentially through the framework to evaluate that system’s characteristics, risks, and opportunities before making procurement and implementation decisions.
- If you already use an AI system, review the full framework, then consult Appendix I, Guidance for Deployed AI Systems, to evaluate your tools and determine the proper course for ongoing management and oversight.
- If you are conducting an open procurement process without a specific vendor in mind, determine whether your agency should pursue this category of tool at all (using Phase 1, Phase 2, and Appendix D). If you decide to proceed, assess each finalist proposal using the risk and opportunity frameworks before making your selection. Complete a classification memo (Appendix E) for your chosen vendor before finalizing the contract.
- If you are exploring whether AI might help with a problem but have no specific tools in mind, begin with Phase 1 and the sector context guidance in Appendix D to understand how AI might intersect with your current practices. You should also consider whether the problems you’re trying to solve might be better addressed through policy changes, training, or increased resources, rather than through technology. Proceed to exploring specific tools if your workplace has built a strong foundation for responsible AI use.
Developing Policies for General-Purpose AI Tools in Criminal Justice Settings
This framework is primarily designed for evaluating purpose-built AI systems acquired through formal procurement. However, AI increasingly enters criminal justice settings through a different pathway: staff use of general-purpose AI tools such as AI chatbots, agents, coding tools, and document analysis systems that are not purchased specifically for criminal justice work but are used in case-related contexts.
These general-purpose tools present distinct governance challenges. They are often adopted informally, without IT oversight or contractual protections. They may process sensitive case data through external servers. Their capabilities change frequently as providers update their models. And because they are intended for general-purpose use, they can be applied to an open-ended range of tasks without any single procurement decision triggering review.
Agencies should not ignore this reality. Instead, they should develop a policy governing staff use of general-purpose AI tools for case-related work. See Appendix J for more on this problem and the Task Force’s recommended action.
User Profiles
The following user profiles offer illustrative, though not exhaustive, overviews of stakeholder groups that may find this framework useful, as well as tailored procedural guidance for each group:
User Group | How to Use This Framework |
|---|---|
Agency Leaders (chiefs, directors, sheriffs, court administrators, corrections commissioners) | Focus on Phase 1 (foundation and readiness), Phase 2 (understand classification decisions), and the Introduction (principles). You approve progression past Phase 1, accept classification memos, authorize procurement for substantial-risk systems, and approve deployment after pilots. |
Procurement & Legal Officials (general counsel, procurement directors, contract officers, county attorneys) | Focus on Phase 3 and Appendix F (contract protections), plus the prohibited use screening in Phase 2. You approve contract language, certify legal compliance, advise on constitutional concerns, and can recommend rejection based on legal risk. |
Project Managers & IT Staff (IT directors, system administrators, data officers) | Focus on Phase 4 (implementation), Phase 5 (ongoing management), and Appendix G (implementation template). You recommend technical feasibility, approve integration plans, certify training completion, flag technical concerns during pilots, and recommend continuation or termination based on performance. |
Policymakers (legislators, council members, commission staff, oversight bodies) | Review the full framework to inform legislation and oversight. Focus on Phase 2 (risk categories for regulatory frameworks) and the Introduction (specific criminal justice AI governance). You set mandatory requirements, establish reporting and oversight mechanisms, and allocate resources for AI governance. |
Community Representatives & Advocates (public defenders, civil rights organizations, community advisory members, crime survivors, directly impacted individuals) | Focus on the prohibited use screening (Phase 2) and community engagement requirements (Phase 4, Level 2). You raise concerns through advisory processes, provide input on risk and opportunity assessments, advocate for specific safeguards, and escalate rights violations. |
AI Developers & Vendors (technology companies, product designers, AI researchers, vendors seeking to develop or sell AI tools for criminal justice settings) | Review the sections—particularly classification (Phase 2), procurement (Phase 3), and implementation (Phase 4)—that can help you anticipate the questions, safeguards, and documentation stakeholders may expect before adopting an AI system. Understanding these expectations may help you design tools and documentation that better align with criminal justice system priorities around validation, transparency, fairness, and meaningful human oversight. |
Questions for Future Work
This framework provides a structured pathway for responsible AI adoption in criminal justice, but frameworks alone are not sufficient to guarantee good outcomes. Important questions remain about what infrastructure is needed to make the framework’s full recommendations accessible and responsive to the ways AI uses may evolve. The Task Force believes the following institutional design questions highlight areas for potential future work by this body, successor entities, policymakers, or other stakeholders:
- Agency capacity: What minimum internal resources and expertise should agencies possess before pursuing AI adoption? How can smaller agencies access independent technical and legal expertise?
- Bundled support: Should professional associations, states, or regions establish centralized review bodies, shared services, pre-approved vendor lists, or pooled technical assistance to reduce the burden on individual agencies?
- Federal role: What guidance, standards, or grant incentives from federal agencies would support responsible local adoption of AI technology?
- Technical assistance: Who should provide implementation support—state agencies, academic institutions, nonprofits, or professional associations? And how should it be structured?
- Accountability mechanisms: What type and scope of external oversight from courts, legislatures, or civil society can reinforce these best practices?
- Future AI capabilities: How should criminal justice institutions and stakeholders prepare for a possible future in which AI becomes a general-purpose capability that matches or exceeds human performance across a wide range of cognitive work and professional functions?
Assessment Workflow
Phase 1: Foundation and Readiness
1. Define the Problem
Before evaluating any AI tool, you should clearly define the problem you’re trying to solve. Technology should not be a solution looking for a problem. Complete this exercise first:
- Problem: What specific criminal justice problem are we trying to solve?
- Be specific and measurable
- Who experiences this problem? How does it affect them?
- How long has this problem existed? What solutions have been tried already?
- Theory of Change: How exactly would this AI tool solve the problem better than alternatives?
- What is the causal mechanism by which AI improves outcomes?
- Why wouldn’t a non-AI solution work as well?
- What assumptions must be true for the AI solution to work?
- Success Metrics: How will we measure whether the problem is actually solved or improved?
- What data will we track?
- What magnitude of improvement would justify the investment?
- Over what timeframe will we evaluate success?
2. Assess Organizational Readiness
Adopting AI is a policy decision with implications for justice, safety, and public trust. Before proceeding, use the AI Readiness Assessment Worksheet (see Appendix A) to evaluate your organization’s capacity in areas like data governance, technical expertise, legal frameworks, and community engagement. If this assessment reveals significant gaps, they should be addressed before proceeding with AI deployment.
Phase 1 Complete Checklist
Requirement | Complete? |
|---|---|
Problem Definition | |
Problem statement is documented in specific, measurable terms | |
Theory of change explains why AI may be preferable to alternatives | |
Success metrics are defined and measurable | |
Relevant stakeholders have reviewed the problem scoping | |
Organizational Readiness | |
AI Readiness Assessment (Appendix A) is complete | |
Adequate capacity is confirmed, or remediation plans are prepared for capacity gaps | |
Leadership has reviewed and acknowledged readiness status |
Phase 1 Checkpoint
If you cannot clearly articulate the problem or why AI is preferable to alternatives, or if significant capacity gaps exist without clear remediation plans, PAUSE. Revisit the problem definition, consider non-AI alternatives, or build foundational capacity before proceeding.
Phase 2: Classification
The goal of this phase is to produce a Classification Memo (see Appendix E) that analyzes the AI system and recommends a path forward.
1. Assemble Your Assessment Team
Assessing the risk and opportunity associated with an AI use case requires a well-rounded set of expertise. To help increase the likelihood of accurate assessments, teams should include the experts listed below.
- Recommended for All Systems: An operational leader, a legal/constitutional expert, and end-users. Establish clear rules for making decisions.
- Add for Substantial-Risk Systems: A sector specialist, community representatives, and (for complex systems) a technical expert.
2. Screen for Prohibited Uses
Some AI applications pose an unacceptable risk to fundamental rights and should be prohibited in the criminal justice context unless the risks can be adequately mitigated or the problematic features can be eliminated. Answer the screening questions below.
- If you answer YES to ANY question, the system raises serious concerns that should be resolved before deployment. If adequate mitigation is not possible, the system is prohibited. Proceed directly to Appendix B: Protocol for Prohibited Systems.
- If you answer NO to all questions, proceed to the next step.
Prohibited Use Screener
Question | Yes | No |
|---|---|---|
Does the system make autonomous decisions about liberty (e.g., detention, sentencing) without the possibility of substantial human review? | ||
Does the system eliminate or impair a person’s right to contest a pending decision, or appeal a decision that’s already been made affecting their rights? | ||
Does the system circumvent or undermine established legal or constitutional protections (e.g., due process, equal protection)? | ||
Does the system perform individualized tracking and surveillance of or otherwise have a chilling effect on a group engaging in lawful, constitutionally protected activities (e.g., First Amendment-related activities)? | ||
Does the system target or select people based on protected characteristics and create unjustified discriminatory effects because of race, gender, religion, national origin, disability, or another legally prohibited ground? | ||
Does the system systematically undermine human dignity (e.g., by publicly shaming or humiliating people or stripping them of all agency)? |
If you are uncertain how to answer these questions, you should:
- Request documentation from the vendor. Vendors should be able to answer all of these questions clearly.
- Consult outside legal and technical experts for advice on these questions.
Assessing Mitigation Possibilities
If you answered “yes” to any question, consider whether the concern can be addressed through:
- Design modifications that eliminate the problematic feature
- Procedural safeguards that adequately protect rights
- Technical controls that prevent the harmful application
- Alternative deployment that avoids the prohibited use
Only proceed with deployment if you can document that the risk has been fully mitigated or the problematic feature eliminated. If mitigation is not possible or adequate, the system should be considered prohibited.
3. Assess System Complexity and Interpretability
Next, you should evaluate the system’s technical characteristics. More complex, opaque (“black box”) systems require more scrutiny and more robust safeguards.
Use the questions in the System Complexity and Interpretability Assessment (see Appendix C) to determine if the system is transparent and predictable or opaque and ambiguous. Document your findings; this work will inform the oversight required for implementation.
Note on AI system type: Your System Complexity and Interpretability Assessment should influence the risk assessment outlined below in step five. A system that is difficult to interpret or validate may warrant a higher risk classification, as errors may be harder to detect or mitigate.
4. Consider the Sector Context
Before you assign a general risk or opportunity score, analyze how the AI tool will change your current practices. An AI system does not exist in a vacuum; its impact is relative to the baseline with which it interacts. A high-stakes context doesn’t automatically mean AI increases risk. Rather, the core question is whether AI makes the existing process more or less risky, fair, and effective.
Consider the legal and operational context of your specific sector. Review the Sector Context Guidance (see Appendix D) to help you frame your thinking for the next steps.
5. Determine Risk and Opportunity Levels
With the sector context in mind, use the following tables to classify the system’s risk and opportunity levels.
Note on classification: The risk and opportunity levels offered in this section are not meant to imply that all AI systems and use cases fall cleanly into a binary. The “low” and “substantial” classification designations are designed to emphasize selectivity when determining risk and opportunity. For example, on a scale of 1 to 10, a substantial risk classification could correlate with levels 4 through 10 (as opposed to the traditional 6 through 10 in a true binary). This might encourage stakeholders to use the enhanced safeguards outlined in this framework even with AI systems and uses that may seem only moderately risky.
Risk Assessment
Use this table to determine if the risk level is low or substantial.
Risk Level | Liberty Impact | Rights Impact | Error Consequences |
|---|---|---|---|
LOW | Unlikely. No direct effect on an individual’s liberty. | Unlikely. Does not affect procedural or substantive legal rights. | Errors cause minimal harm and are easily corrected. |
SUBSTANTIAL | Moderate-High. Affects or influences stop, search, arrest, detention, bail, charging, plea, sentencing, parole, clemency, or similar decisions. | Affects procedural or substantive legal rights. Involves surveillance or processes sensitive personal data. | Significant harm possible. Errors could lead to wrongful detention or rights violations. |
Risk Classification Questions
If the answer is YES to any of the following, the system is likely SUBSTANTIAL RISK:
- Does it influence stop, search, arrest, pretrial release or detention, charging, plea, sentencing, parole, clemency, or similar decisions?
- Does it implicate legal rights, such as freedom of speech, the right to be free from unreasonable searches or seizures, the right against self-incrimination, the right to counsel, the right to confront witnesses, the right to a fair and impartial jury, or the right to be free from discrimination?
- Does it involve the surveillance or monitoring of individuals?
- Does it process sensitive personal data?
- Could it create an unjustified disparate impact?
- Does it directly affect access to programs, services, or due process?
- Could errors result in wrongful detention or rights violations?
- Does it limit people’s ability to contest decisions?
If the answer is NO to all of the above, the system is likely LOW RISK.
Opportunity Assessment
Use this table to classify the potential for positive impact as substantial or low.
Opportunity Level | Performance Improvement | Evidence Quality | Stakeholder Support | Cost-Effectiveness |
|---|---|---|---|---|
LOW | Minimal, uncertain, or no improvement. | Little or no supporting evidence; claims are speculative. | Stakeholders skeptical of necessity. | Does not favor AI.
SUBSTANTIAL | Demonstrable improvement over existing processes. | Supported by evidence from pilots or independent research. | Community and end-users validate the value. | Favorable.
The four factors used to evaluate the potential for positive impact will not always align. When they conflict, consider:
- Evidence quality comes first. Strong performance claims mean little without credible validation. Be skeptical of promised improvements that lack independent evidence.
- Stakeholder concerns warrant serious weight. Opposition from affected communities or end-users can predict implementation problems, even when other factors look favorable.
- Improvement should benefit those affected. Efficiency gains that accrue to the organization while individuals bear the risks (errors, bias, privacy loss) represent weaker opportunity than improvements in actual outcomes.
- Compare to alternatives, not to nothing. The relevant question is whether AI outperforms the best alternative use of the same resources.
When factors point in different directions, use your team’s collective judgment to determine the final opportunity score. In your Classification Memo, document which factors you weighted most heavily and why.
6. Finalize and Document Classification Decision
The goal of this phase is to complete and file the Classification Memo that summarizes your assessment and establishes an official record of your findings.
Use your risk and opportunity levels to find your position on this matrix.
| | Substantial Opportunity | Low Opportunity |
|---|---|---|
Substantial Risk | CAREFUL IMPLEMENTATION: Potential value, but also significant risks. Recommended action: Proceed only by applying ALL Level 1 AND Level 2 requirements (details in Phase 4). Agency head or designated authority should provide written approval before deployment, documenting that all safeguards are in place. | GENERALLY AVOID: Strong presumption against implementation. High risks are not justified by low benefits. Recommended action: Do not proceed without considering non-AI alternatives. |
Low Risk | STANDARD DEPLOY: These systems offer clear benefits with lesser risk. Recommended action: Proceed with Level 1 requirements (details in Phase 4). | EVALUATE: The benefits are unclear and may not be worth the investment. Recommended action: Conduct a careful cost-benefit analysis. If proceeding, use Level 1 requirements (details in Phase 4). |
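For agencies that want to capture this matrix in internal checklists or review tooling, the following minimal Python sketch illustrates one way the mapping could be encoded; the names and labels are hypothetical and drawn only from the matrix above, not from any Task Force tool.

```python
from enum import Enum

class Level(Enum):
    LOW = "low"
    SUBSTANTIAL = "substantial"

# Recommended actions from the classification matrix above.
# Keys are (risk level, opportunity level); labels are illustrative only.
CLASSIFICATION_MATRIX = {
    (Level.SUBSTANTIAL, Level.SUBSTANTIAL): "CAREFUL IMPLEMENTATION",  # Level 1 + Level 2 safeguards
    (Level.SUBSTANTIAL, Level.LOW): "GENERALLY AVOID",                 # strong presumption against use
    (Level.LOW, Level.SUBSTANTIAL): "STANDARD DEPLOY",                 # Level 1 safeguards
    (Level.LOW, Level.LOW): "EVALUATE",                                # cost-benefit analysis first
}

def recommended_action(risk: Level, opportunity: Level) -> str:
    """Return the matrix's recommended path for a classified system."""
    return CLASSIFICATION_MATRIX[(risk, opportunity)]

# Example: a substantial-risk, substantial-opportunity system
print(recommended_action(Level.SUBSTANTIAL, Level.SUBSTANTIAL))  # CAREFUL IMPLEMENTATION
```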
Phase 2 Completion Checklist
Before proceeding to Phase 3, confirm the following:
Requirement | Complete? |
|---|---|
Assessment team properly constituted | |
Prohibited use screening complete (all NO) | |
System complexity assessment complete (Appendix C) | |
Sector context considered (Appendix D) | |
Risk level determined and documented | |
Opportunity level determined and documented | |
Classification Memo (Appendix E) complete | |
Classification Memo has required approval |
Phase 2 Checkpoint
- If classification is GENERALLY AVOID: Do not proceed without completing an alternatives assessment and documenting justification for proceeding despite the low opportunity.
- If classification is EVALUATE: Proceed only after completing a rigorous cost-benefit analysis that justifies the investment.
- If classification is STANDARD DEPLOY or CAREFUL IMPLEMENTATION: Proceed to Phase 3.
Required approval: The Classification Memo should be approved by the designated authority before procurement begins. For substantial-risk systems, this should be the agency head or a person of equivalent authority.
Phase 3: Procurement
The procurement phase establishes the contractual foundation that protects your agency, ensures accountability, and maintains compliance throughout the system’s lifecycle. If your recommendation is to proceed, you should now navigate this process mindful of the appropriate requirements based on your system’s classification.
The following foundational steps should be completed:
1. Budget and Resource Confirmation
- Plan for resources needed throughout the system’s lifecycle for:
- Acquisition
- Initial implementation and integration
- Staff training and change management
- Ongoing monitoring and oversight
- Technical fixes and improvements
- Community engagement processes (for substantial-risk systems)
2. Designate Personnel
- Designate a procurement lead with appropriate authority
- Include representatives from:
- Legal/general counsel
- End-user departments
- IT/technical staff
- Finance/budget office
- For substantial-risk systems, add: community representatives or liaison, independent technical expert (if system is complex)
3. Contract Negotiation and Essential Terms
Contract negotiation should secure protections and requirements based on your system’s classification. Use the Procurement Guide (see Appendix F) as your checklist and complete all necessary steps before moving to Phase 4.
Phase 3 Completion Checklist
Before proceeding to Phase 4, confirm the following:
Requirement | Complete? |
|---|---|
Resources considered for full lifecycle (not just acquisition) | |
Procurement team properly constituted | |
Contract includes all required Level 1 terms (Appendix F) | |
If substantial risk, contract includes all Level 2 terms | |
Contract fully executed | |
If substantial risk, community engagement plan developed | |
If substantial risk, independent oversight established |
Phase 3 Checkpoint
Implementation should not begin until the contract is fully executed with all required protections. A contract missing essential terms creates unacceptable risk. If the vendor will not agree to the required terms, you should return to the market or reconsider whether AI is appropriate for this use case.
Phase 4: Implementation
Implementation is where theoretical safeguards and contractual promises become operational reality. This phase of the framework encourages you to pay careful attention to how the system will function in your environment and how you’ll ensure it performs as intended. Before any system goes live, completion of the Implementation Memorandum Template (see Appendix G) is recommended. This document serves as your operational blueprint, translating classification decisions and contract terms into concrete implementation steps.
Level 1 Implementation Requirements (For All Systems)
These baseline recommendations apply to every AI system deployed in criminal justice settings.
Human-Centered Design
The deployment of AI tools should account for the realities of human work and organizational design, and should allow human supervisors to retain ultimate authority. This requires deliberate design choices about how the system presents information and how override decisions are captured and reviewed.
- Design interfaces to present AI outputs as recommendations and show operators the relevant information behind each recommendation.
- Document every override decision along with the operator’s rationale.
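To illustrate how override decisions and rationales might be captured for later review, here is a minimal Python sketch of a structured override record; the field names are hypothetical and should be adapted to your agency’s records, audit, and retention policies.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class OverrideRecord:
    """One operator override of an AI recommendation (illustrative fields only)."""
    case_id: str              # agency case or incident identifier
    operator_id: str          # who made the override decision
    ai_recommendation: str    # what the system recommended
    operator_decision: str    # what the operator decided instead
    rationale: str            # the operator's documented reasoning
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Example entry for a supervisor's review log
record = OverrideRecord(
    case_id="2026-000123",
    operator_id="officer-42",
    ai_recommendation="flag for follow-up",
    operator_decision="no follow-up",
    rationale="Recommendation relied on an outdated address record.",
)
```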
Training and Capacity Building
Before anyone operates the AI system, they should receive training that covers system functionality, limitations, known failure modes, and automation bias.
- Plan for regular refresher training, updates whenever the system changes, and sessions incorporating findings from ongoing monitoring.
- Keep detailed records of who has been trained, maintain current training materials, and evaluate training effectiveness.
Creating Transparency and Accountability
Members of the public should know when AI systems affect them. Develop clear public notification materials in plain language explaining what the system does, how it’s used, how much it costs to operate and maintain, and who is accountable for system-related decisions.
- Internally, designate specific officials who are publicly accountable for the system.
- Document decision-making processes and override rationales.
- Publish regular reports on system performance, including how often operators override the system and why.
Protecting Privacy and Data Security
Collect and process only what is necessary for the system’s legitimate purpose, and conduct regular reviews to prevent unnecessary data accumulation. Data collected for one use should not be repurposed without new assessment and approval.
- Implement technical safeguards per your contract, including regular security assessments, strict access controls, and monitoring for unauthorized access.
- Establish and follow clear retention policies with automated deletion where appropriate.
Pilot Program
Before it is deployed at full scale, the system should be tested in your actual operational environment. Design a pilot with a controlled scope and clearly defined success criteria before you begin. For low-risk systems, pilot duration is at agency discretion, though sufficient time should be allowed to meaningfully evaluate the success criteria. Compare the AI system’s performance to your baseline practice and include diverse cases representing the full range of scenarios you’ll encounter.
- During the pilot, systematically track your success metrics while remaining alert for unexpected behaviors. Gather user feedback and document issues and resolutions.
- Evaluate results against your success criteria: Did the AI improve on current practice? Are there rights concerns or disparate impacts? How well does it fit into operations? Is it cost-effective?
- Make a formal go/no-go decision. If the pilot reveals constitutional violations or unmitigable harms, you cannot proceed and should shift to contract termination procedures. If moving forward, incorporate lessons learned.
Ongoing Monitoring and Oversight
System performance should be continuously monitored; watch for degradation over time and compare results to baseline and pilot performance. Track error rates and types to identify concerning patterns.
- Pay attention to performance across demographic groups, investigating any differences that emerge (see the illustrative sketch following this list).
- Establish procedures for identifying problems, addressing them, and documenting issues and resolutions.
- Share what you’ve learned across your organization.
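The following minimal Python sketch illustrates the demographic performance tracking described above; the group labels, error definition, and disparity tolerance are assumptions that your assessment team would need to define for the specific system and context.

```python
from collections import defaultdict

def error_rates_by_group(outcomes):
    """Compute an error rate per group from (group_label, was_error) pairs drawn from monitoring logs."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, was_error in outcomes:
        totals[group] += 1
        errors[group] += int(was_error)
    return {group: errors[group] / totals[group] for group in totals}

def flag_disparities(rates, tolerance=0.05):
    """Flag groups whose error rate exceeds the best-performing group's rate by more than the tolerance."""
    best = min(rates.values())
    return {group: rate for group, rate in rates.items() if rate - best > tolerance}

# Example with made-up monitoring data (two groups, four reviewed cases)
rates = error_rates_by_group([("A", True), ("A", False), ("B", False), ("B", False)])
print(rates)                    # {'A': 0.5, 'B': 0.0}
print(flag_disparities(rates))  # {'A': 0.5} -- investigate the difference
```

Any group flagged this way warrants investigation rather than automatic conclusions; differences can reflect data quality, base rates, or genuine system bias.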
Additional Considerations for High-Complexity Systems
If Phase 2 identified your system as highly complex or opaque, enhanced safeguards should be added.
- Require vendor-provided explanation mechanisms that generate case-specific rationales in plain language.
- Require vendor training, repairs, and updates for the life of the contract.
- Conduct more extensive pre-deployment testing and obtain independent technical validation.
- Continue validation during operation, checking regularly for model drift or unexpected behaviors.
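As one simple illustration of what “checking regularly for model drift” might look like in practice, the sketch below compares recent monitoring results against a pilot-era baseline; the accuracy metric and tolerance are assumptions that a technical team would replace with system-specific measures.

```python
def drift_detected(baseline_accuracy, recent_accuracies, tolerance=0.05):
    """Return True when average recent accuracy falls more than `tolerance` below the baseline.

    `baseline_accuracy` comes from pilot or validation results; `recent_accuracies`
    is a list of measurements from ongoing monitoring. Both the metric and the
    tolerance here are illustrative placeholders.
    """
    if not recent_accuracies:
        return False
    recent_mean = sum(recent_accuracies) / len(recent_accuracies)
    return (baseline_accuracy - recent_mean) > tolerance

# Example: pilot baseline of 0.90; recent monitoring shows a decline
print(drift_detected(0.90, [0.82, 0.80, 0.83]))  # True -> trigger revalidation or reassessment
```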
Level 2 Enhanced Requirements (For Substantial-Risk Systems)
Implementation of systems classified as substantial-risk should follow everything outlined in Level 1 and include a set of enhanced protections detailed below.
Strengthening Rights Protection
Conduct a formal legal analysis that reviews due process implications, equal protection and other discrimination concerns, and surveillance or privacy impacts. Complete a thorough human rights impact assessment, identifying potential impacts and developing mitigation strategies. Ensure individual rights remain fully preserved through clear procedures that allow people to:
- Challenge AI-influenced decisions
- Access meaningful information about how AI was used in their case
- Present additional or new information
Community Engagement
- Establish a Community Advisory Committee with representatives from affected communities who have genuine authority to raise concerns and recommend changes. Consider appropriate compensation for committee members.
- Maintain ongoing communication through regular public reports and accessible forums that address questions and concerns.
- Document how community concerns are addressed, incorporate feedback into improvements, and report back to the community on actions taken.
- Acknowledge historical injustices and provide multiple accessible feedback channels. Within operational realities, seek input before decisions are finalized, making a particular effort to include crime victims and people directly affected by the criminal justice system.
Demanding Evidence, Validation, and Disclosure
- Require independent validation by experts not affiliated with your vendor or agency. They should test on representative data, compare the system to alternative approaches, and publicly share results (with appropriate confidentiality protections).
- Conduct a formal alternatives analysis examining non-AI approaches and documenting why AI is preferable despite its risks.
Auditing and Enhanced Oversight
- Expand training on system limitations and resistance to automation bias. Establish continuing education requirements.
- Protect staff through clear whistleblower safeguards, formal channels for reporting concerns, and investigation protocols that take reports seriously.
- Implement real-time analysis that continuously monitors performance. Establish protocols for triggering investigations and taking corrective action.
- Maintain documentation, including detailed decision logs, complete audit trails for all system use, and records of every override that include rationale. Ensure records are accessible for legal review and appeals.
Additional Technical Safeguards for High-Complexity Systems
- When substantial risk combines with high complexity, implement adaptive system monitoring that continuously watches for unexpected behaviors and produces immediate alerts for anomalies.
- Ensure you can freeze deployment if concerns arise, and plan for regular revalidation as the system adapts.
Phase 4 Completion Checklist
Before transitioning to ongoing management, confirm the following:
Requirement | Complete? |
|---|---|
Implementation Memorandum (Appendix G) is complete | |
All Level 1 safeguards are operational | |
If Substantial Risk, all Level 2 safeguards are operational | |
All required training completed and documented | |
Pilot program completed with documented results | |
Formal go/no-go decision made based on pilot | |
Monitoring systems operational and producing data | |
Public notification completed | |
System performing as expected | |
No unmitigated constitutional or rights concerns |
Phase 4 Checkpoint
- If pilot reveals constitutional violations or unmitigable harms: DO NOT DEPLOY. Follow contract termination procedures.
- If pilot reveals concerns that can be mitigated: Address concerns, document mitigations, and obtain approval before deployment.
- If pilot is successful: Proceed to full deployment with all safeguards active.
Phase 5: Ongoing Management and Reassessment
Deployment is not the end of the process. All systems need ongoing monitoring and periodic reassessment to ensure they continue to function as intended without causing undue negative outcomes.
- Scheduled Reassessment: Substantial-risk systems should be fully reassessed annually, and low-risk systems should be reassessed before contract renewal.
- Triggered Reassessment: A new, full assessment should be conducted immediately if there are significant capability changes or major system updates, the system is applied to a new use case, performance issues arise, better alternatives become available, or new rights concerns surface.
See Appendix H: Ongoing Monitoring and Assessment for detailed protocols.
Assessment Tools
- A: AI Readiness Assessment
- B: Protocol for Prohibited Systems
- C: System Complexity and Interpretability Assessment
- D: Sector Context Guidance
- E: Classification Memorandum Template
- F: Procurement Guide
- G: Implementation Planning and Memorandum Template
- H: Ongoing Monitoring and Assessment
- I: Guidance for Deployed AI Systems
- J: General-Purpose AI Tools in Criminal Justice Settings
Acknowledgments
This assessment framework was developed by the Council on Criminal Justice Task Force on Artificial Intelligence, a national, nonpartisan initiative developing standards and evidence-based recommendations to guide the safe, ethical, and effective use of AI in the criminal justice system. It is a product of the Task Force’s members, who graciously shared their time and expertise.
Jesse Rothman produced this report, with support from Cameryn Farrow, Olivia McLarnan, Andrew Page, and others from the Council on Criminal Justice team.
James Anderson and RAND serve as research partners to the Task Force.
The Task Force is grateful to Kyle Moore for leading the external feedback process, as well as advisers Sorelle Friedler, Kevin Miller, Judge Scott U. Schlegel, Jonathan Wroblewski, and many others across the criminal justice field for providing invaluable guidance and insights during the production of this framework.
Support for the Task Force on Artificial Intelligence comes from the Heising-Simons Foundation, The Just Trust, Microsoft, Southern Company Foundation, and The Tow Foundation, as well as the John D. and Catherine T. MacArthur Foundation and other CCJ general operating contributors.
Suggested Citation
Council on Criminal Justice. (2026). Assessing AI for criminal justice: A user decision framework. https://counciloncj.org/assessing-ai-for-criminal-justice-a-user-decision-framework/


