Assessment Framework
A structured three-stage process for deciding whether and how to use AI for a legal task.
This framework helps a lawyer, law student, or legal educator decide whether to use AI for a particular task and, if so, which category of tool is appropriate and what safeguards are necessary.
Stage 1: Profile the Task
Before evaluating any tool, assess the task itself across five dimensions. For each, rate the task as presenting Low, Moderate, or High concern.
Does this task require inputting privileged, sensitive, or client-identifying information into the tool?
Low: No client information involved. Task uses only public information or hypotheticals.
Moderate: Task involves client information, but it can be meaningfully anonymized or abstracted before input.
High: Task requires input of privileged communications, PII, PHI, or material that cannot be effectively anonymized.
What are the consequences if the AI output contains errors — and who bears those consequences?
Low: Output is a starting point that will receive substantial human revision. Errors are easily caught.
Moderate: Output will inform a work product or decision but will be reviewed before use. Errors could cause wasted time or misdirection.
High: Output could directly reach a client, tribunal, or opposing party, or could shape a consequential decision with limited opportunity for review.
Who will see the output, and how easily can errors be identified and corrected before causing harm?
Low: Output stays internal. Ample opportunity to review, revise, and discard.
Moderate: Output reaches the client or internal stakeholders. Errors are correctable but may affect trust.
High: Output reaches a tribunal, opposing counsel, or the public. Errors may be difficult or impossible to retract.
Are there specific rules, opinions, court orders, contractual obligations, or statutes that govern or restrict AI use for this task?
Consider: court orders on AI use, jurisdiction-specific ethics opinions, engagement letter restrictions, HIPAA/GDPR/state privacy laws, and evidentiary concerns about AI involvement.
If any applicable constraint prohibits or conditions AI use for this task, that constraint controls regardless of the tool's capabilities.
Who is using the tool, and what review structure exists?
Low: An experienced lawyer using the tool for a task well within their expertise, with established review practices.
Moderate: A junior lawyer or law student using the tool, with supervisory review built into the workflow.
High: Any user operating without meaningful supervisory review, or working outside their area of competence and relying on the tool to compensate.
Composite risk: A task with any dimension rated High should be treated as high-risk overall. A task with multiple Moderate ratings should also be treated with elevated caution.
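For readers who want to operationalize the roll-up, the sketch below encodes the composite-risk rule in Python. The dimension identifiers and the reading of "multiple Moderate ratings" as two or more are illustrative assumptions; the framework states the rule in prose, not as a formula.

```python
from enum import IntEnum

class Concern(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3

# Illustrative identifiers for the five Stage 1 dimensions; the framework
# poses them as questions, so these short names are assumptions.
STAGE1_DIMENSIONS = (
    "confidentiality",       # privileged, sensitive, or client-identifying input
    "consequences",          # severity of errors and who bears them
    "visibility",            # who sees the output and how correctable errors are
    "constraints",           # court orders, ethics opinions, contracts, statutes
    "user_and_supervision",  # experience of the user and the review structure
)

def composite_risk(ratings: dict[str, Concern]) -> str:
    """Roll up per-dimension ratings: any High means the task is high-risk
    overall; multiple Moderates (assumed here to mean two or more) call for
    elevated caution."""
    if any(r == Concern.HIGH for r in ratings.values()):
        return "high"
    if sum(r == Concern.MODERATE for r in ratings.values()) >= 2:
        return "elevated"
    return "low"

# Example: client data can be anonymized, output stays internal, and the
# drafter is a supervised law student.
profile = {
    "confidentiality": Concern.MODERATE,
    "consequences": Concern.LOW,
    "visibility": Concern.LOW,
    "constraints": Concern.LOW,
    "user_and_supervision": Concern.MODERATE,
}
print(composite_risk(profile))  # -> "elevated"
```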
Stage 2: Evaluate the Tool
Once you understand the task's risk profile, evaluate candidate tools on four dimensions:
Data handling: Where do inputs go? How long are they retained? Are they used for training? What contractual protections exist? Does the provider offer BAAs, DPAs, or zero-data-retention provisions?
Output reliability: How likely is the tool to produce accurate, well-grounded, non-hallucinated output for this type of task? Does it use RAG grounded in authoritative sources? Does it provide verifiable citations? What is the known error rate?
Verifiability: How easily can a competent lawyer check the tool's output against primary sources or professional judgment? High verifiability means the output includes traceable citations or is of a type the lawyer can assess directly. Low verifiability means the output asserts conclusions without traceable sourcing.
Ethical alignment: Does the tool's design support or undermine the lawyer's core ethical duties? Map against: Competence (MR 1.1), Confidentiality (MR 1.6), Communication (MR 1.4), Supervision (MR 5.1, 5.3), Candor (MR 3.3), Fees (MR 1.5), and Bias.
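As a sketch, the four evaluation dimensions can be captured as a single record per candidate tool so that Stage 3 can compare them against the task profile. The field names, the three-point strength scale, and the example entry are assumptions for illustration, not part of the framework.

```python
from dataclasses import dataclass, field

# Illustrative three-point strength scale; the framework does not prescribe one.
WEAK, ADEQUATE, STRONG = 1, 2, 3

@dataclass
class ToolEvaluation:
    name: str
    data_handling: int       # retention, training use, BAAs/DPAs, zero-retention terms
    output_reliability: int  # grounding (e.g. RAG), verifiable citations, known error rate
    verifiability: int       # traceable sourcing a lawyer can check against primary sources
    ethical_alignment: int   # support for duties under MR 1.1, 1.6, 1.4, 5.1/5.3, 3.3, 1.5, bias
    notes: list[str] = field(default_factory=list)

# Hypothetical entry, not an assessment of any real product.
example_tool = ToolEvaluation(
    name="hypothetical-legal-research-tool",
    data_handling=STRONG,
    output_reliability=ADEQUATE,
    verifiability=STRONG,
    ethical_alignment=ADEQUATE,
    notes=["DPA in place; zero-retention tier available"],
)
```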
Stage 3: Match, Decide, and Mitigate
Cross-reference the task risk profile with the tool evaluation to make a decision and identify necessary safeguards.
Decision Logic
High-risk task + weak tool protections = do not use AI, or use only a TAL/TSL tool with strong protections and rigorous human review.
Moderate-risk task = AI is likely appropriate with the right tool and safeguards. Select a tool whose protections match the specific risks identified.
Low-risk task = broader tool selection is appropriate, but baseline safeguards still apply.
A high rating on any Stage 1 dimension requires that the chosen tool score well on the corresponding Stage 2 dimension. There are no offsets — strong data handling does not compensate for poor output reliability if both dimensions are relevant.
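Continuing the earlier sketches (reusing Concern, composite_risk, ToolEvaluation, and STRONG), the decision logic and the no-offsets rule might look like the following. The mapping from Stage 1 dimensions to the Stage 2 dimension each one stresses, and the threshold for "scoring well," are illustrative assumptions; the framework states the principle without fixing a formula.

```python
# Illustrative correspondence between Stage 1 risk dimensions and the Stage 2
# tool dimension each one stresses; the framework does not fix this mapping.
# The user/supervision dimension is addressed through mitigations instead.
CORRESPONDENCE = {
    "confidentiality": "data_handling",
    "consequences": "output_reliability",
    "visibility": "verifiability",
    "constraints": "ethical_alignment",
}

def decide(ratings: dict[str, Concern],
           tool: ToolEvaluation,
           prohibited_by_constraint: bool = False) -> str:
    """Apply the Stage 3 decision logic under the assumptions above."""
    # A controlling prohibition (court order, ethics rule, contract, statute)
    # ends the analysis regardless of the tool's capabilities.
    if prohibited_by_constraint:
        return "do not use AI for this task"

    # No offsets: every High Stage 1 dimension must be matched by a strong
    # score on the corresponding Stage 2 dimension; strength elsewhere
    # cannot compensate.
    for dim, rating in ratings.items():
        if rating == Concern.HIGH:
            counterpart = CORRESPONDENCE.get(dim)
            if counterpart and getattr(tool, counterpart) < STRONG:
                return "do not use this tool; require stronger protections and rigorous review"

    overall = composite_risk(ratings)
    if overall == "high":
        return "proceed only with strong protections and rigorous human review"
    if overall == "elevated":
        return "proceed with a matched tool and the moderate-risk mitigations"
    return "proceed with baseline safeguards"
```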
Required Mitigations by Risk Level
Baseline (all risk levels): Understand the tool's current terms of service and data-handling practices. Review all AI output before relying on or distributing it. Do not treat AI output as a substitute for professional judgment. Maintain awareness of applicable court orders, ethics opinions, and institutional policies.
Moderate risk (in addition to the baseline): Anonymize or abstract client information before input where possible. Independently verify all citations, legal conclusions, and factual assertions. Document the AI tool used and its role in the work product. Ensure supervisory review if the user is a junior lawyer or law student.
High risk (in addition to the above): Use only tools with enterprise-grade or legal-specific data protections. Treat AI output as a preliminary draft requiring full review. Confirm compliance with all applicable court orders, ethics rules, and contractual obligations. Consider whether client communication about AI use is warranted. Assess whether AI-generated work product raises evidentiary concerns.
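To keep these checklists tied to the composite risk level from Stage 1, they can be stored as a cumulative mapping, as sketched below. The wording is condensed from the paragraphs above, and the assumption that each level adds to the one below reflects how this section reads rather than an explicit rule.

```python
# Condensed from the mitigation paragraphs above. Levels are assumed to be
# cumulative: "elevated" adds to the baseline, "high" adds to both.
MITIGATIONS = {
    "low": [  # baseline safeguards that apply at every level
        "understand the tool's current terms of service and data-handling practices",
        "review all AI output before relying on or distributing it",
        "do not treat AI output as a substitute for professional judgment",
        "track applicable court orders, ethics opinions, and institutional policies",
    ],
    "elevated": [  # moderate-risk additions
        "anonymize or abstract client information before input where possible",
        "independently verify all citations, legal conclusions, and factual assertions",
        "document the AI tool used and its role in the work product",
        "ensure supervisory review for junior lawyers and law students",
    ],
    "high": [  # high-risk additions
        "use only tools with enterprise-grade or legal-specific data protections",
        "treat AI output as a preliminary draft requiring full review",
        "confirm compliance with court orders, ethics rules, and contractual obligations",
        "consider whether client communication about AI use is warranted",
        "assess whether AI-generated work product raises evidentiary concerns",
    ],
}

def required_mitigations(risk_level: str) -> list[str]:
    """Return the cumulative checklist for a composite risk level, using the
    same level names returned by composite_risk above."""
    order = ["low", "elevated", "high"]
    return [item for level in order[: order.index(risk_level) + 1]
            for item in MITIGATIONS[level]]
```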
When Not to Use AI
Some tasks should not involve AI tools at all:
When a court order, ethics rule, contractual obligation, or statute prohibits it.
When the task requires input of highly sensitive information and no available tool offers adequate data protections.
When the lawyer cannot meaningfully review the output.
When the risk of undetectable error exceeds the benefit of AI assistance.
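These hard stops sit in front of the three-stage analysis; a minimal gate sketch follows, with parameter names introduced here as shorthand for the four conditions above.

```python
def ai_is_off_the_table(*,
                        prohibited_by_authority: bool,
                        no_adequate_data_protection: bool,
                        output_cannot_be_reviewed: bool,
                        undetectable_error_risk_outweighs_benefit: bool) -> bool:
    """True if any hard-stop condition applies; if so, skip the three-stage
    analysis and perform the task without AI tools."""
    return any((
        prohibited_by_authority,
        no_adequate_data_protection,
        output_cannot_be_reviewed,
        undetectable_error_risk_outweighs_benefit,
    ))
```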