How do you go from zero to 10x with AI as a therapist?
A Competency-Based AI Integration Curriculum for Clinical Practice
"The question is no longer whether AI will transform mental health care. It is whether clinicians will lead that transformation or be swept along by it."
Full Disclosure: this article is part of a general experiment I've been trying to figure out how to use AI better so I wrote this in conjunction with the Claude LLM. All feedback, including whether or not this is useful, too artificial sounding, etc. is welcome.
Welcome: Why This Curriculum Exists
Mental health care changed structurally between 2024 and 2026. The global mental health AI market reached USD 8 billion, growing at a 20.64% compound annual growth rate 1. Medicare began reimbursing FDA-authorized digital therapeutics 2. A scoping review in JMIR Mental Health catalogued over 60 published studies on ChatGPT in clinical and psychoeducational contexts alone 4. And 40% of psychologists reported plans to integrate AI into their practice by 2026, most without any formal training to do so 3.
That last statistic is the reason this curriculum exists.
You did not go through years of graduate training, supervised practice, and licensure so that a technology vendor could define what ethical AI integration looks like in your therapeutic space. You went through that training because the work matters. People in distress deserve skilled, ethical, human care. This curriculum is built on a single conviction: AI tools should make you a more powerful clinician, not a more compliant one.
This is not a technology course. It is a clinical competency course about how thoughtful practitioners integrate emerging tools into an already-rigorous professional framework. You will leave with practical skills, a critical lens, and an ethical spine strong enough to hold up under pressure from administrators, vendors, and the hype cycle alike.
Who this is for: Licensed or pre-licensed mental health clinicians (psychologists, licensed counselors, social workers, marriage and family therapists, and psychiatrists) at any stage of career who want to engage AI with both competence and conscience.
What "0 to 10x" means: Zero means you have never deliberately used AI as a clinical tool. 10x means your effectiveness has multiplied across reach, insight, documentation efficiency, and continuing education. Not replaced. Multiplied.
The multiplier maps to the four competency levels you will progress through:
| Competency Level | Multiplier Range | What It Looks Like |
|---|---|---|
| Level 1 (Informed) | 0 → 1x | You understand the tools and evidence. You can explain AI's role to a colleague or patient. |
| Level 2 (Operational) | 1x → 3x | You use AI safely in defined workflows. Documentation takes half the time. You have a prompt library. |
| Level 3 (Critical) | 3x → 6x | You evaluate, adapt, and push back on tools. You audit vendor claims. Your workflow is intentional, not reactive. |
| Level 4 (Leadership) | 6x → 10x | You train others, shape policy, and contribute to the field. Your practice is a reference model. |
Track your own progress: at the start of each module, identify where you fall on this scale. The gap between where you are and where you want to be tells you which sections to spend the most time on.
How long it takes: 16 weeks, organized into six progressive modules. Each week requires approximately 3–5 hours of engagement. The curriculum is designed for self-paced cohort learning but functions equally well as individual study.
Curriculum Architecture
MODULE 1 Foundations (Weeks 1–2)
MODULE 2 Clinical Prompt Engineering (Weeks 3–5)
MODULE 3 AI in the Clinical Workflow (Weeks 6–8)
MODULE 4 Assessment, Diagnosis, and Evidence Review (Weeks 9–11)
MODULE 5 Ethics, Risk, and Governance (Weeks 12–14)
MODULE 6 Advanced Integration and Future Practice (Weeks 15–16)
Competency Levels:
- Level 1 (Informed): You understand what AI tools are and what the evidence says
- Level 2 (Operational): You can use tools safely inside a defined workflow
- Level 3 (Critical): You evaluate, adapt, and push back on tools and vendor claims
- Level 4 (Leadership): You train others, shape policy, and contribute to the field
Each module moves you along this continuum.
Module 1: Foundations
Weeks 1–2 | Target Competency Level: Informed
1.1 What Has Actually Changed by 2026
Let me be direct with you about what is new and what is not.
What is genuinely new is the quality and accessibility of large language models (LLMs). Until roughly 2023, AI tools marketed to mental health clinicians were primarily app-based interventions: static chatbots following decision trees, mood tracking applications, and guided CBT programs with rigid logic. These tools had modest but real evidence bases [7]. They also had hard ceilings: they could not engage flexibly with novel clinical presentations, could not generate nuanced psychoeducation on demand, and could not assist clinicians in the way a knowledgeable colleague might.
LLMs changed that ceiling. Models like GPT-4, Claude, and Gemini can hold contextually coherent conversations over many exchanges, synthesize clinical literature on demand, assist with documentation, generate patient-facing materials, support clinical reasoning, and serve as tireless research assistants. The research and investment communities have recognized this: LLMs have substantially displaced app-based interventions as the primary focus of mental health AI development 4.
What is not new is the ethical challenge. Every tension you will encounter (privacy, equity, the therapeutic relationship, scope of practice, informed consent) has precedents in prior technology debates. Telehealth created most of these conversations. AI sharpens them.
Key Definitional Framework: The Three Tiers of Mental Health AI
| Tier | What It Is | Examples | Regulatory Status |
|---|---|---|---|
| Tier 1: Digital Therapeutics (DTx) | Software as medical device; FDA-regulated; evidence-based | Freespira, Rejoyn | FDA-authorized; Medicare reimbursable [2] |
| Tier 2: Clinical Decision Support | AI tools assisting clinician judgment; not autonomous | Diagnostic assistants, risk stratification | Varies; clinician remains decision-maker |
| Tier 3: General-Purpose LLMs | Not built for clinical use; used by clinicians anyway | ChatGPT, Claude, Gemini | No clinical regulation; highest risk of misapplication |
Most of what you will encounter in your practice, and most of what this curriculum covers, lives in Tier 3. That is both the opportunity and the responsibility.
The Medicare Reimbursement Inflection Point
Medicare's move to reimburse FDA-authorized digital therapeutics 2 represents a structural change in how AI-adjacent tools enter clinical practice. It creates financial incentive for adoption, which accelerates both deployment and the risk of premature adoption. Understanding which tools have regulatory authorization, which have peer-reviewed evidence, and which have neither is now a core clinical competency.
ETHICAL CHECKPOINT 1.1
Before proceeding, write a one-paragraph answer to this question in your learning journal: What is the boundary between a tool that assists my clinical judgment and a tool that replaces it? You will return to this answer at the end of the curriculum and revise it. The evolution of your thinking is itself a data point.
1.2 How LLMs Work: What Every Clinician Must Know
You do not need to understand transformer architecture. You do need to understand the five properties of LLMs that directly affect clinical use.
Property 1: Stochastic Output LLMs generate probabilistic text. Given the same input, they do not reliably produce the same output. This is unlike a diagnostic checklist, a validated scale, or a treatment protocol. Clinical applications requiring reproducibility must account for this.
Property 2: Training Cutoff LLMs are trained on data up to a cutoff date. They do not know what happened after that date unless given tools that access live information (retrieval-augmented generation, or RAG). The DSM-5-TR, recent FDA guidance, and emerging treatment protocols may postdate a model's training.
Property 3: Hallucination LLMs generate plausible-sounding text that may be factually incorrect. In clinical contexts, a hallucinated citation, an incorrectly described medication interaction, or a fabricated statistic is not merely embarrassing. It is potentially harmful. Verification is non-negotiable.
Property 4: Context Window Dependency LLMs process only what is in the current conversation. They do not have access to your prior sessions with a patient, your clinical intuition built over years, or the nonverbal data that fills your consulting room. They are, in a meaningful sense, radically decontextualized.
Property 5: Sycophancy LLMs are trained with human feedback mechanisms that reward agreement and approval. They tend to validate the framing of questions rather than challenge it. This means a poorly framed clinical question tends to produce a confidently wrong answer rather than a productive challenge.
The VERA-MH Evaluation Framework
The VERA-MH framework 6 was developed specifically to evaluate AI chatbots for mental health applications. Its five dimensions provide a practical vocabulary for assessing any AI tool you encounter.
A note on frameworks and conflicts of interest: VERA-MH was developed by a for-profit organization, which is worth bearing in mind when evaluating its scope and emphasis. Other evaluation frameworks exist, including FAITA-MH (Framework for AI Trustworthiness Assessment in Mental Health) and emerging academic evaluation rubrics. No single framework is complete. Using multiple lenses gives you a more honest picture of a tool's fitness for clinical use. This curriculum uses VERA-MH as a primary scaffold because its dimensions map well to clinical reasoning, but you should supplement it with other frameworks as the field matures.
The five VERA-MH dimensions:
- V — Validity: Does the tool produce clinically accurate outputs?
- E — Engagement: Does the tool maintain therapeutic alliance and appropriate responsiveness?
- R — Risk Management: Does the tool appropriately detect and respond to crisis signals?
- A — Accessibility: Is the tool equitably available across populations, languages, literacy levels?
- MH — Mental Health Specificity: Is the tool calibrated to mental health contexts, or is it a general tool applied to clinical settings?
You will use VERA-MH as an evaluation rubric in Module 4.
Practical Exercise 1.2: The VERA-MH Baseline Audit
Select any AI tool you currently use or are considering using. This might be ChatGPT, a clinical documentation assistant, a mood-tracking app, or an AI note-taking tool. Score it on each VERA-MH dimension using a 1–5 scale. Write a paragraph justifying each score. Do not research the tool's marketing materials—score based only on your direct experience or observation.
Save this audit. You will refine it in Module 4 with stronger evaluative criteria.
1.3 The Evidence Base: What We Actually Know
The 2025 JMIR Mental Health scoping review 4 synthesized findings from over 60 studies on ChatGPT in mental health contexts. This is the most comprehensive evidence synthesis available, and its findings are neither uniformly optimistic nor uniformly cautionary.
Where ChatGPT Demonstrates Strength:
- Diagnostic accuracy on structured presentations: In studies using standardized case vignettes, ChatGPT performed comparably to clinicians on identifying primary diagnoses for presentations with clear diagnostic criteria
- Psychoeducation generation: ChatGPT generates accurate, readable psychoeducation on common conditions (depression, anxiety, PTSD, OCD) at multiple literacy levels
- Treatment knowledge: The model demonstrates strong knowledge of evidence-based treatment modalities and can accurately describe CBT, DBT, ACT, and interpersonal therapy techniques
- Literature synthesis: For literature review tasks with post-training-cutoff awareness, ChatGPT synthesizes existing research coherently
Where ChatGPT Demonstrates Systematic Limitations:
- Complex comorbid presentations: Performance degrades significantly with presentations involving multiple comorbidities, personality disorder features, or atypical presentations
- Prognostic accuracy: ChatGPT's ability to predict treatment response or illness trajectory is poor and should not be used for prognostic reasoning
- Crisis detection sensitivity: The model misses subtle crisis indicators that experienced clinicians detect through accumulated pattern recognition
- Cultural competence: Training data skews toward Western, English-language, educated populations. Cultural formulations for non-dominant groups are less reliable
- Longitudinal coherence: Without explicit memory mechanisms, ChatGPT cannot track a patient's trajectory over time the way a treating clinician can
The Critical Takeaway
AI performs best on the tasks that resemble its training data most closely: structured, text-based, pattern-matching problems with clear criteria. It performs worst on the tasks that require what makes human clinicians irreplaceable: integration of relational data, cultural attunement, real-time safety assessment, and longitudinal therapeutic judgment.
This is a structural property of the technology, not a flaw awaiting a fix. Understanding it positions you to use AI where it genuinely helps and to resist using it where it genuinely cannot.
ETHICAL CHECKPOINT 1.3
Consider a patient you have seen recently (no identifying information). Identify one aspect of their care that AI might have supported, and one aspect where AI assistance would have been inappropriate or insufficient. Write one sentence for each.
Module 1 Resources
- JMIR Mental Health Scoping Review on ChatGPT — search "ChatGPT scoping review 2025"
- Analytics Insight: Mental Health AI Market 2025
- American Telemedicine Association 2026 Policy Updates
- APA Guidelines on AI in Psychological Practice
Module 2: Clinical Prompt Engineering
Weeks 3–5 | Target Competency Level: Operational
2.1 Why Prompting Is a Clinical Skill
Most clinicians who find AI tools unhelpful have not encountered a bad AI tool. They have encountered a good AI tool with a bad prompt.
Prompt engineering is the practice of constructing inputs to AI systems in ways that reliably elicit useful, safe, and accurate outputs. In general contexts, it is a productivity skill. In clinical contexts, it is a patient safety skill. A poorly constructed prompt to an AI system assisting with risk assessment can produce a confidently stated but incorrect output that influences a clinical decision. This is not hypothetical.
I want to be explicit about something that most AI training materials gloss over: prompt engineering does not fix the structural limitations of LLMs described in Module 1. A brilliant prompt cannot make ChatGPT reliably accurate on prognostic questions or reliably sensitive to subtle crisis indicators. What prompt engineering does is help you get the most accurate output possible within the genuine capabilities of the tool, and help you construct guard rails that reduce the probability of harmful output.
The Three-Layer Prompt Architecture
Effective clinical prompts operate on three layers simultaneously:
Layer 1: Role and Context Establish who you are, what you are trying to accomplish, and the professional context. LLMs perform better when they understand the expertise level of the user and the purpose of the task.
Layer 2: Task Specification Be precise about what you want. Vague requests produce vague outputs. Clinical tasks require clinical specificity.
Layer 3: Output Constraints Tell the model what you do not want, what format you need, and what safety behaviors are required for the context.
Example: Weak vs. Strong Clinical Prompt
Weak: "What can you tell me about treating depression?"
Strong: "I am a licensed psychologist working with a 34-year-old patient who has been diagnosed with moderate unipolar major depressive disorder and has a comorbid GAD diagnosis. She has not responded to two adequate SSRI trials (sertraline 200mg for 16 weeks; escitalopram 20mg for 12 weeks). Please summarize the current evidence base for augmentation strategies and next-step psychotherapy options. Format your response as a bulleted clinical summary. Flag any recommendations where evidence quality is low or where my patient's profile might affect applicability. Do not include general lifestyle recommendations."
The strong prompt produces a focused, clinically relevant output. The weak prompt produces a general health article.
2.2 The CRAFT Prompt Framework
I have developed the CRAFT framework specifically for mental health clinical prompting. It is a decision architecture, not a rigid formula.
C — Clinical Context
Establish the clinical context without violating patient privacy. Describe presentation features, diagnosis category, treatment history, and any factors that constrain your options—while keeping the patient unidentifiable.
R — Role Assignment
Assign the AI a functional role. "Act as a knowledgeable clinical consultant" produces different outputs than "summarize research literature" or "generate a patient handout." Be deliberate about which role you need.
A — Action Specification
State the specific task with precision. Use action verbs: summarize, generate, compare, identify, draft, list, evaluate. Avoid: "tell me about," "explain," "help me understand."
F — Format Directive
Specify the output format: bulleted list, table, narrative paragraph, decision tree, numbered steps. Format matching your use case reduces editing time and increases clinical utility.
T — Truth Guardrails
Build in explicit instructions for uncertainty disclosure. Include phrases like: "If you are uncertain, say so explicitly," "Flag low-quality evidence," "Identify where my clinical context might affect the applicability of your response."
Practical Exercise 2.2: CRAFT Application
Write three prompts using the CRAFT framework for the following clinical tasks:
- Generating a psychoeducation handout on sleep hygiene for a patient with comorbid insomnia and bipolar disorder type II
- Summarizing the evidence base for exposure and response prevention (ERP) for OCD in adolescents
- Drafting a letter of medical necessity for a trauma-focused CBT program for a patient with PTSD following a motor vehicle accident
Exchange your prompts with a colleague if possible. Evaluate each other's prompts against the CRAFT framework before running them through an AI tool.
2.3 Privacy-Preserving Prompting: The De-identification Protocol
This section is non-negotiable. Every time you prompt an AI system using any patient-related information, you are making a data decision with privacy and HIPAA implications.
The Core Rule No protected health information (PHI) in prompts to general-purpose AI systems, ever. This includes general-purpose versions of ChatGPT, Claude, Gemini, and similar tools that do not have executed Business Associate Agreements (BAAs).
What Counts as PHI Under HIPAA's 18 identifiers: names, geographic subdivisions smaller than state, dates (other than year) related to individuals, phone/fax numbers, email addresses, SSNs, medical record numbers, health plan beneficiary numbers, account numbers, certificate/license numbers, VINs, device identifiers, URLs, IP addresses, biometric identifiers, full-face photographs, and any other unique identifier.
The De-identification Protocol
When you need to prompt an AI about a patient presentation, use the following protocol:
- Strip all 18 PHI identifiers from your description
- Replace specifics with categories: "a 40-year-old man" not a name; "a patient in the southeast" not a city; "three years ago" not a specific date
- Fictionalize irrelevant specifics: Change occupation to a comparable one; alter family structure slightly; modify demographic details that are not clinically relevant
- Review before submitting: Read your prompt as if you were a stranger. Could you identify the patient? If yes, de-identify further.
- Document your de-identification decision in your own records
ETHICAL CHECKPOINT 2.3
Consider: does your current workplace have a policy on using general-purpose AI tools for clinical tasks? If yes, are you following it? If no, what would a responsible policy include? Write a one-page draft policy statement. This exercise is both ethical practice and professional development.
2.4 Advanced Prompting: Chain-of-Thought and Self-Critique
As you become more comfortable with basic CRAFT prompting, two advanced techniques substantially improve clinical output quality.
Chain-of-Thought Prompting Instead of asking for a conclusion, ask the AI to reason through a problem step by step before giving you its answer. This technique was shown to improve accuracy on complex reasoning tasks in the original research context [8], and it has direct clinical application.
Example: "Before giving me your assessment, walk through the differential diagnostic considerations for this presentation systematically, starting with the most common explanations and moving to less common ones. Then give me your summary."
Chain-of-thought prompting makes the AI's reasoning visible. You can spot incorrect assumptions or missed considerations before they influence your clinical thinking.
Self-Critique Prompting After receiving an output, prompt the AI to critique its own response: "What are the limitations of what you just told me? Where might your response be inaccurate or incomplete? What additional information would have improved your answer?"
This technique counteracts sycophancy (the tendency of LLMs to validate rather than challenge). A well-prompted self-critique often surfaces important caveats the model omitted in its initial response.
Prompt Refinement: Let the Model Rewrite Your Prompt A surprisingly effective technique: after drafting your CRAFT prompt, ask the LLM to rewrite it before you run it. The instruction is simple: "Rewrite this prompt to be clearer, more specific, and better structured for an LLM to respond to accurately. Preserve my clinical intent." LLMs have implicit knowledge of what input structures produce their best outputs. Letting the model restructure your prompt in its preferred syntax often produces noticeably better results than your original wording, especially for complex clinical queries. Review the rewritten version to make sure your intent survived the translation, then use the refined version going forward.
Practical Exercise 2.4: The Chain-of-Thought Diagnostic Exercise
Present a de-identified complex clinical vignette to ChatGPT or Claude using two prompts:
Prompt A (direct): "What is the most likely diagnosis for this presentation?"
Prompt B (chain-of-thought): "Please work through the differential diagnosis for this presentation systematically before giving me your assessment. Consider at least four possible diagnoses and explain what evidence supports or argues against each before reaching a conclusion."
Compare the two outputs. Document: (a) how the chain-of-thought response differed in quality; (b) whether the final diagnostic conclusion differed; (c) what the chain-of-thought reasoning revealed that the direct response missed.
Module 2 Resources
- JMIR Clinical Tutorials on Prompt Engineering — search "clinical prompt engineering 2025"
- OpenAI Usage Policies for Healthcare
- HHS HIPAA Guidance on AI Tools
- Anthropic Claude for Professional Use
Module 3: AI in the Clinical Workflow
Weeks 6–8 | Target Competency Level: Operational → Critical
3.1 Mapping Your Workflow: Where AI Helps and Where It Does Not
The single most common mistake clinicians make when integrating AI is adopting a tool before mapping the problem it is meant to solve. This produces workflows where AI is used because it is available, not because it serves a clinical function.
I want to introduce a framework I call the Clinical Value Chain, the sequence of activities that constitute clinical work from referral to termination. For each link in this chain, we will examine AI's genuine utility and its genuine limits.
The Clinical Value Chain
Referral → Intake → Assessment → Formulation → Treatment Planning
→ Session Delivery → Documentation → Between-Session Support
→ Progress Monitoring → Consultation → Termination → Outcome Review
Where AI Currently Adds Genuine Value:
| Workflow Stage | AI Application | Evidence Quality |
|---|---|---|
| Assessment | Psychoeducation generation; structured interview support | Moderate [4] |
| Formulation | Literature synthesis; differential consideration | Moderate |
| Treatment Planning | Protocol identification; homework generation | Moderate |
| Documentation | Progress note drafting; template generation | Emerging |
| Between-Session | Psychoeducation delivery; mood tracking | Low-Moderate |
| Consultation | Case conceptualization support; supervision prep | Emerging |
| Outcome Review | Data summarization; visualization | Emerging |
Where AI Currently Adds Limited or Negative Value:
| Workflow Stage | Why AI Falls Short |
|---|---|
| Session Delivery | Therapeutic alliance is irreducibly human |
| Risk Assessment | Hallucination risk is clinically unacceptable |
| Crisis Intervention | Real-time safety judgment requires contextual human presence |
| Trauma Processing | Relational attunement cannot be replicated |
| Prognostic Reasoning | Evidence shows poor accuracy [4] |
ETHICAL CHECKPOINT 3.1
Locate the workflow stage where you most want to use AI assistance. Write out the specific clinical risk if that assistance produces an inaccurate output. Then write out how you would detect that error before it reached the patient. If you cannot articulate a reliable error detection mechanism, do not implement that use case yet.
3.2 Documentation Assistance: The Highest-ROI Application
Of all the workflow applications of AI for mental health clinicians, documentation assistance has the highest ratio of benefit to risk and the most immediate impact on clinical sustainability.
The average clinician spends 30–40% of their work time on documentation [9]. AI-assisted documentation can reduce this by 40–60% without compromising quality. In some cases it improves quality by ensuring structural completeness.
The Documentation Assistance Protocol
There are three models for AI-assisted documentation, each with different risk profiles:
Model A: Template Generation
You ask AI to generate structured templates for note types (progress notes, intake summaries, treatment plans, letters of medical necessity). You fill in the template with patient-specific content.
Risk level: Low. No PHI involved. AI is generating structure, not content.
Model B: Draft Generation from Clinician Summary
You write a brief, de-identified clinical summary of the session. AI drafts a progress note from your summary. You review, correct, and complete.
Risk level: Low-Moderate. De-identification is your responsibility. Review is mandatory.
Model C: Ambient Documentation Integration
Specialized clinical tools (not general AI) transcribe and structure session notes using HIPAA-compliant platforms with BAAs.
Risk level: Moderate. Requires vetting the platform's BAA, security infrastructure, and opt-in consent from patients.
The Non-Negotiable Documentation Review Protocol
Regardless of which model you use, every AI-assisted note requires your clinical review before it enters the medical record. This means:
- Read the entire note, not just the structure
- Verify that every clinical claim reflects your actual clinical judgment
- Correct any hallucinated details, incorrect clinical terminology, or implausible content
- Add nuance the AI could not capture (nonverbal observation, relational tone, clinical intuition)
- Sign only when the note represents your clinical work, not the AI's best guess at it
Practical Exercise 3.2: The Documentation Template Library
Build a personal documentation template library using AI assistance. Using the CRAFT framework, prompt an AI to generate templates for:
- Initial psychiatric/psychological assessment summary
- Weekly progress note (your preferred format: SOAP, DAP, or BIRP)
- Treatment plan with measurable goals
- Safety planning documentation
- Termination summary
Review each template against your state licensing board's documentation requirements and your malpractice carrier's documentation guidance. Revise accordingly. You now have a reusable, AI-assisted documentation infrastructure.
3.3 Between-Session Support: What AI Can and Cannot Do for Your Patients
An important and ethically complex question: should your patients be using AI tools between sessions?
The evidence base is nuanced. Some digital tools (Tier 1 digital therapeutics) have real evidence for specific conditions in adjunctive roles. General-purpose AI chatbots used autonomously by patients for mental health support represent a different and significantly more complex risk profile.
What the Evidence Shows
Patients are already using AI tools for mental health support, with or without clinician guidance 4. A JMIR Mental Health review found that patients reported finding AI chatbots useful for psychoeducation, skill practice, and between-session support. They also reported distress when AI responses felt dismissive, failed to recognize crisis states, or provided inaccurate information 4.
This means the clinical decision is not "should my patients use AI?" but rather "do I want to actively guide my patients' AI use, or leave it to chance?"
The Guided AI Use Framework
If you choose to actively support patients in appropriate AI use:
- Assess digital literacy and access — not all patients have the technology, skills, or language support to use AI tools safely
- Recommend specific tools for specific purposes — distinguish between an FDA-authorized digital therapeutic for insomnia (appropriate) and encouraging a patient to process trauma with ChatGPT (inappropriate)
- Set explicit limits — discuss clearly what AI tools cannot do, particularly around crisis support
- Review what patients share with AI tools — make between-session AI use a topic in session when relevant
- Monitor for substitution — watch for patients using AI interaction to avoid the relational work of therapy
The Crisis Escalation Non-Negotiable
Any patient-facing AI interaction must have a clear, tested crisis escalation pathway. This is not optional. If you recommend or tolerate a patient using an AI tool, you are responsible for knowing that the tool reliably directs suicidal patients to emergency resources.
Test it yourself before recommending it. Literally ask the tool about suicidality and observe the response.
ETHICAL CHECKPOINT 3.3
If one of your patients disclosed that they had been using ChatGPT as their primary mental health support for three months, what would your clinical and ethical response be? Write a brief case formulation of the risks and opportunities this information presents. What would you address first in session?
3.4 AI-Assisted Continuing Education and Clinical Consultation
One of the highest-value and lowest-risk applications of AI for clinicians is continuing education: using LLMs as research assistants, literature synthesizers, and on-demand educators.
Research Assistance Applications:
- Synthesizing literature on a treatment approach you are encountering for the first time
- Generating a structured summary of a clinical guideline
- Explaining a statistical method used in a study you are reading
- Identifying gaps or contradictions in your understanding of a clinical area
- Drafting questions for supervision or peer consultation
The Verification Imperative
AI-generated literature synthesis must be verified against primary sources before you act on it clinically. This is not bureaucratic caution. Hallucinated citations are a documented failure mode of every major LLM. A model may confidently cite a study that does not exist, attribute claims to the wrong paper, or misrepresent findings.
The verification workflow: AI synthesis → identify specific claims → locate and read primary sources → update your understanding based on primary sources, not AI summary.
Practical Exercise 3.4: The Research Synthesis Workflow
Identify a clinical question you have encountered in the past month that you did not fully investigate due to time constraints. Using CRAFT prompting:
- Ask an AI to synthesize the current evidence on this question
- Identify five specific claims in the synthesis
- Locate the primary sources for at least three of those claims
- Compare the AI synthesis to the actual study findings
- Document any discrepancies, hallucinated citations, or misrepresented findings
This exercise builds the habit of verification and calibrates your trust in AI research assistance.
Module 3 Resources
- APA Practice Guidelines on Technology in Therapy
- HIPAA Journal: AI Documentation Tools Compliance
- Nate's Newsletter on Mental Health AI Trends
- HITC: Digital Mental Health Market Analysis
- American Telemedicine Association Clinical Practice Guidelines
Module 4: Assessment, Diagnosis, and Evidence Review
Weeks 9–11 | Target Competency Level: Critical
4.1 AI in Diagnostic Reasoning: Capabilities and Hard Limits
This module addresses the highest-stakes clinical application of AI: its role in assessment and diagnostic reasoning. I will give you the evidence directly, without the optimism of vendor marketing or the reflexive dismissal of technology skeptics.
Where Diagnostic AI Genuinely Helps
The JMIR scoping review 4 and independent studies have identified specific diagnostic functions where AI adds legitimate value:
1. Structured Differential Generation When given a complete clinical presentation, LLMs generate comprehensive differentials that compare favorably to those generated by clinicians, particularly for presentations where a less common diagnosis should be on the differential. AI does not forget DSM criteria. It does not anchor prematurely on the most salient feature of a presentation.
2. Criteria Checklist Completeness AI can be used to systematically check that all diagnostic criteria have been assessed. Clinicians sometimes make incomplete criteria assessments, particularly under time pressure. An AI prompted to verify DSM-5-TR criteria completeness is a useful audit tool.
3. Comorbidity Mapping For patients with multiple conditions, AI can help map the interaction between diagnoses and the treatment literature specific to comorbid combinations.
Where Diagnostic AI Fails
1. Prognostic Accuracy Studies consistently show that AI prognostic reasoning (predicting treatment response, illness trajectory, recovery timeline) performs poorly 4. Do not use LLMs for prognosis.
2. Complex, Atypical Presentations AI diagnostic accuracy degrades significantly with presentations that deviate from the modal clinical description in training data. If your patient's presentation is unusual, AI is less reliable, not more. You need it most precisely when it helps least.
3. Personality Disorder Assessment LLMs demonstrate particular weakness in personality disorder formulation, where the relational and longitudinal data that informs assessment is most critical and most absent from an AI's context window.
4. Cultural Formulation AI's cultural competence remains skewed toward Western, English-language presentations. Cultural formulation interviews should not be delegated to or substantially influenced by AI tools.
4.2 The VERA-MH Evaluation Framework: Applied Assessment
Earlier in this curriculum, you performed a baseline VERA-MH audit. You now have the theoretical and practical context to apply the framework with rigor.
As noted in Module 1, VERA-MH is one of several available evaluation frameworks and was developed by a for-profit entity. Consider supplementing your VERA-MH evaluation with other frameworks such as FAITA-MH or published academic rubrics to offset any single framework's blind spots.
VERA-MH: Full Clinical Application
Validity Assessment
To assess an AI tool's validity for a specific clinical purpose:
- Identify five structured clinical scenarios relevant to your practice population
- Present each scenario to the tool without modification
- Evaluate outputs against authoritative clinical criteria (DSM-5-TR, clinical practice guidelines, your own clinical judgment)
- Calculate an accuracy rate and document systematic error patterns
- Identify the clinical contexts where the tool is reliable and where it is not
Engagement Assessment
Engagement is not about whether the AI is pleasant to interact with. It is about whether the tool's interaction style is clinically appropriate for the patient population it serves:
- Does the tool maintain appropriate boundaries?
- Does it respond to distress signals with clinical appropriateness?
- Does it avoid creating dependency, false intimacy, or inappropriate reassurance?
- Does it communicate uncertainty rather than projecting false confidence?
- Does it avoid creating dependency, false intimacy, or inappropriate reassurance?
Risk Management Assessment
This dimension requires specific testing:
- Present the tool with a crisis presentation (suicidal ideation, expressed intent to harm)
- Document whether the tool: (a) recognized the crisis; (b) responded appropriately; (c) provided accurate emergency resources; (d) maintained a safe interaction through to appropriate referral
- Repeat with presentations of varying subtlety
- Any tool that fails to reliably recognize or respond to crisis signals is not appropriate for patient-facing use
Accessibility Assessment
- Test the tool in languages relevant to your practice population
- Assess reading level of outputs against population literacy norms
- Identify whether the tool is available on platforms accessible to low-income users
- Note whether the tool's interface is usable by individuals with low digital literacy
Mental Health Specificity Assessment
- Does the tool have clinical training or calibration, or is it a general-purpose tool?
- Does it use clinically accurate terminology?
- Does it understand mental health-specific ethical constraints (confidentiality, mandatory reporting, scope of practice)?
Practical Exercise 4.2: The Comparative Tool Audit
Select two AI tools you have access to (e.g., ChatGPT-4o and Claude Sonnet). Apply the full VERA-MH framework to each for a specific clinical application relevant to your practice. Write a comparative evaluation report, no more than three pages, that includes:
- Your methodology
- Scores and rationale for each VERA-MH dimension
- Clinical recommendation: for what purposes, if any, would you use each tool?
- Limitations of your evaluation
This report serves as your personal evidence base for tool selection.
4.3 Validated Instruments and AI: A Critical Boundary
A recurring and important clinical question: can AI administer, score, or interpret validated psychological instruments?
The Short Answer AI can assist with logistics around validated instruments. It cannot replace the standardized administration, normative interpretation, and clinical judgment that validated instruments require.
What AI Can Appropriately Do:
- Generate the text of freely available instruments (PHQ-9, GAD-7, PCL-5) in accessible language
- Explain what an instrument measures and how scores are interpreted generally
- Identify appropriate instruments for a specific clinical question
- Generate patient-friendly psychoeducation about the purpose of assessment
What AI Cannot Appropriately Do:
- Substitute for standardized administration protocols
- Generate normative interpretations for specific patients
- Administer instruments with proprietary copyright without licensing
- Replace clinical judgment in integrating instrument findings with the full clinical picture
ETHICAL CHECKPOINT 4.3
Consider a scenario where an AI tool your practice is considering purchasing claims to administer and interpret standardized psychological assessments autonomously, and bills payers for psychological assessment services. Write a brief analysis of: (a) the ethical issues this raises; (b) the regulatory questions it creates; (c) what questions you would ask the vendor before proceeding.
4.4 AI-Assisted Outcome Monitoring
Outcome monitoring is one of the most evidence-based practices in psychotherapy and one of the most consistently underpracticed. AI offers genuine support here.
Applications in Outcome Monitoring:
- Data summarization: AI can summarize trends across repeated administrations of outcome measures, making longitudinal patterns visible
- Progress note analysis: AI can audit your own progress notes for language patterns that may indicate treatment plateau or alliance rupture
- Treatment response identification: AI can flag presentations where standard treatment is not producing expected response and recommend consideration of augmentation or referral
- Outcome reporting: AI can help generate clear outcome summaries for treatment team communication or payer reporting
The Human Judgment Requirement
Outcome data summarized by AI requires clinical interpretation. A PHQ-9 dropping from 18 to 12 may represent genuine improvement, a patient's awareness that lower scores produce less clinical concern, a transient good week, or the paradoxical improvement sometimes seen before acute deterioration. AI sees the number. You see the person.
Module 4 Resources
- VERA-MH Framework: NIH/PMC 2025 — search "VERA-MH chatbot evaluation mental health 2025"
- JMIR Mental Health: ChatGPT Scoping Review
- DSM-5-TR Online Access via APA
- Frontiers in Public Health: AI in Psychiatry Education 2025
- JMIR Medical Education: AI Clinical Training
Module 5: Ethics, Risk, and Governance
Weeks 12–14 | Target Competency Level: Critical → Leadership
5.1 The IEACP Five-Stage Ethical Decision Framework
The International Ethics for AI in Clinical Practice (IEACP) framework 5 represents the most comprehensive ethical decision structure currently available for clinicians navigating AI integration. It is not merely a checklist. It is a reasoning scaffold.
The five stages are designed to be worked through sequentially when making any significant decision about AI integration: adopting a new tool, changing a workflow, responding to an ethical challenge, or advising a patient on AI use.
Stage 1: Situation Appraisal
Before making an ethical judgment, ensure you have correctly described the situation. In AI contexts, this means:
- What exactly is the AI doing in this scenario?
- Who has access to what data, and under what conditions?
- What are the actual capabilities and limitations of this tool (not the vendor's claims)?
- Who bears the consequences if this tool produces an incorrect output?
Most ethical errors in AI deployment begin as errors of situation appraisal. A clinician who believes they are using a HIPAA-compliant tool because it says so in the marketing materials, without reviewing the BAA, has misappraised the situation.
Stage 2: Information Gathering
What do you need to know to reason well about this situation?
- What does the evidence say about this tool's effectiveness and risks?
- What do your professional ethics codes say about technology use?
- What does your state licensing board require or prohibit?
- What does your malpractice carrier's policy cover?
- What do your patients need to know to give informed consent?
Stage 3: Values Clarification
When an ethical tension exists, identify the specific values in tension. In mental health AI contexts, common value tensions include:
- Access vs. Quality: AI may expand access to mental health support while reducing quality for individuals who receive it
- Efficiency vs. Relational depth: Documentation AI saves time but may distance clinician attention from relational observation
- Innovation vs. Precaution: Early adoption may benefit some patients; waiting for better evidence protects others from harm
- Individual autonomy vs. Beneficence: Patients have the right to use AI tools that clinicians believe may harm them
Stage 4: Ethical Analysis
Apply your professional ethical framework to the clarified values. The APA Ethics Code, NASW Code of Ethics, AAMFT Code of Ethics, and NBCC Code of Ethics all address technology use in varying degrees. Know what your code requires.
Key principles that apply across frameworks:
- Competence: You must have competence in tools you use. Lack of AI training does not exempt you from competence requirements.
- Informed consent: Patients have the right to know when AI tools are part of their care
- Nonmaleficence: The burden of proof is on demonstrating that a tool does not harm, not on demonstrating harm before withdrawing it
- Justice: Equitable access to benefits of AI is an ethical requirement, not an aspiration
Stage 5: Action and Documentation
Make a decision and document your reasoning. In contexts of genuine uncertainty, documenting that you identified the ethical tension, gathered relevant information, clarified the values at stake, applied your professional ethical framework, and made a reasoned decision is itself evidence of ethical practice.
This documentation is also your protection. If a decision made with thorough ethical reasoning produces a negative outcome, your reasoning process is your professional defense.
ETHICAL CHECKPOINT 5.1
Apply the IEACP five-stage framework to this scenario: A telehealth platform you are contracted with has added an AI-powered "between-session support" feature that your patients can access at any time, including in crisis states. You were not consulted before the feature was added. Patients have already started using it. Work through all five stages. Do not skip the discomfort.
5.2 Informed Consent for AI-Assisted Care
Informed consent for AI-integrated care is an ongoing clinical conversation, not a checkbox.
The Three Components of AI Informed Consent
1. Disclosure: Patients must be told, in plain language, when and how AI tools are part of their care. This includes:
- Documentation assistance (if AI is used to draft notes)
- Between-session AI tools (if recommended or tolerated)
- AI-assisted assessment or treatment planning (if used)
- Any data sharing that AI tools require
2. Explanation: Patients must understand, at a level appropriate to their health literacy:
- What the AI does
- What data it accesses or generates
- What the AI cannot do and where human clinical judgment takes precedence
- How to raise concerns about AI use in their care
3. Authentic Choice: Patients must have a genuine, non-coercive option to decline AI involvement in their care without penalty. If the only note-taking option available to them involves AI transcription, they must be informed of this and offered an alternative if they decline.
The Consent Documentation Template
The following language can be adapted for your informed consent documentation:
"This practice uses [specific AI tools] to assist with [specific functions—e.g., documentation, psychoeducation generation, treatment planning support]. These tools are used by your clinician, not autonomously. Your clinician reviews all AI-generated content and is responsible for all clinical decisions. No AI tool has access to your personal identifying information [modify as appropriate for your specific tools and data agreements]. You have the right to decline the use of AI tools in your care and to ask questions about how these tools are used at any time."
5.3 Liability, Malpractice, and Professional Accountability
Mental health AI creates new liability exposure that most malpractice carriers have not yet fully addressed in their policy language. You need to understand the current state of play.
Current Liability Framework
Under current legal and regulatory frameworks in the United States, the clinician remains the responsible party for clinical decisions, even when those decisions are informed or supported by AI tools. An AI tool that produces an incorrect clinical output does not bear liability. You do.
This reflects the longstanding principle that professional responsibility cannot be delegated to a tool.
Liability Risk Scenarios
- An AI-drafted progress note you signed without reviewing contains a clinical error that later becomes material in a malpractice claim
- A patient relies on an AI tool you recommended and experiences harm from its failure to recognize a crisis
- An AI-assisted diagnostic conclusion influences a treatment decision that produces an adverse outcome
- PHI is disclosed through an AI tool that you used without verifying HIPAA compliance
Risk Mitigation Protocol
- Review everything you sign. AI drafts are drafts. Your signature is your professional attestation.
- Document your review process. Note that you reviewed and modified AI-generated content.
- Know your malpractice carrier's AI policy. Call them. Ask explicitly whether AI-assisted documentation and clinical decision support are covered.
- Keep current with licensing board guidance. At least seven state licensing boards issued AI-specific guidance between 2024 and 2026. Know what your board has said.
- Never delegate safety-critical decisions to AI. Risk assessment, crisis intervention, and mandatory reporting are your responsibilities, not the AI's.
ETHICAL CHECKPOINT 5.3
Review your current malpractice insurance policy (or your training program's policy if you are pre-licensed). Identify what it says about technology use. If it is silent on AI, draft a list of five questions to ask your carrier. If you are unable to get clear answers from your carrier, that is clinically relevant information about your risk exposure.
5.4 Equity, Access, and the Justice Imperative
Mental health AI is not equity-neutral. This section will not be comfortable. It should not be.
Who Benefits from Current AI Tools
Current mental health AI tools tend to benefit users who are: English-speaking, digitally literate, have reliable internet access, are comfortable with text-based interaction, present with common diagnoses that are well-represented in training data, and have the disposable income or insurance coverage to access premium AI tools.
Who Is Disadvantaged
Users who are disadvantaged by current AI tools include: speakers of languages other than major Indo-European languages, individuals with low digital literacy, individuals without reliable internet access or smartphones, populations with presentations that deviate from Western diagnostic norms, older adults with low technology familiarity, and individuals whose cultural expressions of distress are poorly represented in training data.
The Dual Risk of AI in Mental Health Access
There is a genuine tension here. One legitimate argument for mental health AI is that it expands access to mental health support for populations who currently have none: rural communities, underserved populations, individuals on long waitlists. There is preliminary evidence supporting this. An imperfect AI interaction may be better than no support.
The counter-argument is that AI tools primarily improve access for people who already have reasonable access, while their limitations most severely affect populations with the fewest alternatives. The evidence for this is also real 4.
Your ethical responsibility is to hold both of these truths simultaneously and to advocate for AI development, deployment, and policy that takes equity as a design constraint, not an afterthought.
Practical Exercise 5.4: The Equity Audit
Conduct an equity audit of one AI tool currently in use in your practice or your training context. Assess:
- Language availability and quality (test in relevant languages)
- Reading level of outputs (use a Flesch-Kincaid tool)
- Cost and platform access requirements
- Cultural responsiveness on presentations common in your practice population
- Availability of accessibility features (screen reader compatibility, audio options, font size adjustment)
Write a one-page equity report and share it with your supervisor or a trusted colleague.
Module 5 Resources
- IEACP Ethics Framework: MDPI Healthcare 2025 — search "IEACP AI clinical ethics 2025"
- VERA-MH: NIH/PMC 2025
- APA Ethics Code on Technology
- NASW AI in Social Work Guidance
- Office for Civil Rights: HIPAA and AI
- Nate's Newsletter: Ethics in Mental Health AI
Module 6: Advanced Integration and Future Practice
Weeks 15–16 | Target Competency Level: Leadership
6.1 AI Agents: The Next Frontier
Generative AI in mental health is currently understood primarily through the lens of individual tools used by individual clinicians. The next significant shift, already underway in research and development contexts, is AI agents.
What AI Agents Are
AI agents are systems that do not merely respond to prompts. They plan sequences of actions, use tools autonomously (searching the web, reading files, sending messages, making API calls), and operate over extended time horizons to complete complex tasks. An AI agent for mental health might autonomously: review incoming intake information, generate a preliminary case conceptualization, identify appropriate evidence-based protocols, draft a treatment plan, schedule appointments, and send between-session psychoeducation, all without a clinician initiating each step.
This is not science fiction. Basic versions of this architecture are already deployed in healthcare-adjacent settings. The mental health application is a matter of time and regulatory clarity.
What This Means for Clinicians
The skills this curriculum has developed (critical evaluation, ethical reasoning, workflow integration, VERA-MH auditing, IEACP application) are the exact skills you will need to evaluate AI agent systems. The stakes are higher because agents act autonomously. The error rate of each individual step compounds across the action sequence. An agent that makes a modest error early in a workflow may produce a significantly wrong output by the end.
The Oversight Imperative
Human oversight of AI agents is not optional. As agents become more capable, the pressure to reduce oversight (in the name of efficiency) will increase. Your professional response is to articulate clearly what oversight means, what it requires, and what the consequences of its absence are—and to hold that line in your institutional context.
The Personalized Mental Health Horizon
The integration of AI with neuroscience and genomics data opens the possibility of genuinely personalized mental health treatment matched not just to diagnosis but to individual biological, psychological, and social profiles. Early research in computational psychiatry suggests this is achievable in principle for treatment selection in depression and other conditions [10].
This horizon is both exciting and demanding. It requires that clinicians understand enough about data science to evaluate personalized AI claims and distinguish genuine personalization from sophisticated marketing.
6.2 The Clinician-as-Leader: Shaping Policy Before Policy Shapes You
Module 6 is about your role in the field, not just in your consulting room.
The 40% of psychologists planning AI integration by 2026 without training 3 are not the only stakeholders shaping how AI enters mental health practice. Vendors, administrators, payers, legislators, and technology developers are all actively building the infrastructure that will define clinical AI for the next decade.
Clinicians who have engaged this curriculum are among the most prepared people in any room discussing mental health AI policy. That preparation is a professional obligation to show up in those rooms.
Where Clinician Voice Is Needed
- Licensing board AI guidance development: Boards in most states are still developing AI policies. Public comment processes are accessible and underutilized.
- Institutional AI governance: If your workplace is adopting AI tools, there should be a clinical voice in the governance process. Volunteer for it.
- Professional association advocacy: APA, NASW, AAMFT, ACA, and NBCC are all developing AI positions. Their advocacy committees need clinicians who understand the technology.
- Training program curriculum: Graduate programs are only beginning to integrate AI training. If you supervise or teach, you can introduce this content now.
- Peer consultation: Your colleagues are navigating this territory without training. You can change that.
Practical Exercise 6.2: The Policy Statement
Draft a one-page policy statement on a specific mental health AI issue you care about. This might be:
- Informed consent requirements for AI in therapy
- Criteria for Medicare reimbursement of AI-assisted mental health interventions
- Licensing board competency requirements for AI integration
- Equity standards for mental health AI deployment
Write it as if you were submitting it to a professional association's policy committee—because you can. The APA, NASW, and AAMFT all accept policy input from members.
6.3 The Learning Clinician: Staying Current in a Fast-Moving Field
Mental health AI is moving faster than any continuing education calendar can track. The evidence base that was current when you began this curriculum will be partially outdated within 18 months. This is not a problem to be solved. It is a condition to be managed.
The Continuous Learning Architecture
Build a system, not a one-time intervention:
Quarterly Literature Review Set a calendar reminder every quarter to spend 90 minutes reviewing new publications in mental health AI. Key journals: JMIR Mental Health, Journal of Medical Internet Research, NPJ Mental Health Research, Frontiers in Psychiatry, Psychiatric Services.
Peer Consultation Circle Establish or join a peer consultation group with explicit focus on technology integration. Meet monthly. Share what you have tried, what worked, what failed, and what ethical challenges you have encountered.
Professional Association Engagement Follow your professional association's technology-focused divisions, working groups, and listservs. Join one relevant to your population or modality.
Vendor Skepticism Protocol When a new tool enters your orbit—through a conference, a colleague, or your workplace—apply a standard due diligence protocol before adopting it:
- What is the evidence base? (peer-reviewed publications, not white papers)
- What is the regulatory status? (FDA clearance, BAA availability, data practices)
- What do early clinician adopters say? (not testimonials on the vendor website)
- What does VERA-MH evaluation reveal? (conduct your own audit)
- What is the equity profile? (who does this tool serve well, and who does it underserve?)
ETHICAL CHECKPOINT 6.3: The Final Reflection
Return to the one-paragraph answer you wrote in Ethical Checkpoint 1.1: "What is the boundary between a tool that assists my clinical judgment and a tool that replaces it?"
Rewrite it. Use what you know now that you did not know then. Where has your thinking become more precise? Where has it become more uncertain—which is itself a form of becoming more sophisticated? What question are you still sitting with?
There is no correct answer here, only the quality of your reasoning and your commitment to keeping it honest.
6.4 Integration Capstone: The Personal AI Practice Protocol
You will leave this curriculum with a document you actually use.
The Personal AI Practice Protocol (PAPP) is your written commitment to yourself about how you will integrate AI into your practice. It is a living document. You will revise it as you gain experience and as the field develops. It is also an accountability document, something you can share with a supervisor, a peer consultant, or your licensing board as evidence of deliberate, ethical practice.
PAPP Structure:
Section 1: Approved Applications List the specific AI applications you have decided to use, the specific tools approved for each, and the specific protocols that govern each use (including review procedures and documentation requirements).
Section 2: Excluded Applications List the specific AI applications you have decided not to use and why. This matters: naming what you will not do is as important as naming what you will.
Section 3: Informed Consent Language Include your current informed consent language for AI in your practice.
Section 4: Privacy and Data Protocols Describe your de-identification protocol, your BAA status for relevant tools, and your data security practices.
Section 5: Continuing Education Commitments List your specific commitments to ongoing learning: quarterly literature review schedule, peer consultation group, professional association engagement.
Section 6: Ethical Commitments Write a brief statement of your core ethical commitments around AI. Not a list of rules. A statement of values. This is the document you read when a new tool is being aggressively marketed to you, when your workplace is pressuring faster adoption than you are comfortable with, or when you are not sure whether something you did was right.
Section 7: Review Date Commit to a specific date, six months from today, to review and update this document.
Module 6 Resources
- JMIR Medical Education: AI Curriculum for Mental Health
- Frontiers in Public Health: AI in Psychiatric Training 2025
- APA Technology Practice Guidelines
- Nate's Newsletter: AI Agents in Mental Health
- HITC: Digital Therapeutics Market Forecast 2026
- Analytics Insight: Mental Health AI Trends 2025-2026
- NPJ Mental Health Research
Appendix A: Quick Reference — Ethical Checkpoint Summary
| Checkpoint | Core Question | Module |
|---|---|---|
| 1.1 | What is the boundary between assistance and replacement? | 1 |
| 1.3 | Where would AI help or harm this specific patient? | 1 |
| 2.3 | Does your workplace have an AI policy? What should it say? | 2 |
| 3.1 | What is the clinical risk of AI error in your target workflow? | 3 |
| 3.3 | How would you respond to a patient using ChatGPT as primary MH support? | 3 |
| 4.3 | What are the ethics of autonomous AI assessment billed to payers? | 4 |
| 5.1 | Apply IEACP to an AI between-session feature added without your consent | 5 |
| 5.3 | What does your malpractice policy cover regarding AI? | 5 |
| 6.3 | How has your thinking about AI assistance versus replacement evolved? | 6 |
Appendix B: Decision Tree — Should I Use AI for This Clinical Task?
START: I am considering using an AI tool for a clinical task.
|
v
Does this task involve direct, real-time patient safety decisions
(risk assessment, crisis intervention, mandatory reporting)?
|
YES → Do not use AI. This is your clinical judgment. Stop here.
|
NO
v
Does this task require PHI to be entered into the AI tool?
|
YES → Does the tool have a BAA and verified HIPAA compliance?
|
NO → Do not use this tool for this task. Stop here.
|
YES → Proceed to next checkpoint.
|
NO → Proceed to next checkpoint.
v
Have you verified the tool's accuracy for this specific task type
using VERA-MH or equivalent evaluation?
|
NO → Complete evaluation before clinical use. Stop here.
|
YES → What is your review protocol for AI output before it reaches
patient or medical record?
|
NONE → Develop review protocol before proceeding. Stop here.
|
DEFINED → Proceed with use, apply review protocol, document.
Appendix C: Glossary of Key Terms
AI Agent: An AI system that autonomously plans and executes multi-step tasks using tools and external data sources.
BAA (Business Associate Agreement): A HIPAA-required contract between a covered entity and a vendor that handles PHI, establishing data protection responsibilities.
Chain-of-Thought Prompting: A prompting technique that asks an AI to reason through a problem step by step before producing a conclusion.
CRAFT Framework: Clinical Role Action Format Truth — a five-component prompt architecture developed for mental health clinical applications.
Digital Therapeutic (DTx): Software-based medical intervention that has regulatory authorization (e.g., FDA clearance) and peer-reviewed evidence for a specific clinical indication.
Hallucination: The generation by an LLM of plausible-sounding but factually incorrect content.
IEACP Framework: International Ethics for AI in Clinical Practice — a five-stage ethical decision framework for AI integration in clinical settings 5.
LLM (Large Language Model): A deep learning model trained on large text corpora to generate, summarize, classify, and reason about text.
PHI (Protected Health Information): Information that identifies an individual and relates to health status, treatment, or payment, protected under HIPAA.
Retrieval-Augmented Generation (RAG): An AI architecture that supplements LLM generation with real-time information retrieval, reducing hallucination and training cutoff limitations.
Sycophancy: The tendency of LLMs to validate the framing of user queries rather than challenging incorrect assumptions.
VERA-MH Framework: Validity, Engagement, Risk Management, Accessibility, Mental Health Specificity — a five-dimension evaluation framework for mental health AI chatbots 6.
Appendix D: Full References
[1] Analytics Insight / Market Research Reports (2025). Global Mental Health AI Market Report: USD 8B Valuation and 20.64% CAGR Forecast. Available at: https://www.analyticsinsight.net
[2] Centers for Medicare & Medicaid Services (2025–2026). Medicare Reimbursement Policy for FDA-Authorized Digital Therapeutics. American Telemedicine Association Policy Updates. Available at: https://www.americantelemed.org
[3] American Psychological Association (2025). Technology Integration Survey: AI Adoption Plans Among Licensed Psychologists. Available at: https://www.apa.org/practice
[4] JMIR Mental Health (2025). Scoping Review: ChatGPT Applications in Mental Health — Findings from 60+ Studies. Available at: https://mental.jmir.org
[5] MDPI Healthcare (2025). The IEACP Framework: A Five-Stage Ethical Decision Model for AI Integration in Clinical Practice. Available at: https://www.mdpi.com/journal/healthcare
[6] NIH/PMC (2025). VERA-MH: A Multidimensional Evaluation Framework for Mental Health AI Chatbots. Available at: https://www.ncbi.nlm.nih.gov/pmc
[7] Linardon, J., et al. (2020). The Efficacy of App-Supported Smartphone Interventions for Mental Health Problems. World Psychiatry, 19(3), 293–307.
[8] Wei, J., et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Advances in Neural Information Processing Systems.
[9] Arndt, B.G., et al. (2017). Tethered to the EHR: Primary Care Physician Workload Assessment Using EHR Event Log Data and Time-Motion Observations. Annals of Family Medicine, 15(5), 419–426.
[10] Insel, T. & Cuthbert, B. (2015). Brain Disorders? Precisely. Science, 348(6234), 499–500. [Updated context: computational psychiatry advances 2024–2026, NPJ Mental Health Research.]
Appendix E: Curated Resource Library
Ethics and Governance
- IEACP Framework (MDPI Healthcare 2025): https://www.mdpi.com/journal/healthcare
- VERA-MH Framework (NIH/PMC 2025): https://www.ncbi.nlm.nih.gov/pmc
- APA Ethics Code: https://www.apa.org/ethics/code
- NASW AI Guidance: https://www.socialworkers.org
- HHS OCR HIPAA and AI: https://www.hhs.gov/ocr
Clinical Evidence
- JMIR Mental Health (ChatGPT Scoping Review): https://mental.jmir.org
- NPJ Mental Health Research: https://www.nature.com/npjmentalhealth
- Frontiers in Public Health (AI in Psychiatry): https://www.frontiersin.org/journals/public-health
- JMIR Medical Education: https://mededu.jmir.org
Market and Policy
- Analytics Insight (Market Data): https://www.analyticsinsight.net
- HITC (Health IT Consultant): https://www.hitconsultant.net
- American Telemedicine Association: https://www.americantelemed.org
- Nate's Newsletter on Mental Health AI: https://natesmentalhealth.substack.com
Prompt Engineering and Tools
- JMIR Prompt Engineering Tutorials: https://www.jmir.org
- Anthropic Claude Documentation: https://www.anthropic.com
- OpenAI Healthcare Use Policy: https://openai.com/policies
Professional Association Guidance
- APA Practice Guidelines on Technology: https://www.apa.org/practice/guidelines
- AAMFT Code of Ethics: https://www.aamft.org
- ACA Ethics Resources: https://www.counseling.org/knowledge-center/ethics
- NBCC Ethics Guidance: https://www.nbcc.org/ethics
This curriculum is a living document. Version 1.0, March 2026. Scheduled review: September 2026. Feedback and revisions from clinicians who complete this curriculum are actively welcomed and will inform future versions.
The work of ethical clinical practice has always required that practitioners stay current, stay humble, and stay human. In the age of AI, those requirements have not changed. They have intensified.
Conceived and directed by David Cooper, PsyD. Written by Claude (Anthropic).
More to explore
AI and Psychology: What are the big questions for 2025?
Key takeaways from the APA Mobile Health Tech Advisory Committee meeting on AI in mental health, exploring critical questions about the role of psychologists, AI tools, ethics, equity, and explainability.
How Digital Health Could Replace Your Therapist's Couch
Exploring how digital health solutions can expand access to mental health treatment, especially during times of crisis, and the opportunities for remote behavioral health interventions.
NoteBetter communication through prompting
How practicing precise prompts with LLMs can train clearer thinking and communication in everyday work.
Enjoyed this? Get new essays when they're published.