Thanks to advances in artificial intelligence, computers can now assist doctors in diagnosing disease and help monitor patient vital signs from hundreds of miles away.
Now, CU Boulder researchers are working to apply machine learning to psychiatry, with a speech-based mobile app that can categorize a patient’s mental health status as well as or better than a human can.
“We are not in any way trying to replace clinicians,” says Peter Foltz, a research professor at the Institute of Cognitive Science and co-author of a new paper in Schizophrenia Bulletin that lays out the promise and potential pitfalls of AI in psychiatry.
“But we do believe we can create tools that will allow them to better monitor their patients.”
Nearly one in five U.S. adults lives with a mental illness, many in remote areas where access to psychiatrists or psychologists is scarce.
Others can’t afford to see a clinician frequently, don’t have time or can’t get in to see one.
Even when a patient does make it in for an occasional visit, therapists base their diagnosis and treatment plan largely on listening to a patient talk – an age-old method that can be subjective and unreliable, notes paper co-author Brita Elvevåg, a cognitive neuroscientist at the University of Tromsø, Norway.
“Humans are not perfect. They can get distracted and sometimes miss out on subtle speech cues and warning signs,” Elvevåg says. “Unfortunately, there is no objective blood test for mental health.”
In pursuit of an AI version of that blood test, Elvevåg and Foltz teamed up to develop machine learning technology able to detect day-to-day changes in speech that hint at mental health decline.
For instance, sentences that don’t follow a logical pattern can be a critical symptom in schizophrenia.
Shifts in tone or pace can hint at mania or depression. And memory loss can be a sign of both cognitive and mental health problems.
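One rough way to picture "sentences that don't follow a logical pattern" is to measure how much vocabulary each sentence shares with the one before it. The sketch below is only an illustration, not the researchers' method: the function names are hypothetical, and plain word overlap is a crude stand-in for the richer language models a real system would need.

```python
import math
import re
from collections import Counter


def sentence_vectors(text):
    """Split text into sentences and return a word-count vector for each."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_adjacent_similarity(text):
    """Average similarity of consecutive sentences; None if fewer than two."""
    vecs = sentence_vectors(text)
    if len(vecs) < 2:
        return None
    sims = [cosine(x, y) for x, y in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims)


# Illustrative inputs (invented for this sketch, not study data).
on_topic = "The dog chased the ball. The dog caught the ball."
off_topic = "The dog chased the ball. Taxes rise every winter."
print(mean_adjacent_similarity(on_topic) > mean_adjacent_similarity(off_topic))
```

A lower score means consecutive sentences share less vocabulary. That alone says nothing clinical; it only shows the kind of signal that could be tracked over time.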
“Language is a critical pathway to detecting patient mental states,” says Foltz. “Using mobile devices and AI, we are able to track patients daily and monitor these subtle changes.”
The new mobile app asks patients to answer a 5- to 10-minute series of questions by talking into their phone.
Among various other tasks, they’re asked about their emotional state, asked to tell a short story and to listen to a story and repeat it, and given a series of touch-and-swipe motor-skills tests.
In collaboration with Chelsea Chandler, a computer science graduate student at CU Boulder, and other colleagues, they developed an AI system that assesses those speech samples, compares them to previous samples by the same patient and the broader population and rates the patient’s mental state.
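The article does not say how the system performs these comparisons. As a minimal sketch, assume each speech sample has already been reduced to a single number (for example, a hypothetical speaking rate in words per minute). A new value can then be scored against the patient's own history and against a reference sample from the broader population. The function names and the threshold of 2.0 are assumptions made for illustration.

```python
from statistics import mean, stdev


def z_score(value, reference):
    """Number of standard deviations between value and the mean of reference.

    reference needs at least two values; returns 0.0 if it has no spread.
    """
    spread = stdev(reference)
    return (value - mean(reference)) / spread if spread else 0.0


def flag_change(today, own_history, population, threshold=2.0):
    """Score today's value against the patient's past and a population sample.

    The flag is raised only on deviation from the patient's own baseline;
    the population score is reported alongside it for context.
    """
    own_z = z_score(today, own_history)
    pop_z = z_score(today, population)
    return {"own_z": own_z, "population_z": pop_z, "flag": abs(own_z) >= threshold}


# Invented numbers, purely for illustration.
history = [150, 152, 148, 151, 149]   # the patient's earlier sessions
population = [120, 140, 160, 180]     # a hypothetical reference sample
print(flag_change(120, history, population))
```

Comparing against the patient's own baseline matters because a value that is unremarkable for the population can still be a sharp change for one individual, as in the example above.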
In one recent study, the team asked human clinicians to listen to and assess speech samples of 225 participants – half with severe psychiatric issues; half healthy volunteers – in rural Louisiana and Northern Norway.
They then compared those results to those of the machine learning system.
“We found that the computer’s AI models can be at least as accurate as clinicians,” says Foltz.
He and his colleagues envision a day when AI systems they’re developing for psychiatry could be in the room with a therapist and a patient to provide additional data-driven insight, or serve as a remote-monitoring system for the severely mentally ill.
If the app detected a worrisome change, it could notify the patient’s doctor to check in.
“Patients often need to be monitored with frequent clinical interviews by trained professionals to avoid costly emergency care and unfortunate events,” says Foltz. “But there are simply not enough clinicians for that.”
Foltz previously helped develop and commercialize an AI-based essay-grading technology that is now broadly used.
In their new paper, the researchers lay out a call to action for larger studies to prove efficacy and earn public trust before AI technology could be broadly brought into clinical practice for psychiatry.
“The mystery around AI does not nurture trustworthiness, which is critical when applying medical technology,” they write.
“Rather than looking for machine learning models to become the ultimate decision-maker in medicine, we should leverage the things that machines do well that are distinct from what humans do well.”
Preventive strategies in young people at clinical high risk for psychosis [CHR-P (1)] can ameliorate the high personal, familial, societal, and clinical burden of psychotic disorders (2). CHR-P criteria, which include the ultra-high-risk state [e.g., at-risk mental state (3) or other psychosis-risk syndromes (4)] and/or basic symptoms (5), are detected by specialized clinical services (6) through established psychometric assessment tools (7), in the context of a clinical interview (8).
These tools are internationally validated (7) and assess whether the individual meets the criteria for at least one of the three ultra-high-risk subgroups: attenuated psychotic symptoms (∼85% of cases), genetic risk and deterioration syndrome (5% of cases), or brief and limited intermittent psychotic symptoms (BLIPS, 10% of cases) (3, 9).
Individuals at CHR-P recruited from help-seeking clinical samples have a 20% probability of developing emerging psychotic disorders [but not other nonpsychotic disorders (10, 11)] over 2 years (12).
This risk increases to 50% at 2 years for the BLIPS subgroup and to 89% at 5 years for the subset of BLIPS patients who present with seriously disorganizing and dangerous features (13).
Overall, the real-world potential impact of the CHR-P paradigm for improving the outcomes of psychotic disorders will be determined by the successful and stepped integration of three key components (Figure 1): (i) efficient detection of individuals at risk for psychosis, (ii) accurate prognosis of outcomes, and (iii) effective preventive treatment.

Figure 1. Core clinical components for effective prevention of psychosis. The first rate-limiting step for improving outcomes of psychosis through preventive approaches is the ability to accurately detect individuals at risk for psychosis. Adapted from (14), Creative Commons Attribution License (CC BY).
As illustrated in Figure 1, the first rate-limiting step for improving outcomes of psychosis through the CHR-P paradigm is the real-world ability to detect most individuals who are at risk for psychosis and will later develop it. Efficient detection of individuals at CHR-P has been a relatively neglected area of research in spite of the fact that inefficient detection impedes subsequent efforts.
In fact, even the most accurate prognostic model and effective preventive treatment would exert a modest impact if they are only applied to a small proportion of those who later develop psychosis.
The first challenge is that, to date, there has been an assumption that the CHR-P stage represents the prototypical prepsychotic stage for most individuals who will later go on to develop psychosis. However, in a thematic issue in Schizophrenia Bulletin titled “Dissecting the diagnostic pluripotentiality of the ultra high risk state for psychosis,” (Volume 44, Issue 2, 2018) (15–18), a meta-analysis demonstrated that the onset of psychosis may also occur via previously identified nonpsychotic clinical risk syndromes (17).
Separately, independent research groups have reported that first-episode psychosis (FEP) cases may occur without a prior identifiable period of subthreshold psychotic symptoms (19, 20).
The second challenge is that even assuming that the CHR-P concept would be sufficient to detect the majority of individuals at risk, its real-world penetrance is undetermined. Emerging evidence suggests that current detection strategies for identifying individuals at CHR-P are highly inefficient.
These strategies are largely based on referrals to specialized CHR-P clinics (6), made on suspicion of psychosis risk. Only 5% of individuals who had presented with a first onset of nonorganic psychosis to the local NHS Trust had been detected by one local CHR-P service (21). Since the service had been fully established in the same Trust, there is a clear need to improve the detection of at-risk cases (22).
To the best of our knowledge, there are no other original studies published to date reporting on the detection power of the CHR-P paradigm that could further validate or replicate these findings. Inefficient detection has important clinical implications.
For example, although NHS England’s Access and Waiting Times-Standard for Early Intervention in psychosis (23) requires that individuals at CHR-P be detected nationwide and treated within 2 weeks, current detection strategies are inefficient.
A first viable alternative may be to intensify the outreach campaigns currently adopted by CHR-P clinics. However, converging evidence has demonstrated that such an approach conflicts with the intrinsic psychometric limitations of the CHR-P interviews, producing a diluted transition risk (24, 25) and unreliable prognostic accuracy. Another option may be to implement front-line youth mental health services such as the Headspace initiative (other youth mental health services are available worldwide; for a recent review, see (26)).
Because of their one-stop-shop nature (26–28), youth-friendly services are expected to improve the attraction and detection of potential individuals who may be at risk of psychosis. Unfortunately, there are no original data reporting on the efficacy of detecting individuals at CHR-P through youth mental health services.
Rough estimates indicate only a modest improvement of detection when adopting broad youth mental health services, with 12% of individuals with FEP being detected at the time of their CHR-P phase (29) (Figure 2). Therefore, at present, between 88% (Headspace model) and 95% [Outreach and Support in South London (OASIS) model] of individuals who will later develop psychosis remain undetected at the time of their CHR-P stage (see Figure 2).
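The undetected proportions quoted above are simply the complement of each model's detection rate. A short sketch makes the arithmetic explicit; the function name is hypothetical, and the two rates are the ones reported in the text (5% for the OASIS service, 12% for the Headspace model).

```python
def undetected_share(detected_percent):
    """Percentage of later-psychosis cases missed at the CHR-P stage."""
    return 100 - detected_percent


# Detection rates as reported in the text, in percent.
detection_rates = {"OASIS": 5, "Headspace": 12}

for model, detected in detection_rates.items():
    print(f"{model}: {detected}% detected, {undetected_share(detected)}% undetected")
```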

Figure 2. Detection power of at-risk patients who will later develop a first episode of psychosis under different preventive programs: OASIS and Headspace. CHR-P: clinical high risk for psychosis. New figure.
In order to extend the preventive benefits of the CHR-P paradigm, more sophisticated and innovative approaches are urgently needed (30).
The current manuscript will review this issue in a comprehensive conceptual analysis of the current challenges and propose evidence-based ways for overcoming them. The detection program presented here integrates three separate approaches targeting different populations: secondary mental health care, primary care, and the community. The overarching methodology of this detection program leverages the recent advancements brought by clinical risk estimation tools (31) and digital approaches.
Source:
University of Colorado at Boulder
Media Contacts:
Lisa Marshall – University of Colorado at Boulder
Image Source:
The image is in the public domain.
Original Research: Closed access
“Using Machine Learning in Psychiatry: The Need to Establish a Framework That Nurtures Trustworthiness”. Chelsea Chandler, Peter W Foltz, Brita Elvevåg.
Schizophrenia Bulletin doi:10.1093/schbul/sbz105.