European Health & Wellbeing Magazine
Digital Health

Mental Health Apps: What the Evidence Actually Shows

More than 10,000 mental health apps are available in Europe — but clinical evidence supports only a fraction of them


The proliferation of digital mental health tools in Europe has accelerated sharply over the past decade, driven by a combination of rising demand for mental health support, persistent shortages in specialist care capacity, and the technical accessibility of smartphone-based delivery. The market now encompasses thousands of applications offering mindfulness exercises, mood tracking, cognitive behavioural therapy modules, peer support, and — increasingly — AI-driven conversational tools claiming therapeutic function. Within this broad landscape, the evidence base is uneven, the regulatory classification is contested, and the gap between marketing claims and clinical demonstration is frequently wide.

Precise figures for the European market are difficult to isolate from global estimates, but research consistently indicates that tens of thousands of mental health and wellness applications are available across major app stores — with figures above 20,000 commonly cited for global availability. The subset that explicitly claims therapeutic or clinical function, as distinct from general wellness support, is smaller but still substantial. The critical question — how many of these applications have been subjected to rigorous clinical evaluation — has a considerably less impressive answer.

What the Clinical Evidence Shows

The strongest evidence base for mental health applications concerns cognitive behavioural therapy-based tools — applications that deliver structured CBT protocols, including thought records, behavioural activation exercises, and cognitive restructuring techniques, through a digital interface. Several randomised controlled trials have examined applications in this category, and a systematic review conducted in 2024, adhering to Cochrane guidelines and PRISMA standards, examined the evidence for three CBT-based chatbot applications — Woebot, Wysa, and Youper — across ten included studies.

The review found clinically meaningful reductions in self-reported depression and anxiety symptoms across all three applications, as measured by validated instruments including the Patient Health Questionnaire-9 and the Generalised Anxiety Disorder-7 scale. Woebot, which uses a rule-based conversational interface to deliver CBT techniques, was the subject of five included studies. A 2017 randomised controlled trial published in JMIR Mental Health — still among the most-cited studies in the field — found that Woebot users showed significantly greater reductions in depression symptoms than a control group receiving information materials over a two-week period, with high engagement rates during the trial. Wysa, whose user base includes people managing chronic pain and maternal mental health challenges, showed comparable reductions in depression and anxiety severity in a 2024 RCT examining its use in people with chronic diseases.

The consistent finding across CBT-based applications is that they appear effective for subclinical and mild-to-moderate presentations of depression and anxiety — the population that constitutes the majority of unmet mental health need in EU health systems, where waiting times for psychological therapies can extend to months in many countries. Whether they are effective for moderate-to-severe presentations, or as standalone treatments for people who would otherwise access professional care, is a question the existing evidence cannot answer with confidence. Most trials have been short in duration — typically four to eight weeks — and have enrolled self-selected participants with relatively mild symptom levels. Long-term efficacy data is limited across the board.

The emergence of generative AI-based therapy applications signals a shift in the technology landscape that existing regulatory and evidence frameworks have not yet caught up with. Most prominent is Therabot, whose trial, published in early 2025, was the first randomised controlled trial of a generative AI therapy chatbot for anxiety and depression. Rule-based CBT chatbots operate within prescribed parameters; generative AI applications generate novel responses to user inputs, creating both therapeutic possibilities and safety risks that require different evaluation approaches. The evidence base for this newer generation of applications is, at this stage, thin.

Regulatory Classification: When Does an App Become a Medical Device?

Under EU Medical Device Regulation 2017/745 (MDR), software intended to be used for medical purposes — including diagnosis, prevention, monitoring, or treatment of disease — qualifies as a medical device and is subject to conformity assessment requirements before it can be marketed. The classification is not contingent on the medium of delivery: an application claiming to diagnose depression or to deliver a clinically validated treatment protocol is, in regulatory terms, a medical device regardless of whether it is accessed via smartphone or browser.

In practice, the classification boundary between a medical device application and a wellness tool is contested and inconsistently applied. Applications that use language such as “evidence-based” or “clinically validated” in their marketing without having undergone MDR conformity assessment occupy an ambiguous position, and regulators across EU member states have been inconsistent in how they apply the classification criteria to digital mental health tools. The EU’s Medical Device Coordination Group has published guidance (MDCG 2025-4) addressing the responsibilities of platform providers hosting medical device software, a step that reflects regulatory awareness of the enforcement gap but does not resolve the underlying classification challenge for borderline applications.

The consequence of this regulatory ambiguity is that mental health applications that function as medical devices in substantive terms may be marketed without the clinical evidence demonstration that MDR conformity assessment requires. Users, and clinicians considering recommending or integrating these tools, have limited means of distinguishing applications that have undergone rigorous evaluation from those that have not. The frameworks for health data established under the European Health Data Space (EHDS) will, in the long term, create better data infrastructure for post-market surveillance of digital health tools, but that infrastructure is years from operational readiness.

A separate but related challenge concerns the classification of AI-driven therapeutic applications under the EU AI Act, which entered into force in August 2024. AI systems that are themselves medical devices, or that function as safety components of medical devices, are classified as high-risk under Article 6(1) of the Act, which cross-references the Union harmonisation legislation listed in its Annex I — including the MDR. AI-driven therapeutic interventions that cross the threshold from wellness tool to clinical application thus face dual regulatory obligations under both the MDR and the AI Act, a compliance landscape that is technically demanding and that smaller developers may struggle to navigate.

Data Privacy: The Underexamined Risk

Mental health data is among the most sensitive categories of personal information. Applications in this domain collect data including mood states, symptom severity records, therapeutic session content, crisis indicators, and — in applications using generative AI — open-ended disclosures that users make to conversational interfaces they may perceive as having therapeutic characteristics. The GDPR classifies health data as special category data subject to heightened protections under Article 9, and mental health application data falls squarely within this category.

The data practices of mental health applications in the EU have been examined in several studies and by data protection authorities in individual member states, with findings that raise concerns on multiple dimensions. Privacy policies are frequently incomplete or written at a reading level that makes meaningful informed consent difficult to achieve. Data sharing with third-party analytics providers and advertising platforms — common practice in consumer app development — creates disclosure risks that many users are unaware of. The division of roles between data controller and data processor, central to GDPR accountability, is often unclear in consumer-facing disclosures.

Several high-profile cases in the United States — where mental health application companies have settled with the Federal Trade Commission over data-sharing practices — have had limited regulatory parallel in Europe, partly because the data protection enforcement landscape remains fragmented despite the GDPR’s harmonisation ambitions. How consistently Article 9 protections are applied to mental health app data in practice varies between member states, and with the scale of enforcement action individual data protection authorities are able to bring.

What Patients and Clinicians Need to Know

For individuals considering mental health applications, several practical distinctions matter. CBT-based applications with published peer-reviewed evidence — including Woebot and Wysa — have a documented evidence base for mild-to-moderate depression and anxiety, though that evidence has meaningful limitations in duration and population representativeness. Applications that describe their approach as “clinically validated” without citing specific peer-reviewed studies or regulatory approvals warrant scepticism. The presence or absence of MDR conformity assessment is a meaningful quality signal, though it is rarely communicated clearly in consumer-facing materials.

For clinicians, the question of how to integrate mental health applications into care pathways — whether as between-session support tools, stepped-care options for patients on waiting lists, or adjuncts to ongoing therapy — lacks clear evidence-based guidance at the EU level. The WHO’s guidelines on digital interventions for health system strengthening, published in 2019 and not yet substantially updated for the current generation of AI-driven applications, provide a framework that emphasises evidence quality, safety, equity, and integration with existing health system structures, but the operational guidance for specific clinical decisions remains underdeveloped.

The structural argument for digital mental health tools in the EU context is straightforward: the treatment gap — the difference between the prevalence of mental health conditions and the proportion of people accessing effective care — is substantial across virtually all member states, and digital tools offer a scalable mechanism to address part of that gap without requiring proportional expansion of specialist workforce. The question is not whether digital tools have a role, but which tools, for whom, and under what governance and quality assurance conditions. Those conditions, across most of the EU, remain insufficiently defined.

Elena Marchetti
