2. Measuring population mental health: Tools and current country practice

The frequent collection of population-level data on mental health outcomes is important for identifying populations at risk of mental ill-health, for determining which socio-economic and other factors shape (and are shaped by) people’s mental health, and for designing effective prevention and promotion strategies. As outlined in Chapter 1, mental health is a multifaceted concept and exists beyond a binary distinction between the presence or absence of mental illness. Collecting data on both mental ill-health and positive mental health in population surveys and mental health assessments would yield a more complete picture of people’s overall mental health and help to better understand the drivers and policy levers associated with improving it.

However, the current lack of (internationally) standardised data on population mental health makes it difficult to assess the efficacy of different policy approaches across disparate contexts; standardising outcome measures is the first step in facilitating such analysis. This chapter outlines the tools available to data collectors, gives an overview of current data collection practices across OECD countries and offers suggestions for which outcomes to prioritise in international harmonisation efforts.

An analysis of responses to a questionnaire sent to official data producers in OECD countries in 2022 shows that all member states that answered are already active in this space. Prior to the pandemic, almost all OECD members were already collecting information on mental health outcomes in both health interviews and general household surveys, as well as via administrative data. COVID-19 has sparked additional interest in measuring population mental health, with many public agencies and statistical offices adding items to both new and existing surveys.

These existing data collections demonstrate the interest in, and relevance of, population mental health outcomes in a national statistics context. Yet there is room for improvement in several areas: the frequency of data collection; greater data availability across the full spectrum of both negative and positive mental health outcomes; and better harmonisation of measures across countries to improve international comparability.

Indeed, prior to the pandemic most mental health data were collected by countries on surveys that ran every four to ten years. While many introduced high-frequency surveys with mental health modules in the first two years of COVID-19, it is currently unclear whether these surveys will continue to be implemented moving forward. Further, although all statistical offices collect data on mental ill-health – with a particular focus on common mental disorders – general psychological distress and depressive symptoms tend to be captured through standardised screening tools, whereas measures of experiencing anxiety are less harmonised across countries. Data collection efforts for other mental conditions – such as post-traumatic stress disorder (PTSD), bipolar disorder, eating disorders, etc. – and for other aspects of mental health – such as suicidal ideation and mental health-related stigma – remain very uneven across countries. When it comes to positive mental health, cross-country comparative data are mainly limited to measures of life evaluation. Other aspects, such as affect and eudaimonia, are much less frequently collected as outcome measures, and when they are, the tools used are less likely to be standardised across countries.

The results of the OECD questionnaire suggest that existing data collection efforts are not capturing the full range of mental health outcomes – missing aspects of both mental ill-health and positive mental health. In order to capture these outcomes and collect frequent information on mental health, data collectors in OECD member countries could: (1) expand their use of screening tools beyond those focusing on symptoms of depression to those that also cover symptoms of anxiety; (2) move towards collecting harmonised information on affective and eudaimonic aspects of positive mental health; and (3) explore using single-item questions on general mental health status across surveys.

While Chapter 1 focused on relevant types of outcomes (covering both mental ill-health as well as positive mental health) for data collectors interested in mental health, this chapter focuses mainly on the types of tools that can be used to measure these.

The broad tool types discussed in this chapter – some of which are sourced from administrative data, but the bulk of which come from household surveys – range from long survey modules to a battery of question items to single questions. Some tools can be used to capture aspects of either mental ill-health or positive mental health, while others are used only for specific types of outcomes. Each type of tool has its own advantages and disadvantages, requiring data collectors to select among them, depending on the needs and constraints of their specific contexts. The different tools are described below in order to provide a common understanding of the categorisation used in this report.

The chapter annexes contain in-depth information for readers interested in further details. Annex 2.A provides an overview of which specific tools are collected by each country, along with sample question framing and answer options. Annex 2.B lists full details, including question wording and scoring recommendations, for the most commonly used standardised instruments. More detailed reflections on the statistical quality of mental health survey measures are addressed in Chapter 3.

Administrative data can contain information on the use of mental health services, diagnoses of mental disorders in clinical settings, as well as cause of death data from suicide and substance abuse (i.e. drug overdoses and alcohol abuse).

While all of these can be considered objective (i.e. not self-reported) and easy-to-collect proxies of mental ill-health, measurement challenges remain. For instance, measures of service use and medical diagnoses do not capture population outcomes, but rather only those who are willing and able to access health care services. Such measures can overestimate comparative levels or incidence rates in countries with good (and affordable) medical systems, awareness programmes and less stigma, where people are more likely to both seek and receive treatment. In addition, preventing ill-health necessitates tracking outcomes prior to, and following, engagement with the service sector. This report does not consider administrative statistics related to health care further, referring readers to (OECD, 2021[1]).

Data on causes of death due to suicide or substance abuse (which are commonly referred to as “deaths of despair” (Case and Deaton, 2017[2])) do capture mental ill-health outcomes at the population level. These measures can act as proxies for severe mental illness and addiction. While there are social and cultural reasons affecting suicidal behaviours – meaning that not all suicides are the direct result of mental ill-health – living with mental health conditions does substantially increase the risk of dying by suicide (OECD, 2021[1]). However, the registration of suicide deaths is a complex procedure, affected by factors such as how intent is ascertained, who completes the death certificate, and prevailing norms and stigma around suicide, all potentially affecting the cross-country comparability of mortality records (OECD, 2021[1]).

A general limitation for all types of administrative data is that the additional socio-demographic data collected alongside are often limited to the age, sex, geographic region and potentially the race/ethnicity of the deceased. This constrains the ability to delve into the drivers of mental health and to identify relevant socio-economic, environmental and relational risk and resilience factors.

In contrast to administrative data, population surveys generally contain information on respondents’ material conditions (e.g. income, wealth, labour market outcomes, housing quality), quality of life (e.g. physical health, educational attainment, environmental quality) and relationships (e.g. social connections, trust, safety). Population surveys can have a specific content focus, such as a health survey, or a more general scope, such as general social surveys. These surveys are conducted at the household level, with more in-depth modules on employment, health (including mental health), education, etc., administered to selected household members. Having a full range of well-being covariates is important to understand how mental health is impacted by, and how it in turn influences, other areas of people’s lives. Furthermore, tracking (and eventually achieving) equity in mental health outcomes requires disaggregation by important socio-demographic categories.

Tools that have been included in household surveys to assess specific mental health outcomes range from single-item questions to standardised batteries of items. A brief description of each can be found below, with full details in Annex 2.A and Annex 2.B.

  • Questions about previous diagnoses – This refers to single-item questions about whether an individual has been diagnosed with a mental health disorder (e.g. major depressive disorder, generalised anxiety disorder, or other mental health conditions) by a health care worker, either in the past 12 months or over the course of his/her lifetime. These questions typically have yes/no answers and are not standardised across countries. For full details, see Table 2.6. Examples include:

    • “Have your mental health problems ever been diagnosed as a mental disorder by a professional (psychiatrist, doctor, clinical psychologist)? Yes / No”.

    • “Have you EVER been told by a doctor or other health professional that you had ...Any type of depression? Read if necessary: Some common types of depression include major depression (or major depressive disorder), bipolar depression, dysthymia, post-partum depression, and seasonal affective disorder. Yes / No”.

  • Questions about experienced symptoms – This refers to single-item questions about symptoms of mental disorders experienced in the past 12 months or over the course of an individual’s lifetime, without explicitly referring to a diagnosis by a medical professional. These questions typically have yes/no answers and are not standardised across countries. For full details, see Table 2.7. Examples include:

    • “During the past 12 months, have you had any of the following diseases or conditions? Depression. Yes / No”.

    • “Have you ever suffered from chronic anxiety? Yes / No”.

    • “Do you have a mood disorder? Yes / No”.

  • Questions about suicidal ideation and suicide attempts – These are (usually) single-item questions about a respondent’s experience of suicidal ideation, self-harm behaviours or suicide attempts. These questions typically have yes/no answers and are not standardised across countries. Recall periods refer to an individual’s lifetime, the last 12 months, the past two weeks, or “during COVID”. For full details, see Table 2.8. Examples include:

    • “Have you seriously contemplated suicide since the COVID-19 pandemic began? Yes/No”.

    • “Sometimes people harm themselves on purpose but they do not mean to take their life. In the past 12 months, did you ever harm yourself on purpose but not mean to take your life? Yes/No”.

    • “Have you ever attempted suicide? Yes/No”.

    • “Did you stay in a hospital overnight or longer because you tried to kill yourself? Yes/No”.

  • Questions about general mental health status – These refer to single-item questions on how respondents rate their mental health overall, and thus capture both components of ill-health and positive mental health. Questions are not standardised across countries and differ in terms of question wording, response options and recall period. For full details, see Table 2.9. Examples include:

    • “In general, how is your mental health? Excellent / Very good / Good / Fair / Poor”.

    • “Has your mental health/well-being been affected by the Covid-19 pandemic during the last 12 months?”

    • “On a scale from 1 to 10 can you indicate to what extent you are satisfied with your mental health? A score of 1 refers to completely dissatisfied and a 10 to completely satisfied”.

    • “Does your mental state interfere with your daily life at work? your family life? Yes / No”.

  • Positive mental health indicators – This refers to questions pertaining to the various aspects of positive mental health: life evaluation, affect (summary affect scales, and batteries of questions on positive, negative or mixed affect), eudaimonia (questions about quality of life, whether life is worthwhile or meaningful), as well as standardised positive mental health composite scales (combining different dimensions of positive mental health, prioritising positive over negative affect, and sometimes adding a social well-being component). In some instances, positive mental health indicators are single-item questions that vary across countries and surveys, while in others they are standardised batteries of questions. Standardisation across countries varies, with only life evaluation questions and positive mental health composite scales being consistently phrased. For full details, see Table 2.10. Specific question item phrasing and scoring suggestions for standardised composite scales can be found in Annex 2.B.

  • Screening tools – These refer to multi-item instruments designed to screen respondents for symptoms (rather than for diagnoses) of mental health conditions. These tools were initially developed in clinical settings to screen for common mental disorders, in order to identify individuals who may be at risk and to flag them for further screening and potential diagnosis. They can be interviewer-led or self-administered and focus either on general psychological distress or on specific mental health conditions such as major depressive disorder, generalised anxiety disorder (and sometimes a combination of the two), alcohol use disorder, post-traumatic stress disorder, eating disorders and so on. These tools are considered “validated” in that they have been psychometrically tested for their validity (against the gold standard of structured interviews or diagnoses), sensitivity (the probability of correctly identifying a patient with the condition) and reliability (the measures produce consistent results when an individual is interviewed under a given set of circumstances) (refer to Chapter 3 for an extended discussion of statistical quality). A wide variety of screening tools are available, ranging from very short screeners of two items to longer instruments covering 20 items or more. The focus of questions varies between screening tools: all cover the frequency of experiencing (mostly negative) affect (e.g. feeling low, feeling nervous, feeling worthless), with some also including somatic symptoms (e.g. changed appetite, trouble sleeping) and/or functional impairment due to emotional distress (e.g. disturbance in daily activities, not being able to concentrate, not being able to stop worrying). Screening tools also differ in terms of reference period for symptoms, ranging from the past week to the past month; however, none are able to measure lifetime prevalence. Because of these differences, screening tools are not always directly comparable and should not be used interchangeably for international comparisons. Item scores are typically combined into a summary index, with the final score being used either as a continuous measure of mental ill-health or to assess the risk of a common mental health condition using a validated cut-off score. For full details, refer to Table 2.5. Exact question item wording and scoring recommendations for the most frequently used screening tools can be found in Annex 2.B.

  • Structured interviews – Structured interviews are considered the gold standard for measuring mental disorders (often both on a lifetime and 12-month basis). They provide a standardised assessment based on the internationally agreed definitions and criteria of recognised psychiatric classification systems and have strong diagnostic reliability and psychometric properties to determine whether or not a respondent has the condition of interest (Mueller and Segal, 2015[3]; Burger and Neeleman, 2007[4]).1 They are administered by trained interviewers, with close-ended and fully scripted questions and standardised scoring of responses (Ruedgers, 2001[5]). Structured interviews approximate assessments conducted by mental health professionals and in this way can identify populations at risk for mental health conditions even if these individuals have not been diagnosed by a health care professional. For additional information on the most commonly used structured interview, the Composite International Diagnostic Interview (CIDI), see Table 2.4 and Annex 2.B.

  • Additional mental health-related topics – This category refers to questions on any other relevant topics, including the use of mental health medication and services, the mental health of children and young people in the household, loneliness and stress, resilience and self-efficacy, attitudes towards mental health including stigma and literacy, and questions on unmet needs. For additional information, see Table 2.11.
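
The scoring step described for screening tools in the list above (item scores combined into a summary index, which is then either used as a continuous measure or compared with a validated cut-off) can be sketched in a few lines of Python. This is an illustration only: the number of items, the 0–3 response scale and the cut-off of 10 are assumptions chosen to resemble a short depression screener, and the function names are hypothetical. The wording and scoring recommendations that apply to each actual instrument are those given in Annex 2.B.

```python
from typing import Sequence

# Illustrative assumptions, not taken from this report: eight items,
# each scored 0-3, and a cut-off of 10 on the summed score.
N_ITEMS = 8
MIN_ITEM, MAX_ITEM = 0, 3
CUT_OFF = 10


def summary_score(items: Sequence[int]) -> int:
    """Sum item scores into a single index after basic validation."""
    if len(items) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(items)}")
    for value in items:
        if not MIN_ITEM <= value <= MAX_ITEM:
            raise ValueError(f"item score {value} outside {MIN_ITEM}-{MAX_ITEM}")
    return sum(items)


def at_risk(items: Sequence[int], cut_off: int = CUT_OFF) -> bool:
    """Flag a respondent whose summed score reaches the cut-off."""
    return summary_score(items) >= cut_off


# Two hypothetical respondents.
respondent_a = [1, 0, 2, 1, 0, 1, 0, 0]  # summed score 5
respondent_b = [2, 2, 1, 3, 1, 2, 1, 0]  # summed score 12
print(summary_score(respondent_a), at_risk(respondent_a))
print(summary_score(respondent_b), at_risk(respondent_b))
```

Keeping the summed score alongside the at-risk flag preserves both uses mentioned above: the continuous measure and the cut-off based indicator.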

All tools imply trade-offs in terms of response burden/ease and cost of data collection, accuracy and coverage (Figure 2.1, Table 2.1). Response burden is a direct function of how much time an individual needs to spend to provide information on their mental health status and how much stress is caused by providing this information. Accuracy refers to the sensitivity of a tool in correctly identifying a person with a mental health condition, whereas coverage entails whether the measure in question is applied to the full (adult) population.

By way of illustration, administrative data have a low response burden: they do not require answers from individual respondents and are routinely collected within a country’s data infrastructure. Yet statistics on deaths of despair focus only on the extreme end of mental ill-health and are further complicated by the fact that not all deaths of despair may be the culmination of a mental disorder. Furthermore, unlike household surveys, only those who were in contact with the health care system are captured by administrative records of diagnoses in a clinical setting.2

For household surveys, both the response burden and accuracy increase the longer and more specific a tool is: whereas single questions about experienced symptoms or a person’s general mental health status are short and easy to answer, they do not consider the nature or severity of symptoms, or the type of mental health condition, and have not been benchmarked against diagnostic criteria. Screening tools have been validated against the gold standard of structured interviews and are, depending on the specific tool used and the number of items covered, still relatively low cost in terms of response burden. However, they do not constitute a diagnosis from a health care professional and can only identify people likely at risk of disorders. Because they are validated against clinical diagnoses, screening tools are designed to resemble diagnostic interviews as closely as possible. Still, when calibrating tools and cut-off scores, there is a trade-off between sensitivity (correctly identifying the presence of a mental health condition) and specificity (correctly noting the absence of a mental health condition), and researchers often prioritise the former over the latter, leading to slight overestimates by design (see Box 3.3 and Section 3.3.1 for a more detailed discussion). Finally, the majority of tools included in both household surveys and administrative data focus on mental ill-health; the only exceptions are household survey questions about general mental health and positive mental health.
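
The trade-off between sensitivity and specificity when choosing a cut-off score can be made concrete with a small worked example. The sketch below uses an invented validation sample of ten respondents, each with a screening score and a structured-interview result; the data, the candidate cut-offs and the function name are all hypothetical and serve only to show the mechanics.

```python
from typing import List, Tuple

# Hypothetical validation sample: (screening score, condition confirmed
# by a structured interview). Values are invented for illustration.
SAMPLE: List[Tuple[int, bool]] = [
    (3, False), (5, False), (7, False), (8, False), (9, True),
    (10, False), (11, True), (12, False), (14, True), (18, True),
]


def sensitivity_specificity(sample, cut_off):
    """Return (sensitivity, specificity) when scores >= cut_off screen positive."""
    true_pos = sum(1 for score, ill in sample if ill and score >= cut_off)
    false_neg = sum(1 for score, ill in sample if ill and score < cut_off)
    true_neg = sum(1 for score, ill in sample if not ill and score < cut_off)
    false_pos = sum(1 for score, ill in sample if not ill and score >= cut_off)
    return (true_pos / (true_pos + false_neg),
            true_neg / (true_neg + false_pos))


for cut_off in (8, 10, 12):
    sens, spec = sensitivity_specificity(SAMPLE, cut_off)
    share_positive = sum(score >= cut_off for score, _ in SAMPLE) / len(SAMPLE)
    print(f"cut-off {cut_off}: sensitivity={sens:.2f}, "
          f"specificity={spec:.2f}, screened positive={share_positive:.0%}")
```

In this invented sample, four of the ten respondents (40%) have the condition. The lowest cut-off identifies all four but screens 70% of the sample as positive, whereas the highest cut-off screens only 30% as positive and misses half of the true cases. This is the sense in which prioritising sensitivity tends to push prevalence estimates upwards.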

The difference in question framing and item length – between structured interviews, screening tools and single-item questions on experienced symptoms or received diagnoses – can lead to different estimates of prevalence for the same reported outcome measure (Box 2.1). This speaks to the need for the standardisation of tool type (and transparency about which tool was used) when comparing outcomes across countries, over time and across population groups: i.e. mixing types of tools when commenting on outcomes like “share at risk for depression” or “share at risk for psychological distress” can lead to different estimates because of measurement differences, rather than because of differences in underlying mental health status (refer to Chapter 3 for an extended discussion of these themes).

In February and March of 2022, 37 of 38 OECD countries provided answers to a questionnaire designed by the OECD Secretariat to better understand what OECD countries are doing in terms of measuring mental health outcomes.3 The questionnaire covers the statistical tools used (questions about diagnoses, experienced symptoms, screening tools and structured interviews) and outcomes covered (mental ill-health, positive mental health and other related topics, including loneliness, stress, attitudes towards mental health, etc.). A discussion of mental health data related to service use and access to care is set out in A New Benchmark for Mental Health Systems (OECD, 2021[1]), and this new round of surveying seeks to build upon existing work by primarily focusing on mental health outcomes, rather than on service use or access to care, and in particular on outcomes that could be measured through household surveys rather than administrative data.

All OECD countries collect mortality statistics on causes of death, including deaths from suicide as well as from alcohol and drug overdoses. Statistics on causes of death are typically collected by hospitals or health care providers, while police authorities report deaths from suicide. The OECD already regularly publishes statistics for its member countries on both deaths from suicide and other types of deaths of despair (OECD, 2020[16]; OECD, 2021[17]).4

Administrative data on mental health go beyond death records. Hospital discharge registries that, depending on the country, may cover the length of hospitalisation and discharges by field of medical specialisation were mentioned by a number of countries, including Canada, Chile, Hungary, Italy, Slovenia, Switzerland and Türkiye. Some countries, including Spain and the United Kingdom, collect care or clinical care data to measure prevalence and incidence of specific behavioural disorders. The Swedish Social Insurance Agency also collects data on causes of work absences, with a special category for sick leave following a psychiatric diagnosis. Finally, a handful of countries collect administrative data on psychiatric medication. For example, in France the Agence nationale de sécurité du médicament (ANSM) publishes data on psychotropic drugs delivered to outpatients; Statistics Netherlands provides data on dispensed medicines, including those related to mental health conditions as determined by ATC (anatomical therapeutic chemical) coding; Australia collects administrative data on dispensed medications covered under the Pharmaceutical Benefits Scheme; and the Slovenian National Institute of Public Health (NIJZ) hosts data on prescription drug claims, including for mental health-related drugs.

In addition, all OECD countries that responded to the questionnaire reported that they were already collecting population-wide data on mental health outcomes through household surveys before the COVID-19 pandemic. While many of these data are collected through health interviews, 89% of countries reported also collecting mental health data in general social surveys (Figure 2.4). Some data on mental health are also collected through labour force surveys and special modules of the national census. Some countries also reported collecting mental health data in special surveys that focus on sub-populations, including Indigenous peoples, those in the criminal justice system and young people (see Box 2.2 for more information on the latter).

The pandemic has put mental health high on the national agenda for many OECD countries. As a result, most countries that answered the OECD questionnaire reported having ramped up data collection efforts on mental health in the months and years since March 2020. Around 68% of OECD countries reported collecting additional mental health data during the pandemic, either through new stand-alone surveys (43%) or by adding mental health and COVID-19 modules to existing surveys (35%) (see Table 2.3).5 Many of these new surveys are high-frequency, interviewing respondents weekly, biweekly, monthly or quarterly. However, it is unclear whether these surveys will continue in the future, or continue with the same frequency. Indeed, some COVID-specific surveys have already been discontinued by countries, while others that started off as weekly or monthly have since become less frequent (biweekly or quarterly).

Before 2020, only 22% of countries collected mental health data on surveys that ran annually or more frequently, and 11% on surveys that ran every two to three years. Returning to pre-pandemic business as usual would mean that over half (51%) of countries collect mental health data only every four to ten years. Such large gaps between survey rounds make it more difficult to track changes at the population level (which, as seen during the COVID-19 pandemic, were sensitive to periods of intensifying COVID-19 deaths and strict confinement measures) and to craft policy interventions accordingly.

All OECD countries collect data on both mental ill-health and positive mental health outcomes. For the former, there is much variety in terms of both the tools used and outcomes measured, whereas for the latter cross-country comparative data are mainly limited to measures of life evaluation (Figure 2.6); 59% of countries reported collecting data on affect, and only 24% on eudaimonia.

Mental ill-health outcome measures are captured through a variety of tools. The two tools most often reported by countries are screening tools and questions about experienced symptoms or disorders (either general or specific), with 97% and 78% of countries reporting using these types of tools in household surveys, respectively (Figure 2.7). Over half of countries (62%) ask single questions about people’s general mental health status. Far fewer countries report collecting data on previous diagnoses in household surveys (30%) or in structured interviews (16%).

Within the continuum of mental ill-health, existing measurement initiatives focus more on some forms of mental health issues than on others. Anxiety and depressive disorders are the most common mental health conditions affecting people in OECD countries (OECD/European Union, 2018[23]). While 86% of countries (32 out of 37) have a dedicated validated screening tool for measuring symptoms of depression, and 95% have one for general psychological distress (35 out of 37), only 41% rely on a screening tool for symptoms of anxiety (15 out of 37) (Figure 2.8). Screening tools used by countries vary widely in terms of item length, ranging from two to 40 questions (see Table 2.5).

Variants of the PHQ are the most common screening tool for measuring symptoms of depression, used by 84% (31 out of 37) of countries. The MHI-5 is the most common screening tool for general psychological distress, used by 76% of countries (28 out of 37). In both instances, this is largely driven by Eurostat, which harmonises the data collection efforts of European Union member countries: 26 of the 28 countries that rely on the MHI-5 participate in Eurostat, all but Australia and Israel.6 The PHQ-8 has been included in Eurostat’s European Health Interview Survey (EHIS), which is conducted every five to six years. Variants of the PHQ are also used by a number of non-European OECD countries (see Table 2.5).
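
The percentages quoted in the two paragraphs above follow directly from the country counts out of the 37 respondents. As a quick arithmetic check, the short sketch below recomputes them; the dictionary labels and the helper name are hypothetical, while the counts are those given in the text.

```python
RESPONDING_COUNTRIES = 37  # countries that answered the questionnaire

# Counts as reported in the text above.
counts = {
    "screening tool for depression": 32,
    "screening tool for general psychological distress": 35,
    "screening tool for anxiety": 15,
    "PHQ variants": 31,
    "MHI-5": 28,
}


def share(count: int, total: int = RESPONDING_COUNTRIES) -> int:
    """Percentage of responding countries, rounded to the nearest whole number."""
    return round(100 * count / total)


for label, count in counts.items():
    print(f"{label}: {count}/{RESPONDING_COUNTRIES} = {share(count)}%")
```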

OECD countries also collect data on symptoms of anxiety, although often through country-specific tools rather than validated screening tools (Figure 2.9). Overall, 70% of countries report capturing anxiety outcomes, through some combination of structured interviews, questions about previous diagnoses or about experience of anxiety disorders, affect data or validated screening tools. Considering all measurement tools included in surveys, more countries indicated using them primarily for measuring symptoms of depression. The only exceptions are questions about negative affect, for which usage is evenly divided: 30% of countries reported using negative affect to measure both anxiety (feeling nervous, anxious) and depression (feeling low, downhearted).

The focus of measurement initiatives on depressive and anxiety disorders reflects the fact that they are some of the most prevalent mental health conditions (OECD/European Union, 2018[23]), and that they contribute highly to the disease burden globally and in OECD countries (Santomauro et al., 2021[24]). Data collection efforts for other specific mental conditions – such as PTSD, bipolar disorder, eating disorders, etc. – remain very uneven across OECD countries (Figure 2.8).

Almost all OECD countries collect some data on life evaluation, primarily through a question on self-reported life satisfaction. Other aspects of positive mental health – affect and eudaimonia – are much less frequently covered by surveys undertaken by OECD countries; even when they are, the tools used are less standardised across countries (Figure 2.10). Measures of affect are more commonly collected than measures of eudaimonia; 59% of countries collect some form of affect data, through a combination of standardised composite scales and non-harmonised questions, while only 24% collect data on eudaimonia. In terms of standardised tools for measuring positive mental health outcomes, the SF-12 (and the SF-36 sub-component on energy and vitality, EVI), WHO-5 and either WEMWBS or its shorter form SWEMWBS are the three most common instruments; however, their overall use is still low: 30%, 16% and 19% of countries reported using each scale in a household survey, respectively.

Overall, data collection efforts on additional mental health-related topics (e.g. use of mental health medication and services; mental health of children and young people in the household; loneliness and stress; resilience and self-efficacy; attitudes towards mental health, including stigma and literacy; and questions on unmet needs) are also uneven across countries (see Table 2.11). Many of these issues are not yet well-defined conceptually, with few internationally standardised tools available. For instance, only 30% of countries reported collecting (very different) indicators covering the topics of mental health stigma, discrimination, literacy and knowledge of mental health issues and resources.7 However, some countries have recently launched new survey efforts – and developed new methods – given increased interest in mental health awareness. For instance, in 2021 Sweden’s Public Health Agency conducted an online population survey, covering more than 10 000 respondents, on knowledge and attitudes about mental illness and suicide (Public Health Agency Sweden, 2022[25]). After systematically reviewing more than 400 existing instruments for measuring mental health stigma and conducting cognitive testing, the Public Health Agency concluded that the overwhelmingly negative tone of existing measures was in itself stigmatising and focused mostly on examples of severe mental illness. It therefore decided to develop its own survey: the final questionnaire included items that were designed as semantic differentials (word pairs) that captured both positive and negative perceptions of mental illness and focused on all forms of mental illness, including more common experiences of depression, anxiety and stress-related conditions (Public Health Agency Sweden, 2022[25]).

Measuring population mental health outcomes is not a new field for producers of official data in OECD countries, and many national statistical offices and health agencies were already collecting relevant data well before COVID-19. Nevertheless, it is also clear that there is room for improvement moving forward.

First, some aspects of mental health are measured more frequently than others, and there is scope for better cross-country harmonisation. The results of the OECD questionnaire to official data producers suggest that existing data collection efforts are not capturing the full range of mental health outcomes – missing aspects of both mental ill-health and positive mental health. While 86% of countries use a screening tool for symptoms of depression, and 95% for general psychological distress, only 41% use a standardised screening tool for symptoms of anxiety, even though generalised anxiety disorder, along with mood disorders, is one of the most common mental health conditions affecting people in OECD countries. Data collection efforts for other specific mental conditions – such as post-traumatic stress disorder, bipolar disorder and eating disorders – remain very uneven across countries. When it comes to positive mental health, almost all countries gather some form of life evaluation data, but information about affect and eudaimonia is much less frequently collected (by 59% and 24% of countries, respectively), and often not in a standardised manner. As a first step, data producers could hence expand their use of screening tools to those that include symptoms of anxiety as well as depression, and move towards more harmonisation for affective and eudaimonic aspects of positive mental health.

Second, it will be important to measure mental health outcomes regularly, and to keep up some of the momentum provided by the high-frequency surveys with mental health modules initiated during the first two years of the pandemic. Given the trade-offs between response burden and accuracy that data producers face when choosing between different tools to measure mental health outcomes, adding a single question about people’s general mental health status to frequently conducted population surveys could be a way to gather this information regularly and help link data across surveys. Over half of countries (62%) already include such single items in surveys, though question wording varies widely. Canada has been an early leader in developing single-item self-reported mental health (SRMH) indicators, and its question formulation has already been adopted by Chile and Germany, which could make it a useful model for other countries moving forward. While questions about previous diagnoses received from health care professionals are also short, evidence suggests that they mostly capture people who have been in touch with the health system and hence are better placed in health surveys only.

Chapter 3 reviews the available evidence on the statistical quality of these recommended tools in further detail and provides suggestions for three concrete measures that countries could adopt to maximise international harmonisation and minimise response burden.

Lastly, whichever results are communicated to policy makers or the general public, it is essential to be transparent as to which exact aspect of mental health is being measured, including which areas a specific tool covers and does not cover (e.g. only previous diagnosis? only affect, or also somatic symptoms, and if so, which ones?). This information is important to contextualise findings and to provide transparency as to any limitations that might impact the interpretation of results.

Composite International Diagnostic Interview (CIDI): The Composite International Diagnostic Interview (CIDI) is a comprehensive, fully-structured interview designed to be used by trained lay interviewers for the assessment of mental disorders according to the definitions and criteria of ICD-10 and DSM-IV (Kessler and Bedirhan Üstün, 2006[26]). A computer-assisted version of the interview is available along with a direct data entry software system that can be used to keypunch responses to the paper-and-pencil version of the interview. The CIDI is intended for use in epidemiological and cross-cultural studies as well as for clinical and research purposes. It allows investigators to measure the prevalence of lifetime and 12-month mental conditions, the severity and courses of these disorders, their impact on home management, work life, relationships and social life, and service and medications use. Several versions of the CIDI exist, but the latest version is the World Health Organization’s Composite International Diagnostic Interview (WHO-CIDI) V3.0 (Harvard Medical School, n.d.[27]). In total, the CIDI consists of a screening module and 40 sections, 22 of which are diagnostic sections to assess mood (two sections), anxiety (seven sections), substance abuse (two sections), childhood (four sections) and other disorders (seven sections). The remaining sections assess functioning and physical comorbidity, risk factors, socio-demographic information and the treatment of mental disorders. The screening module, which includes a series of introductory questions about the respondent’s general health before delving into the diagnostic stem questions, has been shown to increase the accuracy of diagnostic assessments by reducing the effects of respondent fatigue and unwillingness to disclose on stem question endorsement (Harvard Medical School, n.d.[27]).

The public health tools presented in this section focus mainly on royalty-free instruments, since fees and copyright restrictions might present a barrier to use.

Mental Health Inventory (MHI-5): The Mental Health Inventory-5 (MHI-5) is a five-item scale to screen for symptoms of psychological distress. It is drawn from the 38-item Mental Health Inventory (MHI) and included in the 20-item and 36-item versions of the Short Form Health Survey (SF-20 and SF-36) (Berwick et al., 1991[28]; Kelly et al., 2008[29]). The questions tap into both negative and positive affect, with three items focusing on low/depressed mood and two on nervousness/anxiety (although the tool itself is not used to present these aspects separately). The MHI-5 has been found to be a reliable measure of mental health status and has been validated against both depressive and, to a lesser degree, also anxiety disorders (including generalised anxiety and panic disorder) in general population and patient samples in a range of countries (Yamazaki, Fukuhara and Green, 2005[30]; Hoeymans et al., 2004[31]; Elovanio et al., 2020[32]; Gill et al., 2007[33]; Rumpf et al., 2001[34]; Strand et al., 2003[35]; Thorsen et al., 2013[36]). There is some evidence that removing the two anxiety-related items does not reduce the effectiveness of the MHI in detecting depression, although this has not been examined in studies in which a formal diagnosis according to clinical criteria was used as a gold standard (Yamazaki, Fukuhara and Green, 2005[30]).

The Short-Form Health Survey (SF-12): The Short-Form Health Survey (SF-12) is a tool to measure health-related quality of life. It was developed as a shorter alternative to the SF-36 questionnaire to be used in the general population and in large surveys and contains up to two items for each of the SF-36’s eight dimensions: general mental health, energy and fatigue, bodily pain, general health perceptions, limitations on physical activity due to health, limitations on social activity due to physical or emotional conditions, limitations on day-to-day activities due to physical health, and limitations on day-to-day activities due to emotional health (Ware et al., 2002[37]). A number of questions in both the SF-12 and SF-36 are taken directly from the Mental Health Inventory (MHI), which also features the MHI-5 free-standing scale in its own right (see above) (RAND, n.d.[38]). Two summary scores, the Physical Component Summary (PCS) and the Mental Component Summary (MCS), can be derived from the SF-12, and a range of scoring methods have been validated against both active and recent depressive disorders and, to a lesser degree, also anxiety disorders in general population samples (Ware et al., 2002[37]; Gill et al., 2007[33]; Vilagut et al., 2013[39]). Some evidence suggests that the SF-12’s physical health dimensions might be more strongly related to mental health in low-income settings, with implications for context-specific weights (Ohrnberger et al., 2020[40]). The SF-12 is subject to copyright restrictions and can thus not be republished in this report (Quality Metric, n.d.[41]).

Kessler Scale (K10/K6): The Kessler psychological distress scale, which is most often used in its 10-item (K10) and 6-item (K6) form, is a screening tool for identifying adults with significant levels of psychological distress. The questions focus on somatic symptoms and negative affect, particularly on both low/depressed mood and nervousness/anxiety. While these aspects are usually not presented separately and a total score for distress is usually used, factor analysis has established depression and anxiety as distinct clusters in the K10 (Brooks, Beard and Steel, 2006[42]). Indeed, although it is often applied in primary clinical settings as well, it was designed for use in the general population, and sensitivity and specificity analyses support both K6 and K10 as screening instruments to identify likely community cases of anxiety and depression (Slade, Grove and Burgess, 2011[43]). Furthermore, they have been extensively validated, including in cross-cultural settings, against diagnostic interview evaluations of anxiety and affective disorders, with lesser but significant associations with other mental disorder categories and with the presence of any current mental disorder (Andrews and Slade, 2001[44]). There is also some evidence that the Kessler scales can be used successfully (with lower cut-off scoring criteria) to capture individuals struggling with more moderate psychological distress that nonetheless warrants mental health intervention (Prochaska et al., 2012[45]).

General Health Questionnaire (GHQ-12): The 12-item General Health Questionnaire (GHQ-12) is a measure to detect psychological distress by focusing on affect (negative and positive), somatic symptoms and the functional impairment of respondents. The GHQ-12 has been translated into many languages and extensively validated in general and clinical populations worldwide (particularly against depression and anxiety disorders), including among adolescent samples (Hankins, 2008[48]; Gilbody, 2001[49]; Baksheev et al., 2011[50]). Originally intended as a unidimensional measure, there is some debate about the dimensionality of the GHQ-12, with many factor-analytical studies supporting a range of multidimensional structures (e.g. anxiety and depression, social dysfunction, loss of confidence) (Gao et al., 2004[51]). However, more recent evidence points to these results likely being an expression of method-specific variance caused by item wording, supporting the notion that treating the scale as a unitary construct would minimise bias (Hystad and Johnsen, 2020[52]). The GHQ-12 is subject to copyright restrictions and can thus not be republished in this report.

Patient Health Questionnaire (PHQ-9/PHQ-8): The full Patient Health Questionnaire (PHQ) contains 59 questions, with modules focusing on mood, anxiety, alcohol, eating and somatoform disorders. The PHQ-9 is a nine-question survey designed to detect the presence and severity of depressive symptoms, and it directly maps onto the DSM-IV and DSM-5 symptom criteria for major depressive disorder. The PHQ-8 questionnaire removes the final question regarding suicidal ideation. While a one-factor structure for both the PHQ-8 and PHQ-9 has been identified, more recent studies support a two-factor model composed of affective and somatic factors (Sunderland et al., 2019[53]). Both instruments have shown acceptable diagnostic screening properties across various population and clinical settings, age groups, and cultures/ethnicities, in addition to being reliable and valid measures of depression severity (Manea, Gilbody and McMillan, 2012[54]; Moriarty et al., 2015[55]; Kroenke et al., 2009[56]; Huang et al., 2006[57]; Kroenke, Spitzer and Williams, 2001[58]; Richardson et al., 2010[59]). The close alignment between the PHQ-8/9 and the DSM makes them subject to the same criticism, including a potentially Western-focused construct of depression, relative to longer self-reported scales with less constrained symptom sets (Zimmerman et al., 2012[60]; Haroz et al., 2017[61]).
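The relationship between the two instruments can be illustrated with a short Python sketch. It is illustrative only: the function names are hypothetical, and the 0 to 3 response coding per item is an assumption based on the instrument's commonly published scoring, which this chapter does not itself specify.

```python
from typing import Sequence

PHQ9_ITEM_COUNT = 9
ITEM_MIN, ITEM_MAX = 0, 3  # assumed response coding; not stated in this chapter


def _check_phq9(responses: Sequence[int]) -> None:
    """Raise ValueError unless there are nine responses in the assumed range."""
    if len(responses) != PHQ9_ITEM_COUNT:
        raise ValueError("expected nine item responses")
    if any(r < ITEM_MIN or r > ITEM_MAX for r in responses):
        raise ValueError("item responses must lie between 0 and 3")


def phq9_total(responses: Sequence[int]) -> int:
    """Sum all nine items."""
    _check_phq9(responses)
    return sum(responses)


def phq8_total(responses: Sequence[int]) -> int:
    """Sum the first eight items, dropping the final item on suicidal ideation."""
    _check_phq9(responses)
    return sum(responses[:-1])
```

A survey that fields the PHQ-9 can therefore always report a PHQ-8 score for comparison with countries using the shorter form, whereas the reverse is not possible.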

The Generalised Anxiety Disorder Questionnaire (GAD-7/GAD-2): The Generalised Anxiety Disorder Questionnaire (GAD-7) comprises seven questions about the frequency of broad anxiety-related problems in the past two weeks. It was developed for screening and severity assessment of Generalised Anxiety Disorder, and the items cover most but not all symptoms of this disorder listed in the DSM-IV and DSM-5 (excessive worry, difficulty controlling the worry, restlessness and irritability, but not, for example, fatigue, muscle tension or sleep disturbance). Research supports a unidimensional structure for the scale (Sunderland et al., 2019[53]). The GAD-7 has demonstrated good internal consistency, convergent validity, and sensitivity to change in both patient and population samples (Löwe et al., 2008[62]; Beard and Björgvinsson, 2014[63]). While the scale has been successfully translated into multiple languages and local dialects, more research on potential cross-cultural bias of the tool needs to be conducted (Parkerson et al., 2015[64]; Sunderland et al., 2019[53]). The scale focuses on general symptoms of anxiety and was not developed to assess the presence of other anxiety disorders, such as Social Anxiety Disorder. However, some researchers have argued that it can be used across different anxiety disorders, given the scale’s emphasis on the transdiagnostic process of worry and the fact that Generalised Anxiety Disorder has a high degree of comorbidity (Johnson et al., 2019[65]). The shorter GAD-2 version of this scale uses only the first two items (worry and difficulty controlling the worry), i.e. the core criteria of generalised anxiety per the DSM. Available evidence has indicated support for its psychometric properties and validity in a range of settings (Byrd-Bredbenner, Eck and Quick, 2021[66]; Hughes et al., 2018[67]; Luo et al., 2019[68]; Ahn, Kim and Choi, 2019[69]).

Patient Health Questionnaire (PHQ-4): The PHQ-4 screening tool is a short, four-question tool to identify the presence and severity of core symptoms of both depression and anxiety, given that these are two of the most prevalent illnesses among the general population and often comorbid. The PHQ-4 pulls the two core depression-related questions from the PHQ-9/8 (which together are called the PHQ-2) plus two core anxiety-related questions from GAD-7 (which are called the GAD-2). Thus, the PHQ-4 is a combination of the PHQ-2 and GAD-2, which have independently been shown to be good, brief screening tools with construct and criterion validity (see above). Available evidence supports the PHQ-4’s psychometric properties, reliability and validity in studies focused on the general population, intervention, and workers and college students (Stanhope, 2016[71]; Khubchandani et al., 2016[72]; Löwe et al., 2010[73]).
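The way the PHQ-4 is assembled from its two component scales can be sketched in Python as follows. This is a minimal illustration rather than an official scoring routine: the function name and the dictionary keys are hypothetical, and the 0 to 3 coding of each item is an assumption drawn from the instruments' commonly published scoring rather than from this chapter.

```python
from typing import Dict, Sequence

ITEM_MIN, ITEM_MAX = 0, 3  # assumed response coding; not stated in this chapter


def _check_pair(name: str, items: Sequence[int]) -> None:
    """Raise ValueError unless a two-item subscale is complete and in range."""
    if len(items) != 2:
        raise ValueError(f"{name} requires exactly two item responses")
    if any(i < ITEM_MIN or i > ITEM_MAX for i in items):
        raise ValueError(f"{name} responses must lie between 0 and 3")


def phq4_scores(phq2_items: Sequence[int], gad2_items: Sequence[int]) -> Dict[str, int]:
    """Combine the two depression items (PHQ-2) and two anxiety items (GAD-2)."""
    _check_pair("PHQ-2", phq2_items)
    _check_pair("GAD-2", gad2_items)
    depression = sum(phq2_items)
    anxiety = sum(gad2_items)
    return {
        "depression": depression,        # PHQ-2 subscale
        "anxiety": anxiety,              # GAD-2 subscale
        "total": depression + anxiety,   # PHQ-4 composite
    }
```

Keeping the two subscale scores alongside the composite allows results to be compared both with surveys that field only the PHQ-2 or only the GAD-2 and with those that report the combined PHQ-4.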

Washington Group on Disability Statistics Short Set on Functioning – Enhanced (WG-SS): The Washington Group Short Set on Functioning – Enhanced (WG-SS Enhanced) was developed by the Washington Group on Disability Statistics, which is composed of representatives from National Statistics Offices, as well as UN agencies, international non-governmental organisations and organisations for people who are disabled, to capture not only the presence but also the type and severity of a respondent’s disability for use in population and special interest surveys (Washington Group on Disability Statistics, 2020[75]). Its focus is on functioning in the areas of seeing, hearing, walking or climbing stairs, remembering or concentrating, self-care, communication, upper body activities, as well as affect. The four questions on the latter focus on symptoms of depression and anxiety, though the questionnaire is not typically used in its subcomponent parts. Regardless, the focus on overall functioning might carry important ways forward for capturing transdiagnostic symptoms of mental ill-health.

Alcohol Use Disorders Identification Test/Concise (AUDIT/AUDIT-C): The Alcohol Use Disorders Identification Test (AUDIT) is a 10-item alcohol screen developed by the WHO from the 1980s onwards that can help identify respondents or patients who are hazardous drinkers or have active alcohol use disorders (including alcohol abuse or dependence). Its validity has been demonstrated in settings beyond primary care, such as inpatient hospital wards, emergency departments, universities, workplaces, outpatient settings and psychiatric services (Berner et al., 2007[76]). Its short version of 3 items, designed to be integrated into routine patient interviews, has been found to have similar accuracy to the full-scale version and has been validated primarily in primary-care settings, as well as increasingly in more general population samples, including adults seeking online help with drinking (Bush et al., 1998[77]; Khadjesari et al., 2017[78]).

Core questions from the OECD Guidelines on Measuring Subjective Well-being: The OECD Guidelines on Subjective Well-being propose a minimal set of measures of subjective well-being covering both life evaluation and (short-term) affect that could be included in household surveys (OECD, 2013[80]). The core measures included are the ones which have the strongest evidence when it comes to validity and relevance, and for which international comparability is the most important. An experimental measure of an aspect of eudaimonic well-being is also included.

WHO-5 Well-being index (WHO-5): The World Health Organization Well-Being Index (WHO-5) is a short questionnaire of 5 items that focus on a respondent’s positive affect. The questionnaire, adapted from the longer WHO/ICD-10 Depression Diagnosis and DSM-IV Depression scale by selecting a subset of positively phrased items, was first used in a project on well-being measures in primary health care by the WHO Regional Office in Europe in 1998 and has since been translated into more than 30 languages (World Health Organization, 1998[81]; Topp et al., 2015[7]). The WHO-5 has been applied as a generic scale for well-being across a wide range of study fields and countries, as a sensitive screening tool for depression as well as an outcome measure in clinical trials (Topp et al., 2015[7]). Studies of younger and elderly persons indicated a unidimensional structure for this scale (Topp et al., 2015[7]).

SF-36 Energy/Vitality subscale: The 4-item vitality subscale of the larger SF-36 measure (see above) is a general measure of energy/fatigue. It has been validated in clinical settings and performed well compared to longer scales (e.g. for cancer-related fatigue) (Brown et al., 2011[83]).

Satisfaction with Life Scale (SWLS): The Satisfaction with Life Scale was developed to assess people’s satisfaction and evaluation of their lives as a whole, rather than focusing on specific life domains. Early studies have found it to show good convergent validity with other types of subjective well-being, while being distinct from affective well-being measures (Pavot et al., 1991[86]; Pavot and Diener, 1993[87]).

The Mental Health Continuum Short-Form (MHC-SF): The MHC-SF is a 14-item scale developed by Keyes to capture positive mental health in his dual-continuum model (Keyes, 2002[89]). It was derived from the 40-item Mental Health Continuum Long Form (MHC-LF), and consists of separate subscales: three “emotional well-being” items (reflecting affective well-being plus life satisfaction), five “social well-being” items, and six “psychological well-being” items (which when combined reflect eudaimonic well-being) (Lamers et al., 2011[90]). Studies have shown high internal and moderate test-retest reliability for the MHC-SF and confirmed the 3-factor structure of the subscales, which also show convergent validity with corresponding aspects of well-being and functioning (Lamers et al., 2011[90]).
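The subscale structure described above can be made concrete with a short Python sketch. It is a hedged illustration, not the instrument's official scoring procedure: the function name and dictionary keys are hypothetical, and the item ordering (emotional items first, then social, then psychological) is an assumption about questionnaire layout that this chapter does not state.

```python
from typing import Dict, Sequence

MHCSF_ITEM_COUNT = 14

# Assumed item ordering: 3 emotional, then 5 social, then 6 psychological items.
SUBSCALE_SLICES = {
    "emotional": slice(0, 3),
    "social": slice(3, 8),
    "psychological": slice(8, 14),
}


def mhcsf_subscales(responses: Sequence[int]) -> Dict[str, int]:
    """Sum item responses within each of the three MHC-SF subscales."""
    if len(responses) != MHCSF_ITEM_COUNT:
        raise ValueError("expected fourteen item responses")
    scores = {name: sum(responses[s]) for name, s in SUBSCALE_SLICES.items()}
    # Social and psychological well-being together reflect eudaimonic well-being.
    scores["eudaimonic"] = scores["social"] + scores["psychological"]
    return scores
```

Reporting the subscales separately, rather than only a 14-item total, keeps the distinction between affective and eudaimonic aspects of positive mental health visible in published results.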

The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS): The 14-item WEMWBS scale was developed with funding from NHS Health Scotland in 2005 to measure mental well-being (conceived of as “both feeling good and functioning well”), taking the Affectometer 2 instrument as the starting point (Warwick Medical School, 2021[91]). Some studies confirmed a unidimensional structure for WEMWBS, while others identified three residual factors relating to affective well-being, psychological functioning or eudaimonia, and social relationships (Shannon et al., 2020[92]; Koushede et al., 2019[93]). A shorter, 7-item version of the scale, SWEMWBS, is also available, focusing slightly less on affect (Stewart-Brown et al., 2009[94]). (S)WEMWBS has been validated in various populations and among different subgroups, including adolescents, clinical samples and ethnic minority samples, and has been translated into more than 25 languages and validated in Norwegian, Swedish, Italian, Dutch, Danish, German, French and Spanish. Both scales have been shown to be sensitive to changes that occur in mental well-being promotion and mental illness treatment and prevention projects (Koushede et al., 2019[93]). Both instruments can distinguish mental well-being between subgroups, but SWEMWBS has been found to be less sensitive than the longer version to gender differences (Koushede et al., 2019[93]; Ng Fat et al., 2017[95]).

References

[46] ABS (2007), Information Paper: Use of the Kessler Psychological Distress Scale in ABS Health Surveys, Australian Bureau of Statistics, https://www.abs.gov.au/ausstats/abs@.nsf/lookup/4817.0.55.001chapter92007-08.

[69] Ahn, J., Y. Kim and K. Choi (2019), “The psychometric properties and clinical utility of the Korean version of GAD-7 and GAD-2”, Frontiers in Psychiatry, Vol. 10, p. 127, https://doi.org/10.3389/fpsyt.2019.00127.

[44] Andrews, G. and T. Slade (2001), “Interpreting scores on the Kessler Psychological Distress Scale (K10)”, Australian and New Zealand Journal of Public Health, Vol. 25/6, pp. 494-497, https://doi.org/10.1111/j.1467-842X.2001.tb00310.x.

[50] Baksheev, G. et al. (2011), “Validity of the 12-item General Health Questionnaire (GHQ-12) in detecting depressive and anxiety disorders among high school students”, Psychiatry Research, Vol. 187/1-2, pp. 291-296, https://doi.org/10.1016/j.psychres.2010.10.010.

[63] Beard, C. and T. Björgvinsson (2014), “Beyond generalized anxiety disorder: Psychometric properties of the GAD-7 in a heterogeneous psychiatric sample”, Journal of Anxiety Disorders, Vol. 28/6, pp. 547-552, https://doi.org/10.1016/j.janxdis.2014.06.002.

[76] Berner, M. et al. (2007), “The Alcohol Use Disorders Identification Test for Detecting At-Risk Drinking: A Systematic Review and Meta-Analysis”, Journal of Studies on Alcohol and Drugs, Vol. 68/3, pp. 461-473, https://doi.org/10.15288/jsad.2007.68.461.

[28] Berwick, D. et al. (1991), “Performance of a five-item mental health screening test”, Medical Care, Vol. 29/2, pp. 169-176, https://doi.org/10.1097/00005650-199102000-00008.

[42] Brooks, R., J. Beard and Z. Steel (2006), “Factor structure and interpretation of the K10”, Psychological Assessment, Vol. 18/1, pp. 62-70, https://doi.org/10.1037/1040-3590.18.1.62.

[83] Brown, L. et al. (2011), “Comparison of SF-36 vitality scale and Fatigue Symptom Inventory in assessing cancer-related fatigue”, Supportive Care in Cancer, Vol. 19/8, pp. 1255-1259, https://doi.org/10.1007/s00520-011-1148-2.

[4] Burger, H. and J. Neeleman (2007), “A glossary on psychiatric epidemiology”, Journal of Epidemiology and Community Health, Vol. 61/3, pp. 185-189, https://doi.org/10.1136/jech.2003.019430.

[77] Bush, K. et al. (1998), “The AUDIT alcohol consumption questions (AUDIT-C): An effective brief screening test for problem drinking”, Archives of Internal Medicine, Vol. 158/16, https://doi.org/10.1001/archinte.158.16.1789.

[66] Byrd-Bredbenner, C., K. Eck and V. Quick (2021), “GAD-7, GAD-2, and GAD-mini: Psychometric properties and norms of university students in the United States”, General Hospital Psychiatry, Vol. 69, pp. 61-66, https://doi.org/10.1016/j.genhosppsych.2021.01.002.

[2] Case, A. and A. Deaton (2017), Mortality and Morbidity in the 21st Century, Brookings Papers on Economic Activity, http://brookings.edu/wp-content/uploads/2017/08/casetextsp17bpea.pdf.

[21] Chile, G. (2021), Saludable Mente [Healthy Mind], https://www.gob.cl/saludablemente/.

[88] Diener, E. et al. (1985), “The Satisfaction with Life Scale”, Journal of Personality Assessment, Vol. 49/1, pp. 71-75, https://doi.org/10.1207/s15327752jpa4901_13.

[9] Dobson, K. et al. (2020), “Trends in the prevalence of depression and anxiety disorders among Canadian working-age adults between 2000 and 2016”, Health Reports, Vol. 31/12, pp. 12-23, https://doi.org/10.25318/82-003-X202001200002-ENG.

[85] Donovan, K. et al. (2008), “Identifying clinically meaningful fatigue with the fatigue symptom inventory”, Journal of Pain and Symptom Management, Vol. 36/5, pp. 480-487, https://doi.org/10.1016/j.jpainsymman.2007.11.013.

[32] Elovanio, M. et al. (2020), “General Health Questionnaire (GHQ-12), Beck Depression Inventory (BDI-6), and Mental Health Index (MHI-5): Psychometric and predictive properties in a Finnish population-based sample”, Psychiatry Research, Vol. 289, p. 112973, https://doi.org/10.1016/j.psychres.2020.112973.

[51] Gao, F. et al. (2004), “Does the 12-item General Health Questionnaire contain multiple factors and do we need them?”, Health and Quality of Life Outcomes, Vol. 2/1, p. 63, https://doi.org/10.1186/1477-7525-2-63.

[49] Gilbody, S. (2001), “Routinely administered questionnaires for depression and anxiety: systematic review”, BMJ, Vol. 322/7283, pp. 406-409, https://doi.org/10.1136/bmj.322.7283.406.

[33] Gill, S. et al. (2007), “Validity of the mental health component scale of the 12-item Short-Form Health Survey (MCS-12) as measure of common mental disorders in the general population”, Psychiatry Research, Vol. 152/1, pp. 63-71, https://doi.org/10.1016/j.psychres.2006.11.005.

[48] Hankins, M. (2008), “The reliability of the twelve-item general health questionnaire (GHQ-12) under realistic assumptions”, BMC Public Health, Vol. 8/1, p. 355, https://doi.org/10.1186/1471-2458-8-355.

[61] Haroz, E. et al. (2017), “How is depression experienced around the world? A systematic review of qualitative literature”, Social Science & Medicine, Vol. 183, pp. 151-162, https://doi.org/10.1016/j.socscimed.2016.12.030.

[27] Harvard Medical School (n.d.), About the WHO WMH-CIDI, https://www.hcp.med.harvard.edu/wmhcidi/about-the-who-wmh-cidi/ (accessed on 31 May 2022).

[20] HHS (2021), U.S. Surgeon General Issues Advisory on Youth Mental Health Crisis Further Exposed by COVID-19 Pandemic, United States Department of Health and Human Services (HHS), https://www.hhs.gov/about/news/2021/12/07/us-surgeon-general-issues-advisory-on-youth-mental-health-crisis-further-exposed-by-covid-19-pandemic.html.

[31] Hoeymans, N. et al. (2004), “Measuring mental health of the Dutch population: A comparison of the GHQ-12 and the MHI-5”, Health and Quality of Life Outcomes, Vol. 2/23, https://doi.org/10.1186/1477-7525-2-2.

[57] Huang, F. et al. (2006), “Using the Patient Health Questionnaire-9 to measure depression among racially and ethnically diverse primary care patients”, Journal of General Internal Medicine, Vol. 21/6, pp. 547-552, https://doi.org/10.1111/j.1525-1497.2006.00409.x.

[67] Hughes, A. et al. (2018), “Diagnostic and clinical utility of the GAD-2 for screening anxiety symptoms in individuals with multiple sclerosis”, Archives of Physical Medicine and Rehabilitation, Vol. 99/10, pp. 2045-2049, https://doi.org/10.1016/j.apmr.2018.05.029.

[52] Hystad, S. and B. Johnsen (2020), “The Dimensionality of the 12-Item General Health Questionnaire (GHQ-12): Comparisons of Factor Structures and Invariance Across Samples and Time”, Frontiers in Psychology, Vol. 11, https://doi.org/10.3389/fpsyg.2020.01300.

[65] Johnson, S. et al. (2019), “Psychometric properties of the General Anxiety Disorder 7-Item (GAD-7) scale in a heterogeneous psychiatric sample”, Frontiers in Psychology, Vol. 10, https://doi.org/10.3389/fpsyg.2019.01713.

[29] Kelly, M. et al. (2008), “Evaluating cutpoints for the MHI-5 and MCS using the GHQ-12: A comparison of five different methods”, BMC Psychiatry, Vol. 8/10, https://doi.org/10.1186/1471-244X-8-10.

[26] Kessler, R. and T. Bedirhan Üstün (2006), “The World Mental Health (WMH) survey initiative version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI)”, International Journal of Methods in Psychiatric Research, Vol. 13/2, pp. 93-121, https://doi.org/10.1002/mpr.168.

[47] Kessler, R. et al. (2010), “Screening for serious mental illness in the general population with the K6 screening scale: results from the WHO World Mental Health (WMH) survey initiative”, International Journal of Methods in Psychiatric Research, Vol. 19/S1, pp. 4-22, https://doi.org/10.1002/mpr.310.

[89] Keyes, C. (2002), “The mental health continuum: From languishing to flourishing in life”, Journal of Health and Social Behavior, Vol. 43/2, pp. 207-222, https://doi.org/10.2307/3090197.

[78] Khadjesari, Z. et al. (2017), “Validation of the AUDIT-C in adults seeking help with their drinking online”, Addiction Science & Clinical Practice, Vol. 12/1, p. 2, https://doi.org/10.1186/s13722-016-0066-5.

[72] Khubchandani, J. et al. (2016), “The psychometric properties of PHQ-4 depression and anxiety screening scale among college students”, Archives of Psychiatric Nursing, Vol. 30/4, pp. 457-462, https://doi.org/10.1016/j.apnu.2016.01.014.

[10] KOSIS (n.d.), Annual prevalence of mental disorders (adjusted for sex and age) (database), Korean Statistical Information Service, https://kosis.kr/statHtml/statHtml.do?orgId=117&tblId=TX_117_2009_HB027&conn_path=I2 (accessed on 16 May 2022).

[11] KOSIS (n.d.), Depressive disorder prevalence (database), National Health and Nutrition Survey, Korean Statistical Information Service, https://knhanes.kdca.go.kr/knhanes/sub01/sub01_05.do#none (accessed on 31 August 2022).

[93] Koushede, V. et al. (2019), “Measuring mental well-being in Denmark: Validation of the original and short version of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS and SWEMWBS) and cross-cultural comparison across four European settings”, Psychiatry Research, Vol. 271, pp. 502-509, https://doi.org/10.1016/j.psychres.2018.12.003.

[58] Kroenke, K., R. Spitzer and J. Williams (2001), “The PHQ-9: Validity of a brief depression severity measure”, Journal of General Internal Medicine, Vol. 16/9, https://doi.org/10.1046/j.1525-1497.2001.016009606.x.

[74] Kroenke, K. et al. (2009), “An ultra-brief screening scale for anxiety and depression: The PHQ-4”, Psychosomatics, Vol. 50/6, pp. 613-621, https://doi.org/10.1176/APPI.PSY.50.6.613.

[56] Kroenke, K. et al. (2009), “The PHQ-8 as a measure of current depression in the general population”, Journal of Affective Disorders, Vol. 114/1-3, pp. 163-173, https://doi.org/10.1016/j.jad.2008.06.026.

[90] Lamers, S. et al. (2011), “Evaluating the psychometric properties of the Mental Health Continuum-Short Form (MHC-SF)”, Journal of Clinical Psychology, Vol. 67/1, pp. 99-110, https://doi.org/10.1002/jclp.20741.

[62] Löwe, B. et al. (2008), “Validation and standardization of the Generalized Anxiety Disorder screener (GAD-7) in the general population”, Medical Care, Vol. 46/3, pp. 266-274, https://doi.org/10.1097/mlr.0b013e318160d093.

[73] Löwe, B. et al. (2010), “A 4-item measure of depression and anxiety: Validation and standardization of the Patient Health Questionnaire-4 (PHQ-4) in the general population”, Journal of Affective Disorders, Vol. 122/1-2, pp. 86-95, https://doi.org/10.1016/j.jad.2009.06.019.

[68] Luo, Z. et al. (2019), “Adaptation of the two-item generalized anxiety disorder scale (GAD-2) to Chinese rural population: A validation study and meta-analysis”, General Hospital Psychiatry, Vol. 60, pp. 50-56, https://doi.org/10.1016/j.genhosppsych.2019.07.008.

[54] Manea, L., S. Gilbody and D. McMillan (2012), “Optimal cut-off score for diagnosing depression with the Patient Health Questionnaire (PHQ-9): A meta-analysis”, Canadian Medical Association Journal, Vol. 184/3, pp. E191-E196, https://doi.org/10.1503/cmaj.110829.

[55] Moriarty, A. et al. (2015), “Screening and case finding for major depressive disorder using the Patient Health Questionnaire (PHQ-9): A meta-analysis”, General Hospital Psychiatry, Vol. 37/6, pp. 567-576, https://doi.org/10.1016/j.genhosppsych.2015.06.012.

[3] Mueller, A. and D. Segal (2015), “Structured versus Semistructured versus Unstructured Interviews”, in Cautin, R. and S. Lilienfeld (eds.), The Encyclopedia of Clinical Psychology, John Wiley & Sons, Inc, https://www.researchgate.net/profile/Daniel-Segal-6/publication/313966231_Structured_versus_Semistructured_versus_Unstructured_Interviews/links/599a13bcaca272e41d3ec727/Structured-versus-Semistructured-versus-Unstructured-Interviews.pdf (accessed on 30 March 2022).

[6] National Academies of Sciences Engineering and Medicine (2021), Mental Health, Substance Use, and Wellbeing in Higher Education: Supporting the Whole Student, The National Academies Press, Washington, D.C., https://doi.org/10.17226/26015.

[13] National Center for Health Statistics (2021), Estimates of Mental Health Symptomatology, by Month of Interview: United States, 2019 (database), U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, https://www.cdc.gov/nchs/data/nhis/mental-health-monthly-508.pdf (accessed on 16 May 2022).

[95] Ng Fat, L. et al. (2017), “Evaluating and establishing national norms for mental wellbeing using the short Warwick–Edinburgh Mental Well-being Scale (SWEMWBS): Findings from the Health Survey for England”, Quality of Life Research, Vol. 26/5, pp. 1129-1144, https://doi.org/10.1007/s11136-016-1454-8.

[1] OECD (2021), A New Benchmark for Mental Health Systems: Tackling the Social and Economic Costs of Mental Ill-Health, OECD Health Policy Studies, OECD Publishing, Paris, https://doi.org/10.1787/4ed890f6-en.

[18] OECD (2021), COVID-19 and Well-being: Life in the Pandemic, OECD Publishing, Paris, https://doi.org/10.1787/1e1ecb53-en.

[17] OECD (2021), Health at a Glance, OECD Publishing, Paris, https://doi.org/10.1787/19991312.

[19] OECD (2021), Supporting young people’s mental health through the COVID-19 crisis, OECD Policy Responses to Coronavirus (COVID-19), OECD Publishing, Paris, https://www.oecd.org/coronavirus/policy-responses/supporting-young-people-s-mental-health-through-the-covid-19-crisis-84e143e5/.

[16] OECD (2020), How’s Life? 2020: Measuring Well-being, OECD Publishing, Paris, https://doi.org/10.1787/9870c393-en.

[80] OECD (2013), OECD Guidelines on Measuring Subjective Well-being, OECD Publishing, Paris, https://doi.org/10.1787/9789264191655-en.

[23] OECD/European Union (2018), Health at a Glance: Europe 2018: State of Health in the EU Cycle, OECD Publishing, Paris/European Union, Brussels, https://doi.org/10.1787/health_glance_eur-2018-en.

[40] Ohrnberger, J. et al. (2020), “Validation of the SF12 mental and physical health measure for the population from a low-income country in sub-Saharan Africa”, Health and Quality of Life Outcomes, Vol. 18/78, https://doi.org/10.1186/s12955-020-01323-1.

[64] Parkerson, H. et al. (2015), “Cultural-based biases of the GAD-7”, Journal of Anxiety Disorders, Vol. 31, pp. 38-42, https://doi.org/10.1016/j.janxdis.2015.01.005.

[87] Pavot, W. and E. Diener (1993), “Review of the Satisfaction With Life Scale”, Psychological Assessment, Vol. 5/2, pp. 164-172, https://doi.org/10.1037/1040-3590.5.2.164.

[86] Pavot, W. et al. (1991), “Further validation of the Satisfaction with Life Scale: evidence for the cross-method convergence of well-being measures”, Journal of Personality Assessment, Vol. 57/1, pp. 149-161, https://doi.org/10.1207/s15327752jpa5701_17.

[45] Prochaska, J. et al. (2012), “Validity study of the K6 scale as a measure of moderate mental distress based on mental health treatment need and utilization”, International Journal of Methods in Psychiatric Research, Vol. 21/2, pp. 88-97, https://doi.org/10.1002/mpr.1349.

[25] Public Health Agency Sweden (2022), Synen på psykisk ohälsa och suicid [Views on mental ill-health and suicide], https://www.folkhalsomyndigheten.se/publicerat-material/publikationsarkiv/s/synen-pa-psykisk-ohalsa-och-suicid-/?pub=105538#105580.

[41] Quality Metric (n.d.), The SF-12v2 PRO Health Survey, https://www.qualitymetric.com/health-surveys-old/the-sf-12v2-health-survey/ (accessed on 10 May 2022).

[38] RAND (n.d.), Mental Health Inventory Survey, https://www.rand.org/health-care/surveys_tools/mos/mental-health.html (accessed on 10 May 2022).

[59] Richardson, L. et al. (2010), “Evaluation of the Patient Health Questionnaire-9 item for detecting major depression among adolescents”, Pediatrics, Vol. 126/6, pp. 1117-1123, https://doi.org/10.1542/peds.2010-0852.

[5] Ruedgers, R. (2001), Handbook of diagnostic and structured interviewing, Guilford Press, https://psycnet.apa.org/record/2001-05047-000.

[34] Rumpf, H. et al. (2001), “Screening for mental health: validity of the MHI-5 using DSM-IV Axis I psychiatric disorders as gold standard”, Psychiatry Research, Vol. 105/3, pp. 243-253, https://doi.org/10.1016/S0165-1781(01)00329-8.

[12] SAMHSA (2019), Key Substance Use and Mental Health Indicators in the United States: Results from the 2018 National Survey on Drug Use and Health (database), Substance Abuse and Mental Health Services Administration, Rockville, MD, https://www.samhsa.gov/data/sites/default/files/cbhsq-reports/NSDUHNationalFindingsReport2018/NSDUHNationalFindingsReport2018.pdf (accessed on 16 May 2022).

[82] Sándor, E. et al. (2021), Impact of COVID-19 on young people in the EU, Publications Office of the European Union, https://www.eurofound.europa.eu/publications/report/2021/impact-of-covid-19-on-young-people-in-the-eu.

[22] Santé Publique France (2021), La santé mentale au temps de la COVID-19 : En parler, c’est déjà se soigner [Mental health in the time of COVID-19: Talking about it is already treating yourself], Santé Publique France, https://www.santepubliquefrance.fr/presse/2021/la-sante-mentale-au-temps-de-la-covid-19-en-parler-c-est-deja-se-soigner.

[24] Santomauro, D. et al. (2021), “Global prevalence and burden of depressive and anxiety disorders in 204 countries and territories in 2020 due to the COVID-19 pandemic”, The Lancet, Vol. 398/10312, pp. 1700-1712, https://doi.org/10.1016/S0140-6736(21)02143-7.

[92] Shannon, S. et al. (2020), “Testing the factor structure of the Warwick-Edinburgh Mental Well-Being Scale in adolescents: A bi-factor modelling methodology”, Psychiatry Research, Vol. 293, https://doi.org/10.1016/j.psychres.2020.113393.

[43] Slade, T., R. Grove and P. Burgess (2011), “Kessler Psychological Distress Scale: Normative Data from the 2007 Australian National Survey of Mental Health and Wellbeing”, Australian & New Zealand Journal of Psychiatry, Vol. 45/4, pp. 308-316, https://doi.org/10.3109/00048674.2010.543653.

[70] Spitzer, R. et al. (2006), “A brief measure for assessing Generalized Anxiety Disorder: The GAD-7”, Archives of Internal Medicine, Vol. 166/10, pp. 1092-1097, https://doi.org/10.1001/ARCHINTE.166.10.1092.

[71] Stanhope, J. (2016), “Patient Health Questionnaire-4”, Occupational Medicine, Vol. 66/9, pp. 760-761, https://doi.org/10.1093/occmed/kqw165.

[8] Statistics Canada (2013), Canadian Community Health Survey: Mental Health, 2012, The Daily, https://www150.statcan.gc.ca/n1/daily-quotidien/130918/dq130918a-eng.htm (accessed on 16 May 2022).

[94] Stewart-Brown, S. et al. (2009), “Internal construct validity of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS): a Rasch analysis using data from the Scottish Health Education Population Survey”, Health and Quality of Life Outcomes, Vol. 7/1, p. 15, https://doi.org/10.1186/1477-7525-7-15.

[35] Strand, B. et al. (2003), “Measuring the mental health status of the Norwegian population: A comparison of the instruments SCL-25, SCL-10, SCL-5 and MHI-5 (SF-36)”, Nordic Journal of Psychiatry, Vol. 57/2, pp. 113-118, https://doi.org/10.1080/08039480310000932.

[53] Sunderland, M. et al. (2019), “Self-Report Scales for Common Mental Disorders”, in The Cambridge Handbook of Clinical Assessment and Diagnosis, Cambridge University Press, https://doi.org/10.1017/9781108235433.019.

[36] Thorsen, S. et al. (2013), “The predictive value of mental health for long-term sickness absence: the Major Depression Inventory (MDI) and the Mental Health Inventory (MHI-5) compared”, BMC Medical Research Methodology, Vol. 13/115, https://doi.org/10.1186/1471-2288-13-115.

[7] Topp, C. et al. (2015), “The WHO-5 well-being index: A systematic review of the literature”, Psychotherapy and Psychosomatics, Vol. 84/3, pp. 167-176, https://doi.org/10.1159/000376585.

[15] University of Essex, Institute for Social and Economic Research (2022), Understanding Society: Waves 1-11, 2009-2020 and Harmonised BHPS: Waves 1-18, 1991-2009 (database), 5th Edition, UK Data Service, https://doi.org/10.5255/UKDA-SN-6614-16 (accessed on 10 June 2022).

[39] Vilagut, G. et al. (2013), “The mental component of the Short-Form 12 health survey (SF-12) as a measure of depressive disorders in the general population: Results with three alternative scoring methods”, Value in Health, Vol. 16/4, pp. 564-573, https://doi.org/10.1016/j.jval.2013.01.006.

[84] Ware, J. et al. (1993), SF-36 Health Survey: Manual and Interpretation Guide, The Health Institute, New England Medical Center Hospitals, https://www.researchgate.net/profile/John-Ware-6/publication/313050850_SF-36_Health_Survey_Manual_Interpretation_Guide/links/594a5b83aca2723195de5c3d/SF-36-Health-Survey-Manual-Interpretation-Guide.pdf (accessed on 22 January 2023).

[37] Ware, J. et al. (2002), How to score SF-12 items: How to score version 2 of the SF-12 Health Survey, https://www.researchgate.net/publication/291994160.

[91] Warwick Medical School (2021), The Warwick-Edinburgh Mental Wellbeing Scale (WEMWBS), https://warwick.ac.uk/fac/sci/med/research/platform/wemwbs/.

[75] Washington Group on Disability Statistics (2020), The Washington Group Short Set on Functioning: Enhanced (WG-SS Enhanced), https://www.washingtongroup-disability.com/fileadmin/uploads/wg/Documents/Washington_Group_Questionnaire__3_-_WG_Short_Set_on_Functioning_-_Enhanced.pdf.

[14] Woodhead, C. et al. (2012), “Impact of exposure to combat during deployment to Iraq and Afghanistan on mental health by gender”, Psychological Medicine, Vol. 42/9, pp. 1985-1996, https://doi.org/10.1017/S003329171100290X.

[79] World Health Organization (2001), AUDIT: The Alcohol Use Disorders Identification Test: Guidelines for use in primary health care, World Health Organization, https://www.who.int/publications/i/item/audit-the-alcohol-use-disorders-identification-test-guidelines-for-use-in-primary-health-care.

[81] World Health Organization (1998), Wellbeing measures in primary health care/the DepCare Project: Report on a WHO meeting, World Health Organization, https://apps.who.int/iris/handle/10665/349766.

[30] Yamazaki, S., S. Fukuhara and J. Green (2005), “Usefulness of five-item and three-item Mental Health Inventories to screen for depressive symptoms in the general population of Japan”, Health and Quality of Life Outcomes, Vol. 3/1, pp. 1-7, https://doi.org/10.1186/1477-7525-3-48.

[60] Zimmerman, M. et al. (2012), “How can we use depression severity to guide treatment selection when measures of depression categorize patients differently?”, Journal of Clinical Psychiatry, Vol. 73/10, pp. 1287-1291, https://doi.org/10.4088/JCP.12m07775.

Notes

1. Of course, this implies that diagnoses reached through clinical interviews are only as valid as the classification system they are based on (Mueller and Segal, 2015[3]) (see also Box 3.4 in Chapter 3).

2. Of course, the coverage of household surveys is also not complete and includes only those sampled. Typically, people living in institutional settings as well as the homeless (who are likely to have higher prevalence of mental ill-health than the general population) are not taken into account.

3. The following countries responded to the questionnaire: Australia, Austria, Belgium, Bulgaria, Canada, Chile, Colombia, Costa Rica, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Japan, Korea, Latvia, Lithuania, Luxembourg, Mexico, the Netherlands, Norway, New Zealand, Poland, Portugal, Slovenia, the Slovak Republic, Spain, Sweden, Switzerland, Türkiye, the United Kingdom and the United States.

4. The OECD also publishes administrative data on mental health service provision, such as the number of psychiatrists, psychologists or mental health professionals per 100 000 population; the number of hospital beds devoted to mental health care; spending on mental health services; etc. (OECD, 2021[17]). As these are not considered population-level mental health outcomes, they are not further considered for the purposes of this project.

5. Percentages do not add up to 68% because some countries did both, introducing new stand-alone surveys as well as adding mental health modules to existing surveys.

6. Furthermore, while the MHI-5 appeared in the well-being ad hoc modules of the 2013 and 2018 European Union Statistics on Income and Living Conditions (EU-SILC) survey administered by Eurostat, the tool has been removed from future well-being modules. Use of the MHI-5 may therefore diminish significantly, although some individual member states may elect to keep the measure in their own national health and/or well-being surveys.

7. For an extended discussion of surveys used to measure attitudes and stigma towards mental health, refer to Table 6.2 in OECD (2021[1]).

Metadata, Legal and Rights

This document, as well as any data and map included herein, are without prejudice to the status of or sovereignty over any territory, to the delimitation of international frontiers and boundaries and to the name of any territory, city or area. Extracts from publications may be subject to additional disclaimers, which are set out in the complete version of the publication, available at the link provided.

© OECD 2023

The use of this work, whether digital or print, is governed by the Terms and Conditions to be found at https://www.oecd.org/termsandconditions.