Reader’s guide

The definition of a mental health outcome varies throughout this report, depending on the specific outcome and the specific tool used to measure it. This section explains each mental health screening tool in more detail, covering first the tools for mental ill-health (encompassing outcomes relating to mental distress and depression) and then those for positive mental health. For more information about tools that can be used to measure population-level mental health outcomes, refer to Measuring Population Mental Health (OECD, 2023[1]).

Cross-country data on population mental health outcomes remain patchy. Throughout this report, data from existing internationally comparative surveys are therefore used to illustrate the interlinkages between mental health and different well-being domains; details of these surveys can be found in this section. The country coverage is primarily European, given data availability. Similarly, the mental health tools used in this report reflect the current availability of internationally harmonised data on mental health, rather than an endorsement of any one mental health screening tool.

Mental Health Inventory (MHI-5)

The Mental Health Inventory-5 (MHI-5) is a five-item scale to screen for symptoms of psychological distress. It is drawn from the 38-item Mental Health Inventory (MHI) and included in the 20-item and 36-item versions of the Short Form Health Survey (SF-20 and SF-36) (Berwick et al., 1991[2]; Kelly et al., 2008[3]). The questions tap into both negative and positive affect, with three items focusing on low/depressed mood and two on nervousness/anxiety (although the tool itself is not used to present these aspects separately). The MHI-5 has been found to be a reliable measure of mental health status, and has been validated against depressive disorders and, to a lesser degree, anxiety disorders (including generalised anxiety and panic disorder) in general population and patient samples in a range of countries (Yamazaki, Fukuhara and Green, 2005[4]; Hoeymans et al., 2004[5]; Elovanio et al., 2020[6]; Gill et al., 2007[7]; Rumpf et al., 2001[8]; Strand et al., 2003[9]; Thorsen et al., 2013[10]). There is some evidence that removing the two anxiety-related items does not reduce the effectiveness of the MHI in detecting depression, although this has not been examined in studies in which a formal diagnosis according to clinical criteria was used as a gold standard (Yamazaki, Fukuhara and Green, 2005[4]).

This report uses the MHI-5 screener tool as it is used in the 2013 and 2018 waves of the European Union Statistics on Income and Living Conditions (EU-SILC) surveys. For the figures in this report, those whose MHI-5 score is less than or equal to 52 are considered to be at risk for mental distress, and those whose score is greater than 52 are not considered to be at risk.

General Health Questionnaire (GHQ-12)

The 12-item General Health Questionnaire (GHQ-12) is a measure to detect psychological distress by focusing on affect (negative and positive), somatic symptoms and the functional impairment of respondents. The GHQ-12 has been translated into many languages and extensively validated in general and clinical populations worldwide (particularly against depression and anxiety disorders), including among adolescent samples (Hankins, 2008[11]; Gilbody, 2001[12]; Baksheev et al., 2011[13]). Although the GHQ-12 was originally intended as a unidimensional measure, its dimensionality is debated, with many factor-analytical studies supporting a range of multidimensional structures (e.g. anxiety and depression, social dysfunction, loss of confidence) (Gao et al., 2004[14]). However, more recent evidence suggests that these results likely reflect method-specific variance caused by item wording, which supports the notion that treating the scale as a unitary construct would minimise bias (Hystad and Johnsen, 2020[15]). The GHQ-12 is subject to copyright restrictions and thus cannot be republished in this report.

This report uses the GHQ-12 as it appears in the United Kingdom Household Longitudinal Study (UKHLS), for inclusion in cross-lagged panel models illustrating the bidirectional relationship between selected well-being outcomes and mental distress.

Patient Health Questionnaire (PHQ-9 / PHQ-8)

The full Patient Health Questionnaire (PHQ) contains 59 questions, with modules focusing on mood, anxiety, alcohol, eating and somatoform disorders. The PHQ-9 is a nine-question survey designed to detect the presence and severity of depressive symptoms, and it maps directly onto the DSM-IV and DSM-5 symptom criteria for major depressive disorder. The PHQ-8 questionnaire removes the final question regarding suicidal ideation. While a one-factor structure for both the PHQ-8 and the PHQ-9 has been identified, more recent studies support a two-factor model composed of affective and somatic factors (Sunderland et al., 2019[16]). Both instruments have shown acceptable diagnostic screening properties across various population and clinical settings, age groups, and cultures/ethnicities, and are also reliable and valid measures of depression severity (Manea, Gilbody and McMillan, 2012[17]; Moriarty et al., 2015[18]; Kroenke et al., 2009[19]; Huang et al., 2006[20]; Kroenke, Spitzer and Williams, 2001[21]; Richardson et al., 2010[22]). The close alignment between the PHQ-8/9 and the DSM makes them subject to the same criticism as the DSM, including a potentially Western-focused construct of depression, relative to longer self-reported scales with less constrained symptom sets (Zimmerman et al., 2012[23]; Haroz et al., 2017[24]).

This report makes use of the PHQ-8, as it appears in the European Health Interview Survey (EHIS). For the figures in this report, those whose PHQ-8 score is greater than or equal to 10 are considered to be at risk for depression, and those whose score is less than 10 are not considered to be at risk. Other scoring conventions are possible (refer to the Notes section of Table 4 below).
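As a minimal sketch of the scoring rule above: the PHQ-8 total is conventionally the sum of eight items, each coded from 0 to 3, giving a range of 0 to 24. That item coding is the usual PHQ convention rather than something stated in this section, and the function names are hypothetical and not part of EHIS.

```python
PHQ8_ITEMS = 8
PHQ8_CUTOFF = 10  # score >= 10: at risk for depression (convention used in this report)


def phq8_score(items):
    """Sum eight PHQ-8 responses, each coded 0-3 (assumed standard coding)."""
    if len(items) != PHQ8_ITEMS:
        raise ValueError("PHQ-8 needs exactly 8 item responses")
    if any(r not in (0, 1, 2, 3) for r in items):
        raise ValueError("each PHQ-8 item must be coded 0, 1, 2 or 3")
    return sum(items)


def phq8_at_risk(items):
    """True when the total score reaches the cut-off of 10."""
    return phq8_score(items) >= PHQ8_CUTOFF
```

A respondent answering 1 on six items and 2 on two items has a total of 10 and would therefore be counted as at risk under this convention.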

WHO-5 Well-Being Index (WHO-5)

The World Health Organization Well-Being Index (WHO-5) is a short questionnaire of five questions that focus on a respondent’s positive affect. The questionnaire, adapted from the longer WHO/ICD-10 Depression Diagnosis and DSM-IV Depression scale by selecting a subset of positively phrased items, was first used in a project on well-being measures in primary health care by the WHO Regional Office for Europe in 1998 and has since been translated into more than 30 languages (WHO, 1998[25]; Topp et al., 2015[26]). The WHO-5 has been applied as a generic scale for well-being across a wide range of study fields and countries, as a sensitive screening tool for depression, and as an outcome measure in clinical trials (Topp et al., 2015[26]). Studies of younger and elderly persons indicated a unidimensional structure for this scale (Topp et al., 2015[26]).

This report makes use of the WHO-5, as it appears in the 2016 wave of the European Quality of Life Survey (EQLS). For the figures in this report, those whose WHO-5 score is less than or equal to 52 are considered to have poor psychological well-being, and those whose score is greater than 52 are considered to have good psychological well-being. This scoring convention has been used by a number of publications, including Eurofound (Sándor et al., 2021[27]; WHO Collaborating Center for Mental Health, 1998[28]); however, other scoring conventions are possible (see the notes to Table 5 below).
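The three cut-offs used for the figures point in different directions (low scores signal risk for the MHI-5 and WHO-5, high scores for the PHQ-8). A small lookup such as the one sketched below keeps that explicit. The dictionary keys and function name are illustrative only; the thresholds are those stated in this section.

```python
# (direction, cut-off) per tool, as stated in this section.
CUTOFFS = {
    "MHI-5": ("le", 52),  # score <= 52: at risk for mental distress
    "PHQ-8": ("ge", 10),  # score >= 10: at risk for depression
    "WHO-5": ("le", 52),  # score <= 52: poor psychological well-being
}


def is_at_risk(tool, score):
    """Apply the report's cut-off for the given tool to an already-computed score."""
    direction, cutoff = CUTOFFS[tool]
    if direction == "le":
        return score <= cutoff
    return score >= cutoff
```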

The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS)

The 14-item WEMWBS scale was developed with funding from NHS Health Scotland in 2005 to measure mental well-being (conceived of as “both feeling good and functioning well”), taking the Affectometer 2 instrument as a starting point (Warwick Medical School, 2021[29]). Some studies confirmed a unidimensional structure for WEMWBS, while others identified three residual factors relating to affective well-being, psychological functioning or eudaimonia, and social relationships (Shannon et al., 2020[30]; Koushede et al., 2019[31]). A shorter, 7-item version of the scale, the SWEMWBS, is also available and focuses slightly less on affect (Stewart-Brown et al., 2009[32]). (S)WEMWBS has been validated in various populations and among different subgroups, including adolescents, clinical samples and ethnic minority samples. It has been translated into more than 25 languages and validated in Norwegian, Swedish, Italian, Dutch, Danish, German, French and Spanish. Both scales have been shown to be sensitive to changes that occur in mental well-being promotion and mental illness treatment and prevention projects (Koushede et al., 2019[31]). Both instruments can distinguish mental well-being between subgroups, but the SWEMWBS has been found to be less sensitive to gender differences than the longer version (Koushede et al., 2019[31]; Ng Fat et al., 2017[33]).

The cross-lagged panel models included in this report use SWEMWBS as a measure of psychological flourishing (see section on the cross-lagged panel model below for further details).

Figures throughout this report show the share of the population in OECD countries who experience some form of a well-being deprivation, spanning the full range of dimensions of the OECD Well-being Framework, disaggregated by mental health outcome: the overall population, those at risk for poor mental health, and those not at risk for poor mental health. The mental health outcome categories vary depending on the type of mental health tool used (refer to the above section). Well-being deprivations are binary categories, for example: those who are unemployed, who live below the poverty line, who are lonely, who do not feel safe walking alone at night, who feel excluded from society, etc. The following tables in this section provide more information on how each deprivation is constructed, along with a brief introduction to the data sources.
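The disaggregation described above can be sketched as follows: each respondent carries a binary mental health risk flag and a binary deprivation flag, and the share deprived is computed for the overall population and for each group. This is an unweighted illustration with a hypothetical function name; any survey weights applied in the report's figures are deliberately left out.

```python
def deprivation_shares(records):
    """records: iterable of (at_risk, deprived) boolean pairs, one per respondent.

    Returns the share deprived overall and within each mental health group.
    Unweighted: survey weights are ignored in this sketch.
    """
    groups = {"overall": [], "at_risk": [], "not_at_risk": []}
    for at_risk, deprived in records:
        groups["overall"].append(deprived)
        groups["at_risk" if at_risk else "not_at_risk"].append(deprived)
    return {
        name: (sum(flags) / len(flags) if flags else None)
        for name, flags in groups.items()
    }


# Six made-up respondents: two at risk (one deprived), four not at risk (one deprived).
sample = [
    (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]
shares = deprivation_shares(sample)
```

In this toy sample the deprivation share is one half among those at risk and one quarter among those not at risk.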

The European Union Statistics on Income and Living Conditions (EU-SILC) survey has been conducted annually since 2003 in affiliated European countries. The core module primarily focuses on income, employment and housing outcomes; however, each year a rotating ad-hoc module provides a deep-dive on a specific topic. In 2013 and 2018, the ad-hoc module focused on well-being, and included the five questions comprising the Mental Health Inventory (MHI-5) screening tool for mental distress (see Table 3 above for more information).

The well-being deprivations (and in some cases, resilience factors, meaning above a threshold of well-being in a given dimension) described in Table 7 below all come from the 2018 EU-SILC survey, with the exception of those marked by an asterisk (*), which come from the 2013 survey wave. The ad-hoc well-being modules in 2018 and 2013 have overlapping but not identical question sets, so some indicators included in 2013 were not repeated five years later. Not all of the indicators in the table below come from the ad-hoc module; many of those relating to material conditions are included in the core module and are thus collected annually.

There are 26 OECD member states with EU-SILC data in this report: Austria, Belgium, the Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, the Netherlands, Norway, Poland, Portugal, the Slovak Republic, Slovenia, Spain, Sweden, Switzerland and the United Kingdom.

The European Health Interview Survey (EHIS) collects data on health status and determinants for citizens of EU countries. All of the figures in this report use data from the second wave of data collection in 2014.

EHIS includes the eight individual questions that make up the Patient Health Questionnaire (PHQ-8) to measure risk for major depressive disorder (refer to Table 4 for more information). Table 8 below lists the different well-being deprivations that are included in this report.

There are 22 OECD member states with EHIS data in this report: Austria, the Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Norway, Poland, Portugal, the Slovak Republic, Slovenia, Sweden and the United Kingdom.

The European Quality of Life Survey (EQLS) is, unlike EU-SILC and EHIS, not conducted by national statistical offices but administered by Eurofound to monitor quality of life indicators in Europe. The data used in this report come from the 2016 round of data collection. The survey includes the World Health Organization Well-Being Index (WHO-5), an affect-based measure of positive mental health (see Table 5 for more information). Table 9 below lists the well-being deprivations that are included in this report.

There are 24 OECD member states with EQLS data in this report: Austria, Belgium, the Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, the Netherlands, Poland, Portugal, the Slovak Republic, Slovenia, Spain, Sweden, Türkiye and the United Kingdom.

This report contains figures depicting cross-lagged panel models (CLPMs), which are used to examine the reciprocal relationships between different dimensions of well-being and mental health outcomes (Mayer, 1986[37]; Selig and Little, 2013[38]). CLPMs are commonly used in behavioural and psychological research to infer longitudinal relations between variables (Saeri, Cruwys and Sibley, 2018[39]; Yu et al., 2015[40]). The aim of these analyses is to offer a more holistic perspective on the relationships between well-being and mental health by evaluating bidirectional relationships between different dimensions of well-being and population mental health outcomes over time.

The CLPMs in this report use data from Understanding Society: The United Kingdom Household Longitudinal Study (UKHLS), which provides nationally representative, high-quality longitudinal panel data. UKHLS consists of a stratified and clustered general population sample (GPS) of randomly selected respondents from approximately 30 000 households. The first wave was conducted in 2009-11, with respondents re-interviewed annually (the latest wave for which data were available for this report is wave 10, 2019-21). Only 52% of the GPS sample were still participating after six years. Attrition was reported to be greatest among the youngest age groups, men, the Black population, people on lower incomes and those in the Greater London area. No strong association was found between the attrition rate and health status for the GPS sample (Lynn and Borkowska, 2018[41]). Data collection is conducted face-to-face via computer-aided personal interviews with additional self-completion instruments. Comprehensive descriptions of the techniques and methodology used are published elsewhere (Berthoud et al., 2009[42]), as are sampling methodologies (ISER, n.d.[43]).

Different balanced panel sub-sample datasets were used to analyse the bidirectional relationships between mental health and different well-being indicators, depending on data availability for each pair of outcomes considered. That is, the maximum number of survey waves available for each mental health outcome and well-being indicator was used, resulting in different sample sizes for different models.
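To illustrate why the sample size differs from model to model, the sketch below builds a balanced panel by keeping only individuals with complete data in every wave a given model uses. The data layout and function name are assumptions made for this example and do not reflect the actual UKHLS file structure.

```python
def balanced_panel(observations, waves):
    """Keep only individuals observed, without missing values, in every wave.

    observations: dict mapping (person_id, wave) -> (mh, wb), where None in
    either position marks a missing value.
    """
    required = set(waves)
    complete = {}
    for (pid, wave), (mh, wb) in observations.items():
        if wave in required and mh is not None and wb is not None:
            complete.setdefault(pid, set()).add(wave)
    keep = {pid for pid, seen in complete.items() if seen == required}
    return {
        key: value
        for key, value in observations.items()
        if key[0] in keep and key[1] in required
    }


# Made-up data: person 2 has a missing value in wave 2, person 3 drops out.
obs = {
    (1, 1): (3, 0), (1, 2): (4, 1),
    (2, 1): (5, 0), (2, 2): (None, 1),
    (3, 1): (2, 0),
}
two_wave = balanced_panel(obs, [1, 2])
one_wave = balanced_panel(obs, [1])
```

Requiring both waves retains only person 1, whereas requiring wave 1 alone retains all three people, which is the mechanism behind the varying sample sizes.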

Two contrasting indicators of mental health were examined to better understand the potential differences in well-being relationships to positive mental health vs. mental ill-health, both measured on a continuous scale. The General Health Questionnaire 12 (GHQ-12) is used to measure mental distress, ranging from 0 (the least distressed) to 12 (the most distressed). The short Warwick-Edinburgh Mental Well-Being Scale (SWEMWBS) captures positive mental health, with scores ranging from 7 (worst mental health outcome) to 35 (best possible outcome). Data for the GHQ-12 are available across all 10 waves, while data for the SWEMWBS are only available in waves 1, 4, 7 and 10.
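The score ranges above can be reproduced with the sketch below. This section states only the ranges, so the item coding is an assumption based on common scoring conventions: four response categories per GHQ-12 item, coded 1 to 4 and recoded 0-0-1-1 before summing, and five response categories per SWEMWBS item, coded 1 to 5 and summed. The function names are hypothetical.

```python
def ghq12_caseness(items):
    """Score 12 items coded 1-4: responses 1-2 count 0, responses 3-4 count 1."""
    if len(items) != 12 or any(r not in (1, 2, 3, 4) for r in items):
        raise ValueError("expected 12 responses coded 1-4")
    return sum(1 for r in items if r >= 3)


def swemwbs_raw(items):
    """Sum 7 items coded 1-5, giving a raw score between 7 and 35."""
    if len(items) != 7 or any(r not in (1, 2, 3, 4, 5) for r in items):
        raise ValueError("expected 7 responses coded 1-5")
    return sum(items)
```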

CLPMs estimate both the stability of mental health and of a well-being indicator over time and the reciprocal temporal association between the two, allowing for autoregressive and cross-lagged pathways. The models estimate the effect of one variable on the other, while controlling for levels of the outcome variable in the previous wave (“stability” effect). A simultaneous equation model that allows for autoregressive effects and cross-lagged effects between mental health outcomes ($MH_{it}$) and well-being indicators ($WB_{it}$) at each measurement point ($t = 1, \dots, 10$) may be written as:

$$MH_{it} = \alpha_t MH_{i,t-1} + \beta_t WB_{i,t-1} + X_i + \upsilon_{i,t} \tag{1}$$

$$WB_{it} = \delta_t WB_{i,t-1} + \gamma_t MH_{i,t-1} + X_i + \nu_{i,t} \tag{2}$$

where $t$ represents an occasion or survey wave, $i$ represents an individual, $\alpha_t$ and $\delta_t$ are autoregressive parameters, $\beta_t$ and $\gamma_t$ are cross-lagged parameters, $MH_{i,t-1}$ and $WB_{i,t-1}$ are the lags of one time unit for mental health and well-being outcomes, respectively, $X_i$ is a vector of control variables that vary over individuals but not over time, and $\upsilon_{i,t}$ and $\nu_{i,t}$ are the residuals (assumed to be normally distributed and correlated).
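To make equations (1) and (2) concrete, the sketch below simulates data from a stationary version of the model and recovers the four parameters with pooled ordinary least squares. It is a simplification rather than the estimation used in this report: the controls are omitted, the residuals are drawn independently rather than allowed to correlate, and OLS stands in for simultaneous estimation in a structural equation framework. All function names are hypothetical.

```python
import random


def simulate(n, waves, alpha, beta, delta, gamma, seed=0):
    """Generate (mh, wb) trajectories following equations (1) and (2), without controls."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        mh, wb = rng.gauss(0, 1), rng.gauss(0, 1)
        path = [(mh, wb)]
        for _ in range(waves - 1):
            # Both right-hand sides use the previous wave's values.
            mh, wb = (
                alpha * mh + beta * wb + rng.gauss(0, 1),
                delta * wb + gamma * mh + rng.gauss(0, 1),
            )
            path.append((mh, wb))
        panel.append(path)
    return panel


def ols2(y, x1, x2):
    """Least-squares slopes of y on x1 and x2 (no intercept; variables are mean zero)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def fit_clpm(panel):
    """Pool all consecutive wave pairs and estimate each equation by OLS."""
    mh_lag, wb_lag, mh_now, wb_now = [], [], [], []
    for path in panel:
        for (mh0, wb0), (mh1, wb1) in zip(path, path[1:]):
            mh_lag.append(mh0)
            wb_lag.append(wb0)
            mh_now.append(mh1)
            wb_now.append(wb1)
    alpha, beta = ols2(mh_now, mh_lag, wb_lag)    # equation (1)
    gamma, delta = ols2(wb_now, mh_lag, wb_lag)   # equation (2)
    return {"alpha": alpha, "beta": beta, "delta": delta, "gamma": gamma}


estimates = fit_clpm(simulate(2000, 4, alpha=0.5, beta=0.2, delta=0.6, gamma=0.1, seed=1))
```

With a few thousand simulated individuals, the estimates should lie close to the generating values, which shows how the autoregressive and cross-lagged paths are separately identified from lagged data.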

The autoregressive parameters are included to account for the stability of the constructs over time: the closer these autoregressive parameters are to one, the more stable the rank order of individuals is from one occasion to the next. Applied to a mental health outcome, the closer the mental health autoregressive parameter is to one, the more previous mental health outcomes influence current mental health. The cross-lagged parameters capture reciprocal causal effects between well-being and mental health outcomes in this model. The relative effects of the mental health and well-being variables on each other can be directly compared by standardising $\beta_t$ and $\gamma_t$. The standardised coefficients can then be used to determine causal predominance, for example by comparing the impact of previous experience of mental ill-health on a current well-being outcome (say, unemployment) with the impact of previous unemployment on current mental ill-health.
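One common way to standardise a regression coefficient is to multiply it by the ratio of the predictor's standard deviation to the outcome's standard deviation. The sketch below applies that rule and then compares the two cross-lagged paths by absolute size. It is an illustration under that assumption, with hypothetical function names, and not a description of how the estimation software produces its standardised output.

```python
from statistics import pstdev


def standardised(coef, predictor, outcome):
    """Rescale a coefficient to standard-deviation units of predictor and outcome."""
    return coef * pstdev(predictor) / pstdev(outcome)


def predominant_path(beta_std, gamma_std):
    """Name the cross-lagged path with the larger standardised magnitude."""
    if abs(beta_std) > abs(gamma_std):
        return "well-being -> mental health"
    if abs(gamma_std) > abs(beta_std):
        return "mental health -> well-being"
    return "equal"
```

Because the rescaling depends on the standard deviations of the variables involved, the same unstandardised coefficient can yield different standardised values in different samples or waves.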

The models shown in this report are the simple CLPMs described above, with standardised coefficients for ease of interpretation. However, CLPMs are estimated by aggregating between- and within-person variance. As a robustness check, random-intercept cross-lagged panel models (RI-CLPMs), which isolate within-person from between-person variation, were therefore also estimated for models with continuous dependent variables. The findings were broadly in line with those of the CLPMs and are thus not shown in this report.
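For reference, a common way of writing the random-intercept extension is sketched below. The notation is assumed for this illustration and is not taken from the report: $\mu_t$ and $\pi_t$ are wave-specific means, $\kappa_i$ and $\omega_i$ are person-specific random intercepts that absorb stable between-person differences, and the starred terms are within-person deviations, to which the autoregressive and cross-lagged paths now apply. Control variables are omitted for brevity.

```latex
\begin{aligned}
MH_{it} &= \mu_t + \kappa_i + MH^{*}_{it}, &
WB_{it} &= \pi_t + \omega_i + WB^{*}_{it}, \\
MH^{*}_{it} &= \alpha_t MH^{*}_{i,t-1} + \beta_t WB^{*}_{i,t-1} + u_{it}, &
WB^{*}_{it} &= \delta_t WB^{*}_{i,t-1} + \gamma_t MH^{*}_{i,t-1} + v_{it}.
\end{aligned}
```

In this form the cross-lagged parameters describe how a person's deviation from their own usual level on one variable relates to their later deviation on the other.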

The two equations of the CLPM are estimated simultaneously and are adjusted for the complex sample design of UKHLS data including weighting, clustering and stratification at household level (residential addresses) (Lynn, 2009[44]).

First, unconstrained (non-stationary) models were run, allowing the magnitude of the autoregressive and cross-lagged effects to vary over survey waves. These were followed by stationary models, which constrain the cross-lagged and autoregressive paths to be equal across survey waves. This report presents results from the stationary CLPMs for simplicity, and because the improvement in model fit between the stationary and non-stationary models was often minimal.

In models with continuous dependent variables (i.e., household income, which is not shown in this report), full-information maximum likelihood (FIML) was used to reduce potential bias introduced by missing data. FIML is a more efficient way of dealing with missing data than listwise deletion, pairwise deletion or similar response pattern imputation (Enders and Bandalos, 2001[45]). In models with binary dependent variables (i.e., respondent smokes tobacco; respondent is unemployed), the weighted least squares mean and variance adjusted (WLSMV) estimator was used.

The relative strengths of cross-lagged relationships are compared using standardised coefficients. In stationary models, the unstandardised coefficients are constrained to be equal over time, but standardised coefficients vary across waves. Consequently, when presenting results, we report standardised coefficients for the effects between the first two waves of data used.

Model fit was assessed by the Comparative Fit Index (CFI) and the Root-Mean-Square Error of Approximation (RMSEA). Fit is considered acceptable if CFI ≥ 0.90 and RMSEA ≤ 0.08, and good if CFI ≥ 0.95 and RMSEA ≤ 0.06 (Brown, 2015[46]; Hu and Bentler, 2009[47]; Chen et al., 2008[48]). All analyses were performed using Mplus and the R “MplusAutomation” package (Hallquist and Wiley, 2018[49]).
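The fit thresholds above translate directly into a small classification rule, sketched below. The function name and the label "poor" for models meeting neither set of thresholds are illustrative choices rather than terms used in this report.

```python
def fit_quality(cfi, rmsea):
    """Classify model fit using the CFI and RMSEA thresholds stated above."""
    if cfi >= 0.95 and rmsea <= 0.06:
        return "good"
    if cfi >= 0.90 and rmsea <= 0.08:
        return "acceptable"
    return "poor"
```

Both criteria must hold for a label to apply, so a model with CFI of 0.96 but RMSEA of 0.07 is classed as acceptable rather than good.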

As with any estimation, CLPMs require a number of assumptions in order to infer causality. For instance, it is assumed that time-invariant covariates have a constant effect across time, which may not necessarily be the case (Mund, Johnson and Nestler, 2021[50]). Additionally, neither model controls for time-varying variables. A few additional points to be mindful of when interpreting or comparing results from the models in this report:

  • Differences in the strength of cross-lagged effects across mental health outcomes could arise because the outcomes measure different underlying constructs (for example, SWEMWBS also contains questions related to general life satisfaction and social connectedness).

  • It may not be surprising to see attenuated effects or less/non-significant effects when using SWEMWBS as the mental health outcome since SWEMWBS data are only available every three years, meaning the model estimates the effects of three-year rather than one-year lags.
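The second point can be illustrated with a small calculation. If a stationary first-order model held at yearly intervals, the coefficients implied for a three-year lag would be the third power of the one-year coefficient matrix. The sketch below computes this for made-up parameter values; the values, the matrix layout and the function names are assumptions for illustration, and the result applies to unstandardised coefficients in this simplified setting only.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def implied_lag(coefs, years):
    """Raise the one-year coefficient matrix to the given number of years.

    coefs = [[alpha, beta], [gamma, delta]]: rows are the outcomes (MH, WB),
    columns are the lagged predictors (MH, WB).
    """
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(years):
        result = matmul(result, coefs)
    return result


one_year = [[0.5, 0.2], [0.1, 0.5]]
three_year = implied_lag(one_year, 3)
```

With these values the cross-lagged coefficients fall from 0.2 and 0.1 at a one-year lag to roughly 0.154 and 0.077 at a three-year lag. The direction and size of the change depend on the stabilities chosen, so this is an example of the mechanism and not a general rule.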

Metadata, Legal and Rights

This document, as well as any data and map included herein, are without prejudice to the status of or sovereignty over any territory, to the delimitation of international frontiers and boundaries and to the name of any territory, city or area. Extracts from publications may be subject to additional disclaimers, which are set out in the complete version of the publication, available at the link provided.

© OECD 2023

The use of this work, whether digital or print, is governed by the Terms and Conditions to be found at https://www.oecd.org/termsandconditions.