3. CLA+ International

Doris Zahner
Council for Aid to Education
United States
Kelly Rotholz
Council for Aid to Education
United States

Using the lessons learnt from the Assessment of Higher Education Learning Outcomes (AHELO) feasibility study (Dias and Amaral, 2014[1]; Ewell, 2012[2]; Lalancette, 2013[3]; Tremblay, 2013[4]; Wolf and Zahner, 2016[5]; Wolf, Zahner and Benjamin, 2015[6]), the Council for Aid to Education (CAE) and the OECD executed a Memorandum of Understanding (MOU), which allowed the two organisations to continue collaborating on assessing higher education students’ generic skills. The intent of the MOU was for the collaboration between the OECD Directorate for Education and Skills and CAE to enable tertiary education institutions and jurisdictions to develop and implement innovative performance-based assessments to measure the generic skills of higher education students (Table 3.1). The collaboration included activities such as:

  • marketing and recruiting

  • translations and adaptations of CLA+

  • test administration

  • scoring training

  • scoring and reporting

  • international benchmarking

  • publications and presentations.

The MOU stipulated that no financial transactions would take place between CAE and the OECD during its execution. Financial arrangements were concluded bilaterally between the participating jurisdictions and CAE, without any involvement of the OECD. The limited expenses at the OECD, mainly to compensate for staff time, were covered by grants from countries that had participated in the AHELO feasibility study.

Two approaches were used for recruiting participants in CLA+ International: top-down and bottom-up. For the top-down strategy, CAE collaborated with the OECD, as outlined in the MOU, for ministry-level participation. The OECD approached representatives from potentially interested OECD member countries and partners for jurisdiction participation. The OECD also hosted a series of conferences that convened all participants in the initiative as well as interested jurisdictions. A total of six conferences were convened (Table 3.2).

For the bottom-up strategy, CAE engaged CAE Fellows to recruit individual institutions or regional organisations and consortia. CAE Fellows, selected for their expertise in the fields of education and assessment, assist CAE by introducing and supporting CLA+ in their regions. Their responsibilities include recruiting institutions, presenting at regional conferences, and reporting regional updates during project meetings. CAE staff and leadership are also an integral part of recruitment and presentations (Table 3.3).

The first collaboration, starting in 2013, was between CAE and the Italian National Agency for the Evaluation of Universities and Research Institutes (ANVUR). ANVUR’s decision to participate in CLA+ International was a direct result of AHELO: Italy had participated only in the engineering strand of the feasibility study and was interested in assessing the generic skills of its students.

This was followed by student learning gain studies with the University of Guadalajara (23 campuses); a consortium of four “post-1992” (Hannah, 1996[7]) institutions and a public university in the United Kingdom; several individual institutions across Chile; and 18 universities and universities of applied sciences in a study sponsored by the Ministry of Education and Culture of Finland. Most recently, a large, private university system in Mexico with six campuses joined the initiative with the intention of assessing over 2 200 students, focusing on measuring student learning gains both within individual campuses and fields of study and across the entire system.

The purposes of individual countries and institutions participating in CLA+ International vary from needing individual students assessed in order to meet university graduation requirements to understanding the level and quality of students’ generic skills within the higher education system. Individual chapters in Part III of this manuscript provide detailed information on each participating country or region.

Prior studies of CLA+ International have been published about individual countries or institutions (Shek et al., 2016[8]; Zahner and Ciolfi, 2018[9]; Zahner and Kostoris, 2016[10]; Zahner et al., 2020[11]; Zlatkin-Troitschanskaia et al., 2018[12]). However, this is the first research study to aggregate all CLA+ International data and report results within and across countries and regions. Results from individual countries, regions and institutions can be found in Part III of this manuscript. Perspectives on CLA+ International and potential future projects using it can be found in Part IV.

Since CLA+ was an existing valid and reliable instrument (Zahner, 2013[13]; see Chapter 3) that had already been developed for students in the United States, no new versions of the assessment were created for CLA+ International. Rather, Performance Tasks (PTs) and Selected-Response Questions (SRQs) were selected by a committee of CAE measurement scientists and international higher education educators and administrators to be translated, adapted and administered internationally.

Three PTs and three sets of SRQs were selected for international use. These PTs and SRQs were chosen because their topics were relevant and relatable across multiple cultures and contexts, and because they performed well operationally when used in the United States. Ministries or large consortia that participated in CLA+ International had the opportunity to select the PT and sets of SRQs to be used for their international participation. CAE presented the three options to them and recommended a single set if the group was interested in internationally comparable results. All consortia opted to use the set that allowed for internationally comparable data.

In the case of Latin America, CAE worked with individual institutions directly to deliver CLA+ International. In this circumstance, CAE selected the items to be used and oversaw the process of translation and adaptation.

The translation and adaptation process was led by CAE and its translation partner cApStAn in collaboration with country team members, following industry best practices (Geisinger, 1994[14]; Hambleton and Li, 2005[15]). Translating and adapting a performance assessment is a more complex process than simple word-for-word replacement from one language to another. At CAE, translation and adaptation experts ensure that the translated and adapted assessments are consistent with the original version in the source language and, just as importantly, will be interpreted by students in their native language as intended. CAE’s experts confirm that the assessment topics possess the same authenticity, context and meaning for the target student population as they do for the student population for which the tasks were originally developed. CAE uses an internationally accepted five-step translation process in compliance with International Test Commission (ITC) guidelines, the same guidelines used for the localisation process of major international studies such as the Programme for International Student Assessment (PISA), Trends in International Mathematics and Science Study (TIMSS), Progress in International Reading Literacy Study (PIRLS), Programme for the International Assessment of Adult Competencies (PIAAC), and AHELO. This process includes:

  1. Translatability review: Source material is reviewed to confirm that the text will adapt well into member languages and cultures. Particular attention is paid to disambiguation of the source text, respecting key correspondences between stimuli and questions, and determining what should or should not be adapted to the local context.

  2. Double translation and reconciliation: Two translators independently review the text and provide translations. The translations are reconciled and sent to country teams for review.

  3. Member team review: Members are provided with an opportunity to review the translated work and provide input and recommendations.

  4. Focused verification: The verification process ensures that the translated and adapted assessment is consistent with the context and intent of the original assessment.

  5. Cognitive labs, as appropriate: With CAE's guidance, members conduct cognitive labs with a small sample of student participants to ensure that the translated and adapted assessment is clear and consistent with the context and intent of the original assessment.

CAE followed this process for the translations and adaptations used in Italy, Finland and Mexico. For the first Spanish translation and adaptation of CLA+ International for Spanish-speaking countries beyond Mexico, a modified adaptation process was used, given the short timeline for this first administration: cApStAn reviewed the translation completed for the Mexican Spanish test content and recommended adaptations regionally appropriate for use in Chile. These recommendations were reviewed and approved by the measurement science team at CAE.

Following best practices in translating and adapting assessments (Geisinger, 1994[14]; Hambleton and Li, 2005[15]), cognitive labs (Leighton, 2017[16]; Zucker, Sassman and Case, 2004[17]) were recommended to improve and verify the quality of translated and adapted assessments. Additionally, the cognitive labs were used to confirm that the cognitive processes and reasoning elicited by the translated and adapted CLA+ International assessment were consistent with and aligned to the constructs measured by CLA+. More specifically, the cognitive labs were intended to ensure that the translation and adaptation of CLA+ from English into additional languages: 1) did not alter the constructs measured; 2) was interpreted by the participants in the ways originally intended; and 3) was not more difficult for the country’s participants to read and understand than it would have been had the tasks been originally written in the participant’s native language.

In-country project staff assigned an interviewer to conduct the cognitive labs with voluntary participants using the printed version of the assessment. If deemed necessary, a revised version of each translated and adapted assessment was prepared based on information taken from the cognitive labs.

The CLA+ International cognitive labs were carried out in three stages:

  1. Training, in which the interviewer explained the purpose of the cognitive lab and trained participants to think aloud with small tasks.

  2. Think-aloud, in which participants provided concurrent verbal reports of their thinking as they engaged in the task. During the think-aloud, the interviewer took notes about reasoning processes as well as potential translation and/or adaptation issues.

  3. Follow-up interview, in which the interviewer asked scripted questions with the intent of eliciting additional information on the clarity of the translation and adaptation of the assessment, translation and adaptation issues, and participants’ strategies for coming up with their answer or solution.

In addition to the interviewer’s note-taking, the cognitive lab for each participant was audio recorded. Project staff working in teams listened to the recordings to identify potential unintended challenges that may have resulted from the translation and adaptation. Based on the analysis of the participants’ think-aloud protocols and responses to follow-up questions, the project staff identified ways in which the translation and adaptation of the assessment needed improvement. These findings were shared with CAE, and all necessary adjustments and edits were implemented.

Following the translation, adaptation and cognitive lab processes, CAE worked with all participating institutions to finalise the assessment prior to test administration. CAE provided guidance to participating members on improving student recruitment efforts, proctor training and test day administration preparations. CAE also provided technical support before, during and after test administration. The main steps were as follows:

  • CAE’s secure, scalable online test platform was translated into the appropriate language.

  • All test materials and scoring rubrics were translated into the appropriate language.

  • Administrative instructions and guides were provided to member teams.

  • Cognitive labs were performed with CAE oversight.

  • Exemplary best practices, communications materials, training, and logistics guidance were provided.

  • Student recruitment was carried out with CAE support.

  • Tests were administered using a secure online testing platform.

  • Training of Lead Scorers was carried out by CAE.

  • Member scorers scored responses using CAE’s secure online scoring platform.

  • CAE review and analysis of data was followed by preparation of member reports.

  • Individual student reports and secure badges/certificates were prepared and distributed by CAE and/or the member.

CLA+ International is administered through an Internet-based testing platform. Test-takers enter the exam through a secure browser that locks down unnecessary computer functions; the platform then delivers a 60-minute PT and a 30-minute, 25-item SRQ section to each student.

  • The PT asks students to craft a written response to an open-ended question about a hypothetical but realistic scenario using a library of relevant documents (Document Library).

  • The SRQs ask students to choose the best response to questions in the categories of Scientific and Quantitative Literacy, Critical Reading and Evaluation, and Critiquing an Argument.

All testing sessions require a proctor to authorise students into the interface and manage the testing environment. The assessment is designed to be completed in approximately 90 minutes. At the beginning of the testing session, there is an optional tutorial that students can scroll through or bypass if they so choose. The assessment requires standardised administration to ensure consistent testing conditions for all students. CAE provides training materials for Institutional Administrators and Proctors.

For CLA+ International, all student responses were double-scored by human scorers fluent in the native language of the student. The training for the scoring process was directed by the CAE Measurement Science team and started with a group training for all Lead Scorers from all participating members and institutions. Training was most often conducted in-country, as a two-day, in-person session.

CAE recommended appointing a Lead Scorer and an Assistant Lead (or Co-Lead) Scorer to attend this training to better distribute the information and responsibilities that followed. In-person training was conducted for participants in the UK, Italy, Mexico, and Finland, although colleagues in Mexico opted for CAE to oversee the scoring process for their participation. The training occurred once per test administration and was conducted in English, using exemplary responses from American students. Through this rigorous training, the Lead Scorer and Assistant Lead Scorer became part of the CLA+ International scorer team.

The scoring training for the PTs included:

  • an orientation to the prompts and scoring rubrics/guides

  • repeated practice grading a wide range of student responses

  • extensive feedback and discussion after scoring each response.

Following this training, CAE team members acted as a resource for the Lead Scorer and Assistant Lead Scorer, who were responsible for recruiting and training the member’s team of scorers. This ensured quality and consistency both within and across countries. Scorers were recruited from participating institutions based on their ability to judge university students’ generic skills; institutions often appointed professors, institutional research fellows, post-doctoral associates or doctoral students to score the student responses. Scorers were often remunerated by the participating member universities, at their discretion. CAE scorers for the Spanish student responses were remunerated at the same rate as the scorers who worked on the American English student responses.

All scoring took place and was monitored on CAE’s secure platform. Trained scorers received a randomised selection of anonymised student responses within the relevant language and entered their score results directly into CAE’s Internet-based scoring platform. The scorers did not know the institution to which each student belonged. CAE’s system automatically monitored human scorer calibration and inter-rater reliability and notified the Lead Scorers of any scorers who were not appropriately calibrated.
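The chapter does not specify the statistic behind this automated monitoring. As a minimal sketch, the Python example below assumes calibration is judged by an exact-or-adjacent agreement rate on double-scored responses; the function name, thresholds and data layout are illustrative assumptions, not CAE's implementation.

```python
from collections import defaultdict

def flag_uncalibrated_scorers(score_pairs, min_pairs=20, min_agreement=0.80):
    """Flag scorers whose exact-or-adjacent agreement rate with the second
    scorer falls below min_agreement. Thresholds are illustrative only.

    score_pairs: iterable of (scorer_id, own_score, other_score) tuples,
                 where scores are integer rubric points (e.g. 1-6).
    """
    agreed = defaultdict(int)
    total = defaultdict(int)
    for scorer_id, own, other in score_pairs:
        total[scorer_id] += 1
        if abs(own - other) <= 1:  # exact or adjacent agreement
            agreed[scorer_id] += 1
    # Report only scorers with enough double-scored responses to judge reliably.
    return [
        (scorer_id, agreed[scorer_id] / n)
        for scorer_id, n in total.items()
        if n >= min_pairs and agreed[scorer_id] / n < min_agreement
    ]

# Example: scorer "S2" deviates by 2+ rubric points on both responses.
pairs = [("S1", 4, 4), ("S1", 5, 4), ("S2", 2, 5), ("S2", 6, 3)]
print(flag_uncalibrated_scorers(pairs, min_pairs=2))  # [('S2', 0.0)]
```

In an operational system, the flagged list would drive the notifications sent to Lead Scorers described above.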

A calibration verification system was developed to improve and streamline scoring. Calibration through the system required scorers to score previously scored responses, or “Verification Papers”, both when they first started scoring and at intervals throughout the scoring window. The system periodically presented Verification Papers to scorers in lieu of student responses, without flagging them as such. The system did not indicate when a scorer had successfully scored a Verification Paper, but if a scorer failed to accurately score a series of Verification Papers, that scorer was removed from scoring and entered a remediation process, at which point they were either further coached or removed from scoring altogether.
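As an illustration of how such a workflow could operate, the following sketch interleaves pre-scored Verification Papers into a scorer's queue and tracks a streak of inaccurate scores. The insertion rate, score tolerance and failure threshold are assumptions chosen for the example, not CAE's actual configuration.

```python
import random

class CalibrationMonitor:
    """Sketch of a Verification Paper workflow; all parameters are
    illustrative assumptions, not CAE's configuration."""

    def __init__(self, verification_papers, insert_rate=0.1,
                 tolerance=1, max_run_of_failures=3):
        self.papers = verification_papers   # list of (response_text, true_score)
        self.insert_rate = insert_rate      # share of queue items that are checks
        self.tolerance = tolerance          # allowed deviation in rubric points
        self.max_run = max_run_of_failures
        self.run_of_failures = 0

    def maybe_serve_verification_paper(self):
        """Occasionally return a pre-scored paper to show in lieu of a live
        student response; the scorer is never told which is which."""
        if random.random() < self.insert_rate:
            return random.choice(self.papers)
        return None

    def record(self, true_score, given_score):
        """Track a streak of inaccurate scores on Verification Papers and
        return True when the scorer must be pulled for remediation."""
        if abs(given_score - true_score) > self.tolerance:
            self.run_of_failures += 1
        else:
            self.run_of_failures = 0
        return self.run_of_failures >= self.max_run

# Example: three successive misses on Verification Papers trigger remediation.
monitor = CalibrationMonitor([("paper A", 4)])
for given_score in (1, 6, 1):
    pulled = monitor.record(true_score=4, given_score=given_score)
print(pulled)  # True
```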

In order to provide students with scale scores, CAE converted the raw scores to scale scores using a procedure called equating. The purpose of equating is to establish a common scale of measurement: it permits comparisons of student groups across time, regardless of the sets of items administered. The equating procedure CAE used was a linear transformation, which yields a set of equating constants that convert raw scores to scale scores for a PT or a given set of SRQs. Details of the procedure are described in the Appendix of Chapter 3; the same steps were followed for the domestic and international student groups.
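While the operational details are given in the appendix, a minimal sketch can illustrate how linear equating works in general. The Python example below assumes the goal is to give raw scores the mean and standard deviation of a reference scale; the reference values, data and function name are hypothetical, not CAE's actual constants.

```python
from statistics import mean, stdev

def linear_equating_constants(raw_scores, ref_mean, ref_sd):
    """Return (a, b) such that scale = a * raw + b gives the raw-score
    distribution the reference scale's mean and standard deviation.
    Hypothetical reference values; the operational constants come from
    the procedure described in the Appendix of Chapter 3."""
    a = ref_sd / stdev(raw_scores)
    b = ref_mean - a * mean(raw_scores)
    return a, b

# Made-up example: map raw PT scores onto a scale with mean 1100, SD 150.
raw = [8, 10, 12, 14, 16]
a, b = linear_equating_constants(raw, ref_mean=1100, ref_sd=150)
print(round(a * 12 + b))  # the mean raw score (12) maps to 1100
```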

Each country or association received its own set of reports and data files, including a report showing how its participating universities performed overall as a group. Additionally, individual university reports were prepared for each participating university. CAE provided all members with comparative information drawn from CAE’s domestic national data from the United States. Finally, CAE prepared individual student reports for all participating students and issued badges to those who achieved a proficient, accomplished or advanced level.

Part II of this manuscript offers detailed insight into the combined CLA+ International data set.

References

[4] Blömeke, S. (ed.) (2013), “OECD Assessment of Higher Education Learning Outcomes (AHELO): Rationale, challenges and initial insights from the feasibility study”, in Modeling and Measuring Competencies in Higher Education: Tasks and Challenges, pp. 113-126.

[1] Dias, D. and A. Amaral (2014), “Assessment of Higher Education Learning Outcomes (AHELO): An OECD Feasibility Study”, pp. 66-87, https://doi.org/10.1057/9781137374639_5.

[2] Ewell, P. (2012), “A World of Assessment: OECD’s AHELO Initiative”, Change: The Magazine of Higher Learning, Vol. 44/5, pp. 35-42, https://doi.org/10.1080/00091383.2012.706515.

[15] Frisby, C. and C. Reynolds (eds.) (2005), Translation and Adaptation Issues and Methods for Educational and Psychological Tests, John Wiley and Sons, Hoboken, NJ.

[14] Geisinger, K. (1994), “Cross-Cultural Normative Assessment: Translation and Adaptation Issues Influencing the Normative Interpretation of Assessment Instruments”, Psychological Assessment, Vol. 6/4, p. 304, https://doi.org/10.1037/1040-3590.6.4.304.

[7] Hannah, S. (1996), “The higher education act of 1992: Skills, constraints, and the politics of higher education”, Journal of Higher Education, Vol. 67/5, pp. 498-527, https://doi.org/10.1080/00221546.1996.11780274.

[3] Lalancette, D. (2013), OECD assessment of higher education learning outcomes (AHELO): Rationale, challenges and initial insights from the feasibility study, McGill-Queen’s University Press, Montreal, https://doi.org/10.1007/978-94-6091-867-4.

[16] Leighton, J. (2017), Using Think-Aloud Interviews and Cognitive Labs in Educational Research, Oxford University Press, Oxford, https://doi.org/10.1093/acprof:oso/9780199372904.001.0001.

[5] Rosen, Y., S. Ferrara and M. Mosharraf (eds.) (2016), Mitigation of Test Bias in International, Cross-National Assessments of Higher-Order Thinking Skills, IGI Global, Hershey, PA, https://doi.org/10.4018/978-1-4666-9441-5.ch018.

[8] Shek, D. et al. (2016), “Assessing learning gains of university students in Hong Kong adopting the Collegiate Learning Assessment Plus (CLA+)”, International Journal on Disability and Human Development, Vol. 15/3, p. 331, https://doi.org/10.1515/ijdhd-2015-6001.

[6] Wolf, R., D. Zahner and R. Benjamin (2015), “Methodological challenges in international comparative post-secondary assessment programs: lessons learned and the road ahead”, Studies in Higher Education, Vol. 40/3, pp. 471-481, https://doi.org/10.1080/03075079.2015.1004239.

[13] Zahner, D. (2013), Reliability and Validity – CLA+, Council for Aid to Education, New York, NY.

[9] Zahner, D. and A. Ciolfi (2018), “International Comparison of a Performance-Based Assessment in Higher Education”, in Zlatkin-Troitschanskaia, O. et al. (eds.), Assessment of Learning Outcomes in Higher Education: Cross-National Comparisons and Perspectives, Springer, New York, https://doi.org/10.1007/978-3-319-74338-7_11.

[10] Zahner, D. and F. Kostoris (2016), International Testing of a Performance-Based Assessment in Higher Education, Council for Aid to Education, Washington, DC.

[11] Zahner, D. et al. (2020), “Measuring the generic skills of higher education students and graduates: Implementation of CLA+ International”, in Assessing Undergraduate Learning in Psychology: Strategies for Measuring and Improving Student Performance, American Psychological Association, Washington, DC, https://doi.org/10.1037/0000183-015.

[12] Zlatkin-Troitschanskaia, O. et al. (2018), “Adapting and Validating the Collegiate Learning Assessment to Measure Generic Academic Skills of Students in Germany: Implications for International Assessment Studies in Higher Education”, in Assessment of Learning Outcomes in Higher Education: Cross-National Comparisons and Perspectives, Springer, Cham, https://doi.org/10.1007/978-3-319-74338-7_12.

[17] Zucker, S., C. Sassman and B. Case (2004), Cognitive Labs, Harcourt Assessment, San Antonio, TX.
