Annex A. Technical notes on TALIS-PISA link data

The objective of the Teaching and Learning International Survey (TALIS) and Programme for International Student Assessment (PISA) link, referred to as the TALIS-PISA link in this report (see Note 1), was to obtain, in each participating country and economy, representative samples of 15-year-old students, teachers teaching 15-year-old students and schools with students of this age that can be linked and analysed together. Thus, in each country and economy that opted to participate in the TALIS-PISA link option of TALIS 2018 (hereafter “TALIS-PISA link 2018”), a random sub-sample was drawn from the PISA sample of schools. The international sampling plan prepared for TALIS-PISA link 2018 and PISA 2018 used a stratified two-stage probability sampling design. This means that teachers and students (second stage units, or secondary sampling units) were to be randomly selected from the list of in-scope teachers and students in each of the randomly selected schools (first stage units, or primary sampling units). A more detailed description of the survey design and its implementation can be found in the TALIS 2018 Technical Report (OECD, 2019[1]) and the PISA 2018 Technical Report (OECD, 2020[2]).

The international target population of TALIS-PISA link 2018 restricts the survey to those teachers and principals who work in schools surveyed by PISA that provide instruction for 15-year-old students. Only teachers who teach regular classes to PISA-eligible students in ordinary schools surveyed by PISA are covered by the TALIS-PISA link. Teachers working in schools exclusively devoted to children with special needs are not part of the international target population and are deemed out of scope. Teachers working with special needs students in a regular school setting were considered in scope in TALIS-PISA link. However, when a school is made up exclusively of these teachers, the school itself is said to be out of scope. Teacher aides, pedagogical support staff (e.g. guidance counsellors and librarians) and health and social support staff (e.g. doctors, nurses, psychiatrists, psychologists, occupational therapists and social workers) were not considered to be teachers and, thus, not part of the TALIS-PISA link international target population.

In PISA 2018, the international target population of students includes those who are aged between 15 years and 3 months and 16 years and 2 months at the time of the assessment, who are enrolled in school and who have completed at least 6 years of formal schooling. This holds regardless of the type of institution in which they are enrolled, whether they are in full-time or part-time education, whether they attend academic or vocational programmes, and whether they attend public, private or foreign schools within the country. The PISA 2018 target population does not include residents of a country who attend school in another country. However, it does include foreign nationals who attend school in the country of assessment.
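The eligibility rule above can be illustrated with a minimal sketch. It assumes that age is counted in complete months and that both age bounds are inclusive; the paragraph does not spell out either detail, and the function names are hypothetical.

```python
from datetime import date

# Age window stated in the text, expressed in months.
MIN_AGE_MONTHS = 15 * 12 + 3   # 15 years and 3 months
MAX_AGE_MONTHS = 16 * 12 + 2   # 16 years and 2 months
MIN_YEARS_SCHOOLING = 6


def age_in_complete_months(birth_date, reference_date):
    """Number of full months elapsed between two dates (assumed convention)."""
    months = (reference_date.year - birth_date.year) * 12
    months += reference_date.month - birth_date.month
    if reference_date.day < birth_date.day:
        months -= 1  # the current month is not yet complete
    return months


def in_pisa_target_population(birth_date, assessment_date, enrolled, years_of_schooling):
    """Apply the three conditions: age window, enrolment, years of schooling."""
    age = age_in_complete_months(birth_date, assessment_date)
    return bool(
        enrolled
        and years_of_schooling >= MIN_YEARS_SCHOOLING
        and MIN_AGE_MONTHS <= age <= MAX_AGE_MONTHS
    )


# Illustrative dates only; not taken from any national assessment calendar.
eligible = in_pisa_target_population(date(2002, 6, 15), date(2018, 4, 10), True, 9)
too_young = in_pisa_target_population(date(2003, 3, 1), date(2018, 4, 10), True, 9)
```

Type of institution, programme and full-time or part-time status do not appear as arguments because, as stated above, they do not affect eligibility.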

For national reasons, participating countries and economies could choose to reduce their coverage of the target population. Examples include excluding a small, remote geographical region because it is inaccessible, excluding part of the population because of language differences (possibly for political, organisational or operational reasons), or excluding students with special education needs. However, efforts were made to keep these exclusions to a minimum (i.e. up to a total of 5% of the relevant teacher and student populations, whether by excluding schools or by excluding students and teachers within schools).

The following categories of schools were excluded from the sample:

  • schools that were geographically inaccessible or where the administration of the PISA assessment was not considered feasible

  • schools that provided teaching only for students in the categories defined under “within-school exclusion of students”, such as schools for the blind (i.e. a school attended only by students who would be excluded from taking the assessment for intellectual, functional, or linguistic reasons was considered a school-level exclusion).

Within a selected in-scope school, the following categories of teachers were excluded from the sample:

  • teachers teaching in schools exclusively serving special needs students

  • teachers who also act as school principals: no teacher data collected, but school principal data collected

  • substitute, emergency or occasional teachers

  • teachers on long-term leave.

Within a selected in-scope school, the following categories of students were excluded from the sample:

  • students with an intellectual disability (i.e. a mental or emotional disability resulting in the student being so cognitively delayed that he/she could not perform in the PISA testing environment)

  • students with a functional disability (i.e. a moderate to severe permanent physical disability resulting in the student being unable to perform in the PISA testing environment)

  • students with limited assessment-language proficiency (i.e. students who were unable to read or speak any of the languages of assessment in the country at a sufficient level and unable to overcome such a language barrier in the PISA testing environment; these were typically students who had received less than one year of instruction in the language of assessment)

  • other exclusions, a category defined by the PISA national centres in individual participating countries and approved by the PISA international consortium

  • students taught in a language of instruction for the major domain for which no materials were available.

However, students could not be excluded solely because of low proficiency or common disciplinary problems. The percentage of 15-year-olds excluded within schools had to be less than 2.5% of the national desired target population.

In each country/economy, a sample of 150 schools (unless discussions with the national project manager led to a different size) was drawn randomly from the sample of schools drawn for PISA 2018. Within each participating school, 35 students were randomly selected. However, in schools with 35 or fewer eligible students, all students were selected. As the PISA 2018 data collection proceeded, the set of (original sample or replacement) schools participating in PISA emerged and, with it, the set of schools in which the TALIS-PISA link 2018 was to be administered. Within each of the schools that had participated in PISA 2018 and had also been sampled for the TALIS-PISA link 2018, the school principal and a random sample of 20 teachers teaching 15-year-old students were surveyed. However, in schools with 20 or fewer eligible teachers, all teachers were selected. Thus, the nominal international sample size of the TALIS-PISA link 2018 was set at 150 schools, 3 000 teachers and 5 250 students for each participating country or economy.
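The "take all, or sample a fixed number" rule and the nominal sample sizes can be sketched as follows. The sketch assumes simple random sampling within schools, which is a simplification; the actual within-school selection procedures are documented in the technical reports. The helper name `select_within_school` is hypothetical.

```python
import random

SCHOOLS_PER_COUNTRY = 150
STUDENTS_PER_SCHOOL = 35
TEACHERS_PER_SCHOOL = 20


def select_within_school(eligible, cap, rng):
    """Return every eligible unit if there are `cap` or fewer, else a random sample of `cap`."""
    units = list(eligible)
    if len(units) <= cap:
        return units
    return rng.sample(units, cap)


rng = random.Random(2018)  # fixed seed so the illustration is reproducible
small_school = [f"student_{i}" for i in range(28)]
large_school = [f"student_{i}" for i in range(400)]

selected_small = select_within_school(small_school, STUDENTS_PER_SCHOOL, rng)  # all 28
selected_large = select_within_school(large_school, STUDENTS_PER_SCHOOL, rng)  # 35 of 400

# Nominal sample sizes per country or economy, as stated in the text.
nominal_teachers = SCHOOLS_PER_COUNTRY * TEACHERS_PER_SCHOOL
nominal_students = SCHOOLS_PER_COUNTRY * STUDENTS_PER_SCHOOL
```

The nominal sizes are upper bounds under this rule: any school with fewer than 35 eligible students or 20 eligible teachers contributes fewer units.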

Final weights allow the production of country-level estimates from the observed sample data. The estimation weight indicates how many population units are represented by a sampled unit. The final weight is the combination of many factors, reflecting the probabilities of selection at the various stages of sampling and the response obtained at each stage. To maintain the unbiasedness of the estimates, other factors may also come into play as dictated by special conditions (e.g. adjustment for teachers working in more than one school).

Estimating the sampling error for surveys with complex designs, such as TALIS and PISA, requires special attention. TALIS and PISA adopted balanced repeated replication (BRR) to estimate the sampling error of the estimates. BRR is a replication method suited to sample designs where exactly two primary sampling units (PSUs) are selected in each stratum. It leads to (approximately) unbiased estimates of sampling error.
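A minimal sketch of how a BRR standard error is obtained from replicate weights is given below for a weighted mean. The `fay_factor` parameter and its default of 0.5 are assumptions (Fay's variant of BRR); this annex only states that BRR is used, so the exact factor should be checked against the technical reports. Setting `fay_factor=0` gives the classic BRR formula.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of `values` using `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def brr_standard_error(values, final_weights, replicate_weights, fay_factor=0.5):
    """Return (estimate, standard error) of a weighted mean using BRR.

    replicate_weights is a list of weight vectors, one per replicate.
    fay_factor is an assumed Fay coefficient, not taken from this annex.
    """
    estimate = weighted_mean(values, final_weights)
    n_replicates = len(replicate_weights)
    squared_deviations = sum(
        (weighted_mean(values, rw) - estimate) ** 2 for rw in replicate_weights
    )
    variance = squared_deviations / (n_replicates * (1.0 - fay_factor) ** 2)
    return estimate, math.sqrt(variance)


# Toy data: four units and two replicates (the real files carry 100 replicates).
values = [1.0, 2.0, 3.0, 4.0]
final_weights = [1.0, 1.0, 1.0, 1.0]
replicates = [[1.5, 0.5, 1.5, 0.5], [0.5, 1.5, 0.5, 1.5]]
estimate, standard_error = brr_standard_error(values, final_weights, replicates)
```

The same pattern applies to any statistic: compute it once with the final weight, once with each replicate weight, and combine the squared deviations.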

As mentioned above, the sample of schools for the TALIS-PISA link was a subset of the sample of schools selected to take part in PISA 2018. Given the sequencing of events between TALIS and PISA, the sampling team could not limit sub-sampling for the TALIS-PISA link to schools that had participated in PISA. Therefore, they had to draw the sub-sample from the full sample of schools prior to the PISA data collection. However, because data collection for the TALIS-PISA link was scheduled to take place after completion of the data collection for PISA (at least, for any given school), the school base weight was that of the PISA 2018 design, adjusted for sub-sampling.

The final TALIS-PISA link school weight (estimation weight [SCHWGT]) is the product of the TALIS-PISA link school base weight and the TALIS-PISA link school non-response adjustment factor. The final TALIS-PISA link teacher weight (estimation weight [TCHWGT]) is the product of the TALIS-PISA link teacher base weight, the three adjustment factors associated with each participating teacher (i.e. non-response adjustment within the school, and multiplicity and exclusion adjustments) and the final TALIS-PISA link school weight. Balanced repeated replicate weights for both teacher and school observations [TRWGT1-TRWGT100 and SRWGT1-SRWGT100] can be used to obtain (approximately) unbiased estimates of sampling errors.
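The two products described above can be written out as a short sketch. The function and argument names are hypothetical, and the numbers are made up purely to show the arithmetic.

```python
def final_school_weight(school_base_weight, school_nonresponse_adj):
    """SCHWGT: school base weight times the school non-response adjustment."""
    return school_base_weight * school_nonresponse_adj


def final_teacher_weight(teacher_base_weight, nonresponse_adj, multiplicity_adj,
                         exclusion_adj, school_weight):
    """TCHWGT: teacher base weight times three adjustments times the final school weight."""
    return (teacher_base_weight * nonresponse_adj * multiplicity_adj
            * exclusion_adj * school_weight)


# Made-up inputs for one school and one of its teachers.
schwgt = final_school_weight(school_base_weight=40.0, school_nonresponse_adj=1.25)
tchwgt = final_teacher_weight(
    teacher_base_weight=4.0,   # e.g. 1 in 4 eligible teachers sampled
    nonresponse_adj=1.25,      # compensates for non-responding teachers in the school
    multiplicity_adj=0.5,      # e.g. a teacher working in two schools
    exclusion_adj=1.0,
    school_weight=schwgt,
)
```

With these inputs the school represents 50 schools and the teacher represents 125 teachers in the population.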

To conduct student-level analyses based on the student-level merged TALIS-PISA dataset (i.e. student data merged with principal data and teacher data aggregated at the school level), the final TALIS-PISA link student weight (estimation weight) and the TALIS-PISA link student-level balanced repeated replicate weights need to be computed. This can be done using the following steps:

  • Compute the within-school component of student weights (i.e. the student base weight, adjusted for non-response and trimmed if needed) by dividing the final student weight included in the PISA 2018 student dataset [W_FSTUWT] by the final school weight included in the PISA 2018 school dataset [W_SCHGRNRABWT] (see Note 2). For more detail, see Chapter 8 of the PISA 2018 Technical Report (OECD, 2020[2]).

  • Compute the final TALIS-PISA link student weight as the product of the within-school component of student weights and the final TALIS-PISA link school weight included in the TALIS-PISA link 2018 school/principal dataset [SCHWGT]. This ensures that the non-response adjustments accounting for school non-response specific to the TALIS-PISA link sub-sample are applied (see Note 3).

  • Compute the TALIS-PISA link student-level balanced repeated replicate weights by multiplying the within-school component of student weights by each of the 100 school-level balanced repeated replicate weights that are included in the TALIS-PISA link school/principal dataset [SRWGT1-SRWGT100].
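The three steps above can be sketched in plain Python. The weight variable names (W_FSTUWT, W_SCHGRNRABWT, SCHWGT, SRWGT1-SRWGT100) come from the text; the `school_id` key, the output key names and the data layout are hypothetical choices made for this illustration.

```python
def talis_pisa_student_weights(students, pisa_school_weights, link_schools):
    """Derive TALIS-PISA link student weights.

    students: list of dicts with 'school_id' and 'W_FSTUWT'.
    pisa_school_weights: dict mapping school_id to W_SCHGRNRABWT.
    link_schools: dict mapping school_id to {'SCHWGT': float, 'SRWGT': [floats]}.
    """
    result = []
    for student in students:
        sid = student["school_id"]
        if sid not in link_schools or sid not in pisa_school_weights:
            continue  # school is outside the linked sub-sample
        # Step 1: within-school component of the student weight.
        within = student["W_FSTUWT"] / pisa_school_weights[sid]
        link = link_schools[sid]
        result.append({
            **student,
            "within_school_weight": within,
            # Step 2: final TALIS-PISA link student weight.
            "link_student_weight": within * link["SCHWGT"],
            # Step 3: one student-level replicate weight per school replicate.
            "link_student_rwgt": [within * r for r in link["SRWGT"]],
        })
    return result


# Toy data with two replicates; the real school file carries SRWGT1-SRWGT100.
students = [{"school_id": "A", "W_FSTUWT": 120.0},
            {"school_id": "B", "W_FSTUWT": 80.0}]
pisa_school_weights = {"A": 40.0, "B": 20.0}
link_schools = {"A": {"SCHWGT": 50.0, "SRWGT": [25.0, 75.0]}}
weighted = talis_pisa_student_weights(students, pisa_school_weights, link_schools)
```

In this toy example only the student in school A is kept, with a within-school component of 3, a link student weight of 150 and replicate weights of 75 and 225.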

The technical standards for both TALIS-PISA link 2018 and PISA 2018 outlined the response rate (participation rate) requirements for their respective target populations (i.e. schools, teachers and students). Although reaching the required levels of participation does not preclude some degree of error in the results, it should reduce reliance on the “missing at random” assumptions made for the non-response weighting adjustments.

TALIS-PISA link 2018 set the minimum school response rate at 75% of sampled eligible and non-excluded schools after replacement. TALIS-PISA link considered schools where the principal returned a questionnaire to be “participating” schools for the purposes of the school weights and database. Although replacement schools could be called upon as substitutes for non-responding schools, the study’s national project managers were encouraged to do all they could to obtain the participation of the schools in the original sample. Countries that reached less than 75% school participation after replacement had to demonstrate convincingly that their sample was not significantly biased. The minimum response rate for teachers was set at 75% of all sampled teachers across all participating schools. TALIS-PISA link considered schools where at least 50% of selected teachers responded to the questionnaire to be “participating” schools for the purposes of the teacher weights and database, regardless of their participation status on the school database, that is, regardless of whether or not their principal returned his or her questionnaire.
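The two separate notions of a "participating" school in the TALIS-PISA link can be made explicit with a small sketch. The function name and the returned keys are hypothetical; the thresholds are those given in the paragraph above.

```python
MIN_SCHOOL_RESPONSE_RATE = 0.75       # after replacement
MIN_TEACHER_RESPONSE_RATE = 0.75      # across all participating schools
MIN_WITHIN_SCHOOL_TEACHER_RATE = 0.50


def link_school_status(principal_returned, teachers_selected, teachers_responded):
    """Participation status of one school for the school and teacher databases."""
    if teachers_selected > 0:
        teacher_rate = teachers_responded / teachers_selected
    else:
        teacher_rate = 0.0
    return {
        # School weights and database: the principal returned a questionnaire.
        "school_database": bool(principal_returned),
        # Teacher weights and database: at least 50% of selected teachers responded,
        # regardless of whether the principal responded.
        "teacher_database": teacher_rate >= MIN_WITHIN_SCHOOL_TEACHER_RATE,
    }


status_a = link_school_status(principal_returned=False, teachers_selected=20, teachers_responded=12)
status_b = link_school_status(principal_returned=True, teachers_selected=20, teachers_responded=9)
```

School A counts for the teacher database only, while school B counts for the school database only, which shows that the two statuses are independent.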

PISA 2018 set the minimum school response rate at 85% of sampled eligible and non-excluded schools. If the initial school response rate fell between 65% and 85%, an acceptable school response rate could still be reached through the use of replacement schools. Furthermore, a school with a student participation rate between 25% and 50% was not considered as a participating school for the purposes of calculating and documenting response rates. However, data from such schools were included in the database and contributed to the estimates included in the initial PISA international report. Data from schools with a student participation rate of less than 25% were not included in the database, and such schools were regarded as non-respondents. The minimum response rate for students was set at 80% of all sampled students across all participating schools. A minimum student response rate of 50% within each school was required for a school to be regarded as participating: the overall student response rate was computed using only students from schools with at least a 50% student response rate.
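The PISA school-level thresholds described above can be summarised in a short sketch. Rates are computed unweighted here for simplicity, which is an assumption; the function names and status labels are hypothetical.

```python
def classify_pisa_school(sampled_students, responding_students):
    """Classify a school by its within-school student participation rate."""
    if sampled_students <= 0:
        raise ValueError("sampled_students must be positive")
    rate = responding_students / sampled_students
    if rate < 0.25:
        return "non-respondent"            # data not included in the database
    if rate < 0.50:
        return "in database, not counted"  # data kept, but not a participating school
    return "participating"


def overall_student_response_rate(schools):
    """Unweighted student response rate using only schools with a rate of at least 50%."""
    kept = [(sampled, responded) for sampled, responded in schools
            if classify_pisa_school(sampled, responded) == "participating"]
    total_sampled = sum(sampled for sampled, _ in kept)
    total_responded = sum(responded for _, responded in kept)
    return total_responded / total_sampled


# (sampled, responded) pairs for three illustrative schools.
schools = [(35, 30), (35, 10), (20, 4)]
labels = [classify_pisa_school(s, r) for s, r in schools]
student_rate = overall_student_response_rate(schools)
```

Only the first school enters the overall student response rate, so the rate is 30 out of 35 even though the second school's data remain in the database.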

Data adjudication is the process through which each national dataset is reviewed and a judgement about the appropriateness of the data for the main reporting goals is formed. For both TALIS-PISA link 2018 and PISA 2018, the basic principle that guided the adjudication was to determine, for each participating country and economy, whether the data released to the countries and economies are fit to provide policy-relevant, robust international indicators and analysis on students, teachers and teaching in a timely and cost-effective manner.

To establish fitness for use, a number of quality assurance processes were designed and activated throughout the survey process. Some processes relied on expert advice and opinion; some relied on qualitative information and learned judgement; some relied on quantitative information. School, teacher and student data received separate adjudication evaluation per country and economy. The issues evaluated concerned the questionnaire adaptation to national context, translation and verification, quality of the sampling frame, handling of out-of-scope and refusal units, within-school sampling, data collection, data cleaning, the reports of quality observers, participation rates and overall compliance with the technical standards. Once each survey process had been assessed, a recommended rating was formulated, accounting for the participation rates, and for any unresolved issue.

The adjudication of the TALIS-PISA link 2018 samples had to wait until the PISA 2018 samples had been adjudicated, as the final determination of the recommended rating for the former depended on the latter. Even if the recommended rating for TALIS-PISA link, based solely on what happened during the preparation and collection of the TALIS-PISA link, was “good”, the matched file could not be adjudicated as “good” if the data or samples from PISA were rated less favourably. The matched file could be rated, at most, as the weaker of the two ratings. For more detailed information, please refer to the TALIS 2018 Technical Report (OECD, 2019[1]) and the PISA 2018 Technical Report (OECD, 2020[2]).
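The "weaker of the two ratings" rule amounts to taking a minimum over an ordered scale. The sketch below assumes that both ratings are expressed on the same four-level scale used for the TALIS-PISA link (insufficient, poor, fair, good); the function name is hypothetical.

```python
# Ordered from weakest to strongest.
RATING_ORDER = ["insufficient", "poor", "fair", "good"]


def matched_file_ceiling(link_rating, pisa_rating):
    """Highest rating the matched file can receive: the weaker of the two inputs."""
    return min(link_rating, pisa_rating, key=RATING_ORDER.index)


ceiling = matched_file_ceiling("good", "fair")
```

Here a "good" TALIS-PISA link rating combined with a "fair" PISA rating caps the matched file at "fair".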

The adjudication rules for the TALIS-PISA link 2018 samples, based on participation rates for principals and teachers, are displayed in Tables A A.1 and A A.2. An explanation of the codes used is given below.

The following bulleted list is a simple guide aimed at helping data users appreciate the limitations on use or quality:

  • Good: the participating country’s/economy’s data can be used for all reporting and analytical purposes and can be included in international comparisons.

  • Fair (A): national and sub-national estimates can be produced; some teacher characteristics may suffer from a larger standard error (SE), hence the warning “Fair”; no additional warnings to users appear necessary.

  • Fair (B, only for teacher data adjudication): national and sub-national estimates can be produced; some sub-national estimates may be of lower precision (larger SE) if sample size is locally low, hence the warning “Fair”; no additional warnings to users appear necessary.

  • Fair (C):

    • national and sub-national estimates can be produced.

    • some sub-national estimates may be of lower precision (larger SE) if sample size is locally low, hence the warning “Fair”, but a note on data quality could appear pointing to the outcome of the non-response bias analysis (NRBA).

    • since school participation is somewhat lower than under (B), comparing sub-national estimates should be done with care, as some of those results are based on few schools.

    • comparing small sub-national estimates with similar groups from other countries is unlikely to uncover any statistically meaningful differences, as the SEs are likely to be too large.

  • Poor (D):

    • in addition to the warnings issued for the previous category, a note should warn users of indications of non-response biases in some estimates.

    • comparisons of sub-national estimates should be limited to the groups with the larger sample sizes.

    • at this point, the sample represents between 37% and 56% of the teaching workforce, from a rather small sample of schools.

    • comparisons with similar groups in foreign countries would not be encouraged.

  • Poor (E, only for teacher data adjudication): sub-national estimates would not be recommended; there should be a note pointing out the difficulty of obtaining a representative sample of schools.

  • Poor (F, only for teacher data adjudication): limitations similar to those of line E, but there should be a note pointing out the difficulty of obtaining at least 50% participation of the selected sample of schools; risks of having a non-representative sample of schools.

  • Insufficient: weights should not be calculated for any official tabulations; hence, data should not be incorporated into international tables, models, averages, etc.

Tables A A.3 and A A.4 display the participation rates for the principals and teachers in each country/economy that participated in the TALIS-PISA link.

Data from participating school principals, participating teachers and participating students from the same schools can be merged as a result of the school-level link.

Table A A.5 displays the number of observations for each category of participants, once their data are merged at the school level. The school-level merged TALIS-PISA dataset (i.e. student data aggregated at the school level merged with principal data and teacher data aggregated at the school level) has 1 058 school-level observations. The teacher-level merged TALIS-PISA dataset (i.e. teacher data merged with principal data and student data aggregated at the school level) has 17 809 teacher-level observations. The student-level merged TALIS-PISA dataset (i.e. student data merged with principal data and teacher data aggregated at the school level) has 31 077 student-level observations.
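A minimal sketch of how a student-level merged file of this kind can be built is shown below: teacher data are aggregated to the school level and then attached, together with principal data, to each student record. The `school_id` key and the variables `job_satisfaction` and `school_size` are hypothetical, the school mean is unweighted for simplicity, and keeping only schools present in all three sources is an assumption of this sketch.

```python
from collections import defaultdict


def school_means(teacher_rows, variable):
    """Unweighted school-level mean of a teacher variable, skipping missing values."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in teacher_rows:
        value = row.get(variable)
        if value is None:
            continue
        sums[row["school_id"]] += value
        counts[row["school_id"]] += 1
    return {sid: sums[sid] / counts[sid] for sid in sums}


def merge_student_level(students, principals, teacher_means, variable):
    """Attach principal data and the school-level teacher mean to each student."""
    merged = []
    for student in students:
        sid = student["school_id"]
        if sid in principals and sid in teacher_means:
            merged.append({**student, **principals[sid],
                           f"{variable}_school_mean": teacher_means[sid]})
    return merged


teachers = [{"school_id": "A", "job_satisfaction": 3.0},
            {"school_id": "A", "job_satisfaction": 4.0},
            {"school_id": "B", "job_satisfaction": None}]
principals = {"A": {"school_size": 600}, "B": {"school_size": 250}}
students = [{"school_id": "A", "student_id": 1},
            {"school_id": "B", "student_id": 2}]

means = school_means(teachers, "job_satisfaction")
merged = merge_student_level(students, principals, means, "job_satisfaction")
```

The teacher-level and school-level merged files follow the same pattern, with student data aggregated to the school level instead.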

References

[2] OECD (2020), PISA 2018 Technical Report, https://www.oecd.org/pisa/data/pisa2018technicalreport/ (accessed on 16 November 2020).

[1] OECD (2019), TALIS 2018 Technical Report, OECD, Paris, http://www.oecd.org/education/talis/TALIS_2018_Technical_Report.pdf.

Notes

1. TALIS-PISA link: Teaching and Learning International Survey (TALIS) and Programme for International Student Assessment (PISA) link covers schools that participated in both TALIS and PISA.

2. Non-response adjustment applied to the within-school component of student weights is, in some cases, based on “non-response classes” from the full PISA dataset (including schools that are not part of the TALIS-PISA link sub-sample). However, this is considered to be a minor issue.

3. Since the TALIS-PISA link dataset is a subset of the PISA dataset, school non-response adjustments specific to the TALIS-PISA link sub-sample also account for non-participation in PISA. However, there are five schools (three in Colombia and two in Georgia) that participated in TALIS-PISA link 2018 but were considered non-participants in PISA 2018. These schools were originally sampled for PISA but were ultimately left out of the final PISA database. Because they were sampled for the TALIS-PISA link and participated in it, they were included in the final TALIS-PISA link datasets.


© OECD 2021
