Establishing local scales and baseline measures from the IEA Civic Education Studies Database

Kui Foon CHOW, Trevor Grahame BOND

Research output: Contribution to conference (Papers)


Empirical Research Background: The analyses reported in this study are based on data from the IEA Civic Education Study, conducted by the International Association for the Evaluation of Educational Achievement (IEA) in 1999. The IEA Civic Education Study is the largest and most rigorous study of civic education ever conducted internationally. It tested and surveyed nationally representative samples totalling 90,000 14-year-old students in 28 countries. Our analyses and report focus on the Hong Kong sample: 4,996 students aged 14.00-14.11 from 150 schools.
Empirical Research Aims: Our study aims to construct benchmark estimates for scales that can be used to measure changes in citizenship attitudes and knowledge among Hong Kong students since 1999. To that end, it is necessary to reveal any additional scales and/or variations of dimensions within a scale in the Hong Kong sample that were not reported in the IEA Technical Report published in 2004, where common scales were adopted for all country comparisons.
Empirical Research Sample: An instrument containing attitudinal items on citizenship, organised in sections A to N, was administered to students in 28 countries, including Hong Kong, during the IEA Civic Education Study (CivEd) in 1999. A two-stage stratified cluster sample design was used, in line with the designs adopted for other large-scale assessments. Schools were selected within a stratified sampling frame at the first stage, and single intact classes were selected at the second stage. Two explicit stratification variables were used in the sampling frame, district and financial mode, with the number of schools selected within each stratum proportional to the total number of schools. 150 Hong Kong schools were included in the sample in order "to obtain sufficient data for reliable analysis at school and class levels". The target population comprised students aged 14.00-14.11 (Form 3).
A total of 4,996 such students from 150 schools in Hong Kong made up the final sample.
Empirical Research Method: We re-analysed the 14-year-old Hong Kong students' responses to the attitudinal items in sections A to N, collected in 1999 under the Civic Education Study (CivEd) of the International Association for the Evaluation of Educational Achievement (IEA), to establish measures of the students' views on various aspects of citizenship in 1999. These measures will serve as baseline measures. To reveal any changes in attitudes towards citizenship, they can be compared with measures obtained from another group of Hong Kong students on the same set of items.
Empirical Research RASCH: Scale construction – Clusters of items in each of sections A to N were subjected to Rasch analyses, section by section. Fit statistics were used to detect misfitting items. Within sections A to N, items whose fit statistics did not meet the criteria (both MNSQ < 1.33 and ZSTD < 2.0) were removed from further analyses; these items were B1, B15, D9, D12, E6, E11, E12, G6, H6, H7, L9, L10 and M1. The remaining (i.e., fitting) items were retained for further analyses. Establishing baseline measures – Item difficulty estimates were derived from the 1999 data using Rasch modelling. Calibrations were centred on a mean person measure of 0 (not the usual software default). For each of the remaining (fitting) items, graphs using the average person measure as the reference point were produced to show the item measure and its standard error of measurement. For each identified scale, the values of the category threshold parameters were tabulated to serve as baseline measures for later comparison.
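The item-screening rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the abstract does not say whether infit or outfit statistics were used, and the rule is read here as "retain an item only if both criteria are met". The item labels `X1` to `X3` and their fit values are hypothetical, not results from the study.

```python
from dataclasses import dataclass


@dataclass
class ItemFit:
    item: str    # item label (hypothetical labels are used below)
    mnsq: float  # mean-square fit statistic
    zstd: float  # standardized fit statistic

# Thresholds quoted in the abstract.
MNSQ_MAX = 1.33
ZSTD_MAX = 2.0


def fits(stat: ItemFit) -> bool:
    """Assumed reading: an item fits only if BOTH criteria are met."""
    return stat.mnsq < MNSQ_MAX and stat.zstd < ZSTD_MAX


def screen(stats):
    """Split items into (retained, removed) lists of item labels."""
    retained = [s.item for s in stats if fits(s)]
    removed = [s.item for s in stats if not fits(s)]
    return retained, removed


# Hypothetical fit values, for illustration only.
example = [
    ItemFit("X1", mnsq=1.05, zstd=0.8),  # meets both criteria
    ItemFit("X2", mnsq=1.40, zstd=1.5),  # fails the MNSQ criterion
    ItemFit("X3", mnsq=1.10, zstd=2.6),  # fails the ZSTD criterion
]
retained, removed = screen(example)
print("retained:", retained)
print("removed:", removed)
```

In practice the fit statistics would come from the output of whichever Rasch software is used; the sketch only shows how the two thresholds combine.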
When a later cohort of students is administered the same instrument and items, and if they show the same attitudes to citizenship as the 1999 cohort, the item measures should remain the same (within the standard errors of measurement) as the 1999 item estimates. In contrast, if a later cohort shows different attitudes to citizenship from those of 1999, the item measures should fall outside the standard errors of measurement of the 1999 items. In theory, in terms of their attitudes to citizenship, the students at the two time points are then said to be samples from different populations.
Empirical Research Results: Scale construction – In the original CivEd study, a total of 54 items from the 162 administered were used to construct 11 scales. For the Hong Kong data alone, however, it was possible to include a total of 148 items in 20 scales. A consequence is that some of the Hong Kong-only scales will not have the same meaning as the scales used for the 28-country comparisons. For example, in the CivEd G scale, two dimensions were suggested: “equality” and “prohibition”. Section G ostensibly contains items related specifically to the roles of women, but in fact no female-related dimension emerged. This shows a discrepancy between the intention and the realization in the measurement properties of the scale. The dimensions suggested by the Rasch analysis may lack focus: very often, many similar items within a section were grouped together to form a scale. The intention of the CivEd scale developers in putting items into one section was taken at face value. However, within scale A, named “concept of democracy”, for example, some items, such as A5, A16 and A19, did not seem to relate much to the concept of democracy in the context of Hong Kong, even though none of the 25 items showed misfit. Some scales included only a few items. In scale J, only 5 items remained to form the scale for “school participation”.
It is questionable whether these 5 items alone can measure with sufficient precision. In other words, these 5 items might not be distributed evenly enough along the scale to produce a precise estimate of “school participation”.
Empirical Research Conclusions: Rasch analyses help construct Hong Kong-only scales that contain the maximum number of appropriate items within a section (i.e., without losing items because of their insufficiently good measurement properties in the data across all other countries). Rasch analyses also help reveal, within any section, multiple dimensions that were not reported in the official IEA Technical Report. Once baseline measures of attitudes towards citizenship among Hong Kong students have been established, changes in students' attitudes towards citizenship can be revealed when later cohorts are administered the same instruments and their data are collected.
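The baseline comparison described in the abstract can be illustrated with a short Python sketch. It assumes that both calibrations are centred on a mean person measure of 0, so that item measures are on a comparable scale. The abstract does not state how many standard errors define the comparison band, so the multiplier `k` is an assumption, and all item labels and values below are hypothetical.

```python
def shifted_items(baseline, later, k=2.0):
    """Return labels of items whose later measure lies outside the baseline band.

    baseline: dict mapping item label -> (measure, standard_error) from 1999
    later:    dict mapping item label -> measure from a later cohort
    k:        band half-width in standard errors (an assumption; the
              abstract does not specify a multiplier)
    """
    flagged = []
    for item, (measure, se) in baseline.items():
        if item in later and abs(later[item] - measure) > k * se:
            flagged.append(item)
    return flagged


# Hypothetical values in logits, for illustration only.
baseline_1999 = {"X1": (-0.50, 0.04), "X2": (0.30, 0.05)}
later_cohort = {"X1": -0.47, "X2": 0.55}

flagged = shifted_items(baseline_1999, later_cohort)
print(flagged)
```

An item that is flagged in this way would suggest, under the stated assumptions, that the two cohorts differ in the attitude that item measures; an unflagged item is consistent with no change.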
Original language: English
Publication status: Published - Jul 2009


Chow, J. K. F., & Bond, T. (2009, July). Establishing local scales and baseline measures from the IEA Civic Education Studies Database. Paper presented at the Pacific Rim Objective Measurement Symposium 2009 (PROMS 2009), The Hong Kong Institute of Education, Hong Kong, China.

