CERC's Electronic Book

Doing Comparative Education: Three Decades of Collaboration


Part III: Achievement, Assessment, and Evaluating Learning

Source: Harold J. Noah, "Reflections," Comparative Education Review 31 (1987), pp. 137-149. Reprinted by permission of the University of Chicago Press.


REFLECTIONS


In a world of rampant nationalism, it is a wonder that the IEA exists at all. Schooling is only too often the very embodiment of nationalism, and nations are fiercely protective of the school systems they have built. To open them to international scrutiny requires a degree of willing cooperation, mutual trust, and genial forbearance that is, to say the very least, rare among nations. Thus, quite apart from the quality of IEA published results (which I happen to think is generally high) and their utility (which I think is great), the IEA shines forth as a good deed in a naughty world.*

Empire or Commonwealth?

Purves had, at one time, thought of titling his article "The IEA Empire" (personal communication from Postlethwaite). Better to call it "The IEA Commonwealth." From its beginnings to the present, the IEA has lacked all the attributes of empire. The association has never had, nor is there now, an emperor or empress (though the IEA has profited singularly from the steadfast, forceful, and nurturing leadership of Torsten Husén and T. Neville Postlethwaite). It has no fixed boundaries, no stable territories. Nations, and even parts of nations, join voluntarily, participate in this or that aspect of the IEA's work as they please and as they can raise the resources, and may freely secede. An empire requires imperial subjects; the IEA has none. An empire requires tribute, flowing from the periphery to the center; in the IEA, tribute is neither paid nor collected. Moreover, over time, the IEA has become progressively more decentralized. The international coordination of testing, data analysis, and even reporting is no longer located in a single headquarters (Hamburg and then Stockholm) for all projects; instead, each project finds an international home in one or another of the countries participating in the project. No empire was ever quite like this!

As Purves shows so clearly, one of the IEA's most important achievements has been to provide a framework within which research personnel can learn to improve their skills in analyzing the operation of their school systems. 1 Out of active international instruction and interchange of ideas, a vigorous fraternity of scholars and administrators has emerged, energizing and extending the reach of comparative education. Perhaps the IEA is best viewed not even as a commonwealth but rather as a latter-day incarnation of Robert Boyle's seventeenth-century "invisible college," operating on a global scale. For that reason, it is wholly fitting that over the years the Comparative Education Review has devoted so many pages to reports and comments on IEA work and now devotes a second entire issue to some of the association's current work. Its inquiries continue to represent the most sustained, imposing, and ultimately, I dare say, most heuristic work in comparative education that we have yet seen. As Philip Altbach has recently written, "Whatever criticisms one may have of aspects of the IEA research, it is perhaps the most widely cited and influential research done on an educational topic from a cross-national perspective." 2

Let me declare my interest. For a period of 9 years, 1967-76, I was a student enrolled in this "invisible college," working for a brief period in Stockholm but mostly in New York City, on materials that eventually appeared as the National Case Study volume in the Six-Subject Survey series. 3 Since then I have enjoyed alumnus status, having been discharged more or less honorably from the college. I owe a colossal debt to all those with whom I worked in and around the IEA for a postdoctoral education in comparative education provided during those 9 years. The reader should bear in mind, then, when reading what follows that I am definitely parti pris.

Country Mean Scores

From first to last, IEA spokespersons have consistently deprecated the alacrity with which commentators seize on the figures of mean national achievement scores, and they deny that the scores, taken without the benefit of a great deal of other knowledge, have much meaning. "The tests were not devised primarily in order to make total score comparisons between countries possible and certainly not as yardsticks for an 'international contest.' The mere fact that algebra and geometry items were included in the tests for the 13-year level in spite of the fact that these topics were not dealt with in some countries should discourage national comparisons." 4 Husén emphasizes the same point: "The media and the general public focused on the national means, in spite of the cautions issued by IEA researchers who tried to play down the tendency to perceive the exercise as an 'Olympic Game' or 'horse race.'" 5

I take the contrary view. I continue to believe that, on the whole, it has been a strategic error for the IEA to take this cautionary stance to the national mean figures. First, it gives the impression that IEA researchers do not really believe that the means are valid and reliable; that they do not in fact reflect (relative to other nations) what schoolchildren in a given nation know of science, or mathematics, and so forth; and that the means might be significantly different if testing were to be repeated. This is an unfortunate impression to convey because the IEA has devoted an extraordinary and by all accounts successful effort to insure that the tests reflect the curriculum, that the samples of students to be tested are good probability samples, and that the scoring procedures are systematic and accurate. Moreover, pace Husén's observation cited above, even where it is true that those taking the test have not had the opportunity to learn some of the material in the test, that in itself is an important datum. A country that chooses to teach algebra to its 13-year-olds presumably gains an "advantage," which should be properly reflected in the country's total score.

Second, the protestations contra international comparisons of national mean scores do not ring true. They engender the kind of skepticism a parent might feel when his 13-year-old son is observed deeply immersed in the pages of Playboy magazine and who, when asked what he is doing, replies that he is studying the advertisements for audio equipment. The national mean scores may not be the most interesting aspect of IEA published results, but they are very high on the scale of legitimate as distinct from prurient interest -- and rightly so.

Third, national mean scores (given that they are valid and reliable statistics) have told us some very important things. They have told us that, for comparable populations in school among the industrialized nations, the range of the means is by no means large, nor is it consistent across school subjects and age-groups. Rather than deny that the country mean scores had importance, the more pertinent message would have been that schools on average do about as well by their pupils in one (developed) country as in another. This has been evident from the very beginning of IEA-type work, when the Pilot Study of School Achievement was conducted (June 1959-June 1961) under the auspices of the UNESCO Institute for Education in Hamburg. At the time, Thorndike had this to say:
It is clear that the variation between national means is small in relation to the variability of scores within any one country. [Between-country variability as a percentage of within-country variability ranged from a high of 16.2 percent (mathematics) to a low of 5.2 percent (science).] National differences represent a minor rather than a major component of these results. And the probability is that they are over-estimated rather than underestimated, because the countries that did relatively well on the tests were in several instances those that were known to have tested an up-graded sample of their populations. We suspect that with truly representative national samples, the differences would have been reduced. Of course, the participants in the survey were all countries with a basically European culture, and with well-developed educational systems. A greater heterogeneity in national cultures and educational levels would probably increase the national differences, perhaps substantially. 6

The first wholly-IEA study (the 12-nation mathematics study) produced results with similar implications: the country with the lowest mathematics score for 13-year-olds (Sweden) had a score only about one-quarter of a standard deviation from the grand mean; Japan, with the highest score, was about three-quarters of a standard deviation above the grand mean. The range from top to bottom was just over 1 standard deviation, or 15.5 points on a 70-point test. Six of the 10 countries clustered tightly around the grand mean (with a range of about 0.4 standard deviations, or 5.8 points). 7
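As a rough check on these figures, the two conversions between test points and standard deviations quoted above can be set against each other. The sketch below is back-of-envelope arithmetic only: the variable names are illustrative, the implied standard deviation is an inference rather than a reported figure, and because the text gives "about 0.4" the result is approximate.

```python
# Back-of-envelope check using only figures quoted in the text.
cluster_range_points = 5.8   # spread of the six countries around the grand mean
cluster_range_sd = 0.4       # "about 0.4 standard deviations" (rounded in the source)
full_range_points = 15.5     # top-to-bottom range on the 70-point test

# Implied size of one standard deviation in test points
# (an inference from the two figures above, not a reported statistic).
implied_sd_points = cluster_range_points / cluster_range_sd

# Full cross-national range re-expressed in standard deviation units.
full_range_sd = full_range_points / implied_sd_points

print(f"implied SD: {implied_sd_points:.1f} points")
print(f"full range: {full_range_sd:.2f} SD")
```

On these assumptions one standard deviation comes to roughly 14.5 points, which squares with the statement that the full range was "just over" 1 standard deviation.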

Are these "big" differences? Husén thought so: "Differences between countries in average score are quite marked." 8 I suppose that what is big and what is small is a matter of perspective and context. But surely it would be imprudent to dismiss the hypothesis that, on the evidence of test scores of 13-year-old children in mathematics, Finland, the Netherlands, Australia, England, Scotland, and France are all nations "from the same population" and that the United States and Sweden are hardly much different. Whatever the variations in their educational and social systems, taken together they are not associated with much difference in national mean mathematics scores for this age-group.

For older students, the range of national mean scores opened up (1.6 standard deviations for the preuniversity grades taking mathematics; 1.5 for the parallel groups not taking mathematics). But, after adjusting for the differences in retentivity of 17-19-year-old students across the nations, Gilbert Peaker could observe that the data "strongly suggest that the amount of mathematical talent is very much the same in all these countries, though the policies used to develop it differ." 9 Alex Inkeles, commenting on the results of the Six-Subject Survey, is as definite. "I read these results as telling us that, given broadly comparable material to work with, the school systems of the more developed countries are, in general, turning out students of broadly comparable competence in the subject tested." 10

Rosier now provides us with preliminary cross-national results for science achievement, specifically, data on p-scores (percentage correct responses) on six items given to 14-year-old students in 14 countries/systems. 11 Two of the countries are "developing" (Singapore and Thailand); the rest are "developed." Let us assume that the scores are representative of results on the entire test (a large assumption, but these data are all that we have at the time of writing). The overall mean is 57 (i.e., 57 percent of responses to all six questions in all 14 countries were correct). The cross-national range is 19.1 (Japan high, 69.2; Italy low, 50.1). However, three countries (Israel, the Netherlands, and Norway) fall within +3 points of the mean of 57; and seven countries (Australia, English-speaking Canada, England, Finland, Poland, Thailand, and the United States) fall within -5 points. This is a noteworthy degree of clustering. It might be interesting to compare it with the clustering of national averages of 14-year-old children's heights or weights.

Of special interest is the position of Thailand, whose mean p-score of 54.3 is above England's 52.3, English-speaking Canada's 53.7, and Poland's 53.7. Here is a "developing" nation that may have raised the science education of its junior citizens to an impressive level -- a veritable défi thai. 12 In terms of GNP per capita, the World Bank ranks Thailand as fifty-fifth from the bottom of 126 countries listed. Compare Thailand's per capita income of $820 (in 1983 U.S. dollars) with the United Kingdom's $9,200 and Canada's $12,310. 13 Recall that one of the most solid crossnational findings of earlier IEA studies has been that mean achievement scores of the less developed countries (LDCs) are substantially and consistently lower than those of the more developed countries (MDCs). As between these two "worlds," it is as if we are indeed dealing with two different populations, each clustered quite closely about its respective grand mean. This is what Walker had to say in his summary volume on the Six-Subject Survey: "It was also apparent at an early stage that the material [for science achievement] from the developing countries (Chile, India, Iran, Thailand) would, in many of the analyses, require to be treated separately from that of the developed countries.... The most striking fact to emerge from the table is the large difference between the two groups of scores in Reading Comprehension, i.e., those of the three developing countries and those of the remainder." 14
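The p-scores quoted so far for six of the 14 systems can be laid out against the overall mean of 57. The sketch below uses only those quoted figures (the remaining systems' scores are not given in the text), and the names are illustrative. It reads "within -5 points" as lying up to five points below the mean, which is an interpretation of the wording rather than something the source spells out.

```python
overall_mean = 57.0  # overall mean p-score reported in the text

# Only the national p-scores actually quoted in the text.
p_scores = {
    "Japan": 69.2,
    "Thailand": 54.3,
    "English-speaking Canada": 53.7,
    "Poland": 53.7,
    "England": 52.3,
    "Italy": 50.1,
}

# Cross-national range: highest minus lowest national p-score.
cross_national_range = max(p_scores.values()) - min(p_scores.values())

# Signed distance of each quoted score from the overall mean.
deviations = {c: round(s - overall_mean, 1) for c, s in p_scores.items()}

# Quoted systems lying below the mean by no more than five points.
within_five_below = sorted(c for c, d in deviations.items() if -5 <= d < 0)

print(f"range: {cross_national_range:.1f}")
for country, dev in deviations.items():
    print(f"{country}: {dev:+.1f}")
print("within 5 points below the mean:", within_five_below)
```

Of the six quoted systems, only Japan and Italy fall outside that band, and Thailand sits closer to the mean than England, English-speaking Canada, or Poland.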

Should Thailand's preliminary results hold up when we have the complete figures, the implications are quite important: the "necessary" connection between a low level of income per capita and low level of school achievement will have been shown to be not necessary at all. Perhaps after reading Anderson's report of the Classroom Environment Study, we should not be surprised at this possibility. The conclusion he labels generalization 3, "the nature of classroom teaching is quite similar in all countries," is a powerful pointer to the results IEA has published. 15

Perhaps, though, the dispersions of achievement scores about the national means (the standard deviations) might tell a different story. A good deal of the IEA evidence is that they do not. Nations are really quite similar in the degree of dispersion of scores generated in association with schooling. This is hardly the place to present a detailed, comprehensive analysis, so let me simply cite results from the Six-Subject Survey and the 1967 mathematics study. In science achievement, population 1 (10-year-olds) and population 2 (14-year-olds) each exhibited remarkably similar cross-national standard deviations. For population 1, the grand mean of the country standard deviations was 7.9 (N = 12), and the range went only from 7.1 (French-speaking Belgium) to 9.3 (U.S.A.); for population 2, the grand mean was 11.8 (N = 14), and the range went from 8.8 (French-speaking Belgium again) to 14.8 (Japan). Nor was the cross-national "dispersion of dispersions" much greater for population 4 (the preuniversity grade): grand mean was 9.9 (N = 14), low standard deviation was 7.9 (French-speaking Belgium, yet again), and high was 12.1 (Scotland). 16
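One way to put the three populations on a common footing is to express the range of the country standard deviations as a fraction of their grand mean. The sketch below does this with the figures just quoted. The "relative spread" ratio is an illustrative yardstick chosen here, not a statistic reported in the IEA volumes.

```python
# (grand mean of country SDs, lowest country SD, highest country SD), as quoted in the text
science_sd = {
    "population 1": (7.9, 7.1, 9.3),
    "population 2": (11.8, 8.8, 14.8),
    "population 4": (9.9, 7.9, 12.1),
}

relative_spread = {}
for population, (grand_mean, low, high) in science_sd.items():
    # Range of the country standard deviations as a fraction of their grand mean.
    relative_spread[population] = (high - low) / grand_mean
    print(
        f"{population}: range {high - low:.1f}, "
        f"relative spread {relative_spread[population]:.2f}"
    )
```

On this measure the range of the country standard deviations comes to between roughly a quarter and a half of the grand mean in each population.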

Postlethwaite reported standard deviations of mathematics scores for population 1b (the grade in which most students are 13 years old). Again, with the exception of only one country, England, there is notably tight clustering of the standard deviations for the 12 nations about the international grand mean. 17

Within very broad limits, the industrialized nations taken as a whole do much the same things in their classrooms and secure broadly comparable results. The big differences are within-nation and even within-school.

Focusing on Within-Nation Differences

If the cross-national differences in school achievement are not large, does this mean that IEA work is fundamentally irrelevant to comparative education? There would have been a time in the development of the field when this would have been the received opinion. Comparative education was viewed as primarily, if not exclusively, concerned with describing, explaining, and perhaps profiting from knowledge of the differences among countries. This is certainly not the case today. We have come to recognize that a valid cross-national comparative study will often proceed by demonstrating the ways in which different combinations of factors are associated with broadly similar outcomes. This, more than the underlining of achievement differences, has been the largest knowledge-enhancing contribution of IEA achievement studies. At the end of a detailed critical review of the nine volumes reporting the Six-Subject Survey, Ellis Page made the point this way: "But if the IEA had done nothing more, it contributed a great deal in showing how standard were some of the patterns of explanation, when seen internationally. This is a high accomplishment, especially given the difficulty of interpretation of some of the questions asked about the schools, curriculum, etc., in an international context." 18

Evaluation of the Evaluations

People have taken notice of IEA work. A bibliography published in 1979 lists about 300 items from all corners of the world. 19 There have been many new publications since then. It is well beyond the capacity of any one reader to know, let alone assimilate. Technical reports, reports of analyses, policy reports, school-subject reports, country reports, international reports, reports on IEA organizational matters, doctoral dissertations, journal articles and reviews, secondary analyses, parallel studies -- the list goes on and on. Let me select just three pieces of scholarship from this enormous literature.

All three are evaluations of IEA work that have appeared in the U.S.: Inkeles's review of the nine volumes associated with the Six-Subject Survey; Page's review monograph, "The Methodology of International Evaluation of Educational Achievement"; and Theisen, Achola, and Boakari's article, "The Underachievement of Cross-national Studies of Achievement," which won the CIES award for the best article appearing in volume 27 of the Comparative Education Review. 20 Although all three pieces contain some generous huzzahs for IEA work, their overall tone is highly critical. I would like to pool those criticisms in as brief a compass as I can, comment on their appositeness, and suggest that they do not contain much that was not known to IEA researchers.

Inkeles faults IEA work for being "underanalyzed" in four main respects: inconsistencies in design, models of analysis, and reporting across the subject areas; failure to express the results in accessible form, so that comparisons across countries might be facilitated; failure to analyze the relation between school achievement and separate social groups (social class, race, religious, or ethnic groups); and failure to evaluate the influence of national context as a determinant of student performance. He expresses regret that the IEA chose to assign a low priority to analysis of the gain in scores between grades. He suggests that it is very important to identify children's capabilities at the point of entry to school, especially when trying to explain why children in the developing nations get low scores on the tests. In addition, Inkeles questions the wisdom of choosing populations on the basis of age, when the same age may mean quite different accumulated years of school attendance (grade level) in different countries.

Page faults the IEA for not spelling out the objectives of its work clearly enough (not as well "as in a typical doctoral dissertation"), although he does derive seven objectives from a foreword by Husén to the science study. 21 He criticizes what he judges to be an overwhelmingly "environmentalist" tone in IEA work and neglect of possibly important genetic factors to explain differences in school achievement. Specifically, he would have liked the IEA to test sibling pairs in order to control for genetic and home-background factors ("the IEA missed a great opportunity"). 22 Together with Inkeles, Page deplores what he sees as IEA's "fragmented view of the school curriculum." 23 Instead of seeing it as a whole, the IEA has chosen to study the curriculum subject by subject, even failing to test broadly the same set of students across all school subjects. He retains the gravest doubts about the cross-national validity of the tests, given the problems raised by translation, nor does he give the IEA high marks either for following Bloom's taxonomy of objectives or for assigning test items to just four of these categories. 24 Again, with Inkeles, he faults the IEA reporting for inconsistencies (especially the failure to maintain consistent definitions across the studies of each of the major explanatory "blocks" formed for regression analysis), confusing presentations, and gaps (no indexes, no lists of tables, on occasion no full presentation of the test items, and no systematic presentation of means and standard deviations of all variables, independent and dependent). 25 Finally, and also echoing Inkeles, Page expresses disappointment at what he sees as excessive tentativeness and ambiguity in the announced conclusions as to what makes a difference for school achievement both within and across countries -- "an uncertain trumpet" is his wonderfully chosen metaphor. 26

Inkeles and Page accept and even embrace the basic IEA approach via cross-national regression models, "merely" asking that it be done better. Not so Theisen, Achola, and Boakari. They argue that explanatory models formulated in terms of national units will be "difficult to standardize in terms of both variable inclusion and measurement." The principal need is to measure variables at the local idiosyncratic, rather than at the national aggregate, level. Otherwise, studies will most likely yield results that will be misleading when used at the local level, which is the "level [at which] educational policies, if they are to be effective, must be designed and implemented." 27 The major problem, they suggest, is with the sampling strategies,
which have been designed to reflect aggregate levels of achievement; individual students, not school systems or districts, have been the unit of analysis. Under most sampling schemes, one or two schools from a defined geographical or administrative boundary are selected in the first stage of a multistage sampling process. Subsequently, a handful of individual students are chosen randomly from these schools for inclusion in the final sample. As a result of the low number of students drawn from each school and because of randomly distributed variations in ability among the sampled elements, achievement data may only marginally reflect the importance of school and community characteristics on learning.... Furthermore, the statistics resulting from samples may be highly unstable as a result of the small number of cases taken from each school. 28

This is a truly formidable massed array of criticisms, and it by no means exhausts either the total presented in these three commentaries or, of course, the additional criticisms that have appeared elsewhere. But they will certainly do for now. What can one say in response?

Virtually all of the points summarized above have been made and published by IEA researchers and report writers themselves. This is not to justify decisions made or not made in the course of IEA work but merely to underline the extent to which individual scholars worked with resources that were meager relative to the immensity of the tasks they had assumed, under deadlines that approached all too quickly, and ever mindful that international work forces compromises between the ideal and the possible. Such compromises inevitably end up closer to the possible than to the ideal. Permit me to cite in extenso from the closing pages of one IEA study, The National Case Study (NCS), that I happen to know quite well.

Future research should try to cover far more comprehensively than was possible in the NCS such matters as differences by country in home environments, the political context of schooling, the connection between jobs and educational credentials, and the style of student-teacher "transactions" inside and outside the classroom -- to name only a few.

Collection of time-series data on many variables proved to be very difficult, exceeding the statistical resources of most countries. Yet a good model of the factors influencing school achievement must be cast in terms of past inputs and processes. Insofar as social and educational systems change very slowly it is possible to use cross-section data as proxies for conditions in the past, and the NCS (and the other IEA studies) have relied upon the validity of this assumption. However, it should be recognized that it is an assumption, and research should continue to expand efforts to collect and use longitudinal data in explanatory models of cross-section achievement differences.

In international research in education, as in other social studies, cultural bias can be a serious pitfall. The multi-country collaborative nature of the IEA enterprise went far to eliminate such bias, and the instruments developed to measure student achievement appear to be, so far as is possible, neutral, or, if not neutral, at least not systematically biased with respect to national systems of education. Yet the very selection of school achievement as the criterion variable may be culturally biased, for it is a concept generated within and commanding particular attention in developed, Western-type societies. Even though achievement may include non-cognitive dimensions, such as associated skills and attitudes, even though it may be a value thoroughly acceptable to the educators of forward looking, but as yet less-developed countries, the IEA studies, and the National Case Study in particular, assume that school achievement is considered to be equally desirable (vis-à-vis other school and social objectives) among all people and in all countries. However, it may be that some people, countries or cultures simply value other outcomes of schooling (or of the use of young people's time and other national resources) more than they do achievement. And, while achievement appears to be a Western-generated criterion, it may not be an equally prized objective of schooling even among developed countries.

Although the subject committees gave varying amounts of attention to non-cognitive achievement in their respective areas (civic attitudes, literary tastes, science attitudes, for example, were particularly important and potentially illuminating), the NCS did not exploit the possibilities generated by these results. Future research should expand the analysis to investigate relationships between countries' system characteristics and such non-cognitive outcomes of schooling. Moreover, it is important to estimate the extent to which countries differ in the priority they accord to school achievement, compared to other outcomes of schooling.

For the most part, the statistical models employed in the subject studies explained only quite small fractions of the total observed variance in scores, between students and between schools. The reader will have noted also that the analyses employed in the NCS discovered few close fits between social, cultural, political and educational system characteristics and country achievement differences. There are a number of possible reasons for these, on the whole disappointing, results, reasons that apply as much to the explanatory models used in the subject studies as to the NCS approach.

First, measures on both the input and output sides, may contain large amounts of error. Error in the measurement of variables severely limits the explanatory power of statistical models, and in this respect the NCS probably suffered more than did the subject studies. Thus, the plight of the NCS Committee might be compared with that of a marksman condemned to shoot at a poorly defined target with rifle and bullets deficient in a number of unknown respects. Future work should give higher priority to improving the quality of measures, particularly on the input (independent variable) side of the equations.

Secondly, low explanatory power can arise from misspecification of the model. There are various aspects of this problem. There may have been omission of important variables. However, in view of the large number of variables used in both the subject studies and in the NCS, this is unlikely to have happened. A more damaging misspecification may have been the implicit assumption in the NCS study that, in general, the several explanatory variables can be used as if they are associated with achievement in a straightforward, additive and essentially non-interactive manner. If this assumption is incorrect, if in fact differences in the patterning of inputs are important for explaining differences in achievement, then the low values of correlation coefficients reported are not unexpected. Future NCS-type work, then, should try to assemble patterns of countries' system characteristics, and relate these to differences in school outcomes. If this is to be done, however, it will require a larger number of nations than was available from the Six Subject Survey, as well as the use of more sophisticated statistical techniques.

The third aspect of possible misspecification resides in the possibility that the significance (for outcomes) of a particular factor may not be its quantity, or even its role in some overall pattern, but its timing. Two countries' systems may supply equal amounts of several resources, but in one country the timing of these inputs may suit children's growth and achievement more than in the other. An analogy from agriculture is perhaps apposite. Two farmers may labor an equal number of hours on very similar fields and with similar equipment, seeds, fertilizer, and so on. If one farmer times his operations consistently better than the other, the size of their harvests can be expected to differ sharply. The NCS did not probe such questions in the context of school achievement, and future research might well try to do so.

The most interesting, and perhaps the most useful, approach to cross-national research proceeds not in terms of existing country-wide units, but on the basis of sub-national units. This means that it may be more interesting (for comparative work) to inquire about the correlates of achievement within, say, metropolitan areas across several countries, or among the children of the poor, or among girls, each group taken together across nations, than it is to regard individual countries as the logical, or only, units of analysis. This is not to deny, of course, that policy makers and researchers within each country will place first priority upon knowing the contributions made by such factors as metropolitan location, parental poverty, or sex to achievement in their own country. But that is by no means the end of the story, and comparisons across countries are inevitable. Here, the technique to be used is not simply a comparison of system-wide derived regression coefficients (although that can be valuable, especially if care has been taken to standardize definitions, measurement and scaling of variables across countries), but rather a cross-country pooling of data, partitioned by such factors as location, parental characteristics, teacher characteristics, levels of school finance, and so forth. This would enable the researchers to answer the question: How far, across all country units, among girls, are certain characteristics of homes, students and schools associated with achievement; how, if at all, do these associations differ among boys? Cognate questions could be asked about the production of achievement in metropolitan areas compared with rural areas, in poorly financed schools compared with well financed ones, and so on.

The NCS Committee did attempt a pilot investigation along these lines, but problems connected with the way schools were sampled defeated the attempt. It remains a most fruitful line for cross-national research.

A final point concerning the value of partitioning samples for regression analysis bears on the utility of the results to policy makers. Because, in general, IEA regression analyses have taken the entire country sample as the universe to be explained, policy makers may find the results much less useful than they might wish. The results of linear multiple regression analyses done on aggregated country samples conceal a great deal of the information that they need. The reason for this becomes clear if one reflects on the nature of the information that a partial regression coefficient conveys. It tells the reader the average value of the strength of the association between the dependent variable and a particular independent variable, all other independent variables being held constant at their average values. Such an explanatory approach is able to clarify the connections between achievement and other factors, and to assign relative weights to the importance of one factor compared with others, on average, over the entire country. Policy makers, however, usually require something more detailed than such average indications. They need to know the effect upon achievement of varying a particular item under rather specific circumstances. Policy considerations are directed not at influencing achievement levels in "average," country-wide settings, but typically they deal with the problems of achievement in, say, rural vis-à-vis urban settings; of girls vis-à-vis boys; and of students in poor neighborhoods as distinct from wealthier neighborhoods. What is the incremental achievement value of increasing expenditures for schools in, say, poor neighborhoods? In wealthier neighborhoods? What provides the largest increments to achievement for low achievers? For average achievers? For high achievers? 
Thus, the policy makers' questions require analyses that do not hold the values of the other independent variables constant at their average country level, but which partition the total sample, in such a way that the associations between a dependent variable (say, achievement) and a particular independent variable (say, current expenditure per student) can be investigated separately for specified groups -- e.g., poor children and rich children; urban center children and rural children; poor urban center children and rich urban center children.
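The contrast between a coefficient estimated on the whole sample and coefficients estimated on partitioned samples can be made concrete with a small sketch. The records below are invented purely for illustration (they are not IEA or NCS data), the names `ols_slope`, `slope_for`, and `records` are hypothetical, and a single-predictor least-squares slope stands in for the partial regression coefficients of a full multiple regression.

```python
from statistics import mean

def ols_slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical records, invented for illustration only:
# (neighborhood, expenditure per student, achievement score)
records = [
    ("poor", 1, 42), ("poor", 2, 50), ("poor", 3, 58), ("poor", 4, 66),
    ("rich", 5, 71), ("rich", 6, 72), ("rich", 7, 73), ("rich", 8, 74),
]

def slope_for(rows):
    """Slope of achievement on expenditure for the given rows."""
    return ols_slope([r[1] for r in rows], [r[2] for r in rows])

# One coefficient for the whole (aggregated) sample.
pooled = slope_for(records)

# Separate coefficients for each partition of the sample.
by_group = {
    group: slope_for([r for r in records if r[0] == group])
    for group in ("poor", "rich")
}

print("pooled slope:", round(pooled, 2))
for group, slope in by_group.items():
    print(group, "slope:", round(slope, 2))
```

In this invented example the slope is 8 points per unit of expenditure among the poor neighborhoods and 1 point among the wealthier ones, while the pooled slope falls between the two and describes neither group well. That is the sense in which an average coefficient can conceal the information a policy maker needs.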

In a sense, then, the NCS study can be regarded as an inevitably weak substitute for within-country regression analyses done on partitioned samples. Comparing, say, India, Iran and Thailand with the United States, England and Germany does underline the contrast between the way home, school and student variables are associated with achievement among poorer people, as compared with more affluent people. But there are severe problems in relying upon aggregate cross-national analysis to fulfill this role: the number of dimensions for partitioning countries soon exhausts their number, so that cell sizes become very small, or even zero; the results are always subject to challenge on the ground that national idiosyncrasies are at the root of all observed results; it is obviously not expedient to have to deal with all the difficulties involved in cross-national measurement and scaling of variables, if within-country research can answer some of the important policy questions as well, or better; and, finally, it is often more difficult to see how results obtained from cross-national research can be applied to the problems of a given country.

The role of an NCS-type study in the future, then, becomes one of trying to use national system characteristics to assess and explain problems and paradoxes that remain after as much variance as possible has been explained using partitioned national samples and partitioned pooled data.

This report has, it is hoped, reconfirmed the potential of cross-national studies of schooling, based upon broad cultural, societal and educational measures. However, the caveats attached to findings are numerous, and the study is, perhaps, best regarded as an interim report. The goal of a coherent, reliable, persuasive explanatory model of school achievement differences is still far distant; and many problems of measurement, scaling, and comparability of variables remain intractable. The present state of the art in empirical comparative educational analysis leaves much to be desired. For we are all as yet in the position of one of W. S. Gilbert's comic opera criminals, endlessly expiating his misconduct, serving as a cautionary example to others:

     And there he plays extravagant matches
        In fitless finger stalls,
           On a cloth untrue
           With a twisted cue
     And elliptical billiard balls. 29

Similar lists of cautions, regrets, hopes, fears, and recommendations for specific changes are to be found throughout the reports of IEA studies and in the articles in this special issue.

Many of the Inkeles, Page, and Theisen et al. recommendations -- especially those that ask for a more uniform approach to model building, variable identification, construct building, and statistical procedures -- require a much greater degree of central control than has proved possible in an international, voluntary organization, which attracts researchers with very different levels of expertise and sophistication. To the extent that the IEA is adopting an even more decentralized organization of its international work (partly because it is so difficult to raise funds for international coordination, and partly in order to reduce the weight of "advanced" countries in decisions), uniform approaches may become even more difficult to achieve. Nevertheless, criticisms voiced particularly by Inkeles and Page about the inconsistencies, relatively arcane and hence confusing statistical measures, and sheer gaps in reporting data and results are probably only too valid. On the evidence of the papers collected in this issue of the Comparative Education Review, they are being taken to heart, and a determined effort is being made to remedy the problems.

One point in conclusion is of particular importance for scholars in comparative education. There is a tendency to limit consideration of IEA work, its merits as well as its shortcomings, to just that work contained in the international studies of achievement in school subjects. This neglects the fact that there is as much, if not more, IEA literature written from the perspective either of a single nation or of a single topic other than school achievement. These works present further, often extremely detailed, analysis of data collected within a country or about a particular aspect of education. Naming only the merest few, I refer to Wolf's Achievement in America and Pidgeon's Achievement in Mathematics to illustrate the first category; and Postlethwaite's School Organization and Student Achievement, Noonan's School Resources, Social Class, and Student Achievement, Bergling's The Development of Hypothetico-Deductive Thinking in Children, and Purves and Levine's Educational Policy and International Assessment to illustrate the second. 30

It was said of Sir Christopher Wren, the great architect-rebuilder of London in the seventeenth century, Si monumentum requiris, circumspice ["If you want to see his monument, look about you!"]. Similarly, if you want to see (the living) monument to the IEA, just take a look at the 1979 bibliography of works from, using, and about the IEA noted above. It reveals a formidable, if oft-times messy, piece of research architecture.

NOTES

* My assignment from Neville Postlethwaite, the guest editor of this special issue of the Comparative Education Review, was to "write . . . on your reactions to IEA in general -- not just what is written in this special number." Insubordinate as ever, I have chosen instead to select a few, somewhat disjointed topics regarding the IEA that continue to intrigue me and that will, I hope, be of interest to the reader too. The subheads should make clear what I am about.

  1. Alan C. Purves, "IEA: An Agenda for the Future," International Review of Education (in press).

  2. Philip G. Altbach, "The Review at Thirty," Comparative Education Review 30 (1986), pp. 1-11, at 8.

  3. A. Harry Passow, Harold J. Noah, Max A. Eckstein, and John R. Mallea, The National Case Study: An Empirical Comparative Study of Twenty-One Educational Systems (New York: Wiley, 1976).

  4. Torsten Husén, ed., International Study of Achievement in Mathematics: A Comparison of Twelve Countries (New York: Wiley, 1967) 2: 26.

  5. Torsten Husén, "Policy Impact of IEA Research," Comparative Education Review 31 (1987).

  6. Arthur W. Foshay et al., Educational Achievements of Thirteen-Year-Olds in Twelve Countries (Hamburg: UNESCO Institute for Education, 1962), p. 16.

  7. Husén, ed., pp. 26-27.

  8. Husén, ed., p. 26.

  9. G.F. Peaker, "International Study of Achievement in Mathematics," Trends in Education 9 (1968): 42-48, at 45.

  10. Alex Inkeles, "National Differences in Scholastic Performance," Comparative Education Review 23 (1979): 391.

  11. Malcolm J. Rosier, "The Second International Science Study," Comparative Education Review 31 (1987).

  12. There was a foretaste of Thailand's potential given in the results of the Six-Subject Survey: Thailand's 14-year-olds did about as well on the test of reading English as a foreign language as Italy's, and better than Finland's. See Glynn Lewis and Carolyn E. Massad, The Teaching of English as a Foreign Language in Ten Countries (New York: Wiley, 1975), p. 102.

  13. World Development Report 1985 (New York: Oxford University Press, 1985), pp. 174-75.

  14. David A. Walker, The IEA Six-Subject Survey: An Empirical Study of Education in Twenty-one Countries (New York: Wiley, 1976), pp. 81, 113.

  15. Lorin W. Anderson, "The Classroom Environment Study: Teaching for Learning," Comparative Education Review 31 (1987).

  16. L.C. Comber and John P. Keeves, Science Education in Nineteen Countries (New York: Wiley, 1973), p. 159.

  17. T. Neville Postlethwaite, School Organisation and Student Achievement: A Study Based on Achievement in Mathematics in Twelve Countries (New York: Wiley, 1967), p. 96.

  18. Ellis B. Page, "The Methodology of International Evaluation of Educational Achievement," Proceedings of the National Academy of Education 5 (1978): 19-48, at 45.

  19. T. Neville Postlethwaite and Arieh Lewy, Annotated Bibliography of IEA Publications (1962-1978) (Stockholm: IEA, University of Stockholm, 1979). The title is somewhat misleading because many of the items listed had no IEA imprimatur.

  20. Inkeles, Alex, Ellis B. Page, Gary L. Theisen, Paul P.W. Achola, and Francis Musa Boakari, "The Underachievement of Cross-national Studies of Achievement," Comparative Education Review 27 (1981), pp. 46-68.

  21. Comber and Keeves, pp. 10-11.

  22. Page, p. 31.

  23. Ibid., p. 32.

  24. B.S. Bloom et al., Taxonomy of Educational Objectives, Handbook 1, Cognitive Domain (New York: McKay, 1956).

  25. Page, pp. 43-45.

  26. Ibid., p. 36.

  27. Theisen et al., p. 47.

  28. Ibid., p. 46.

  29. Passow et al., pp. 291-95.

  30. Richard M. Wolf, Achievement in America: National Report of the United States for the International Educational Achievement Project (New York: Teachers College Press, 1977); Douglas Pidgeon, Achievement in Mathematics: A National Study in Secondary Schools (Slough: National Foundation for Educational Research in England and Wales, 1967); Postlethwaite, School Organization and Student Achievement (n. 17 above); Richard D. Noonan, School Resources, Social Class, and Student Achievement (New York: Wiley, 1976); Kurt Bergling, The Development of Hypothetico-Deductive Thinking in Children (New York: Wiley, 1974); Alan C. Purves and Daniel U. Levine, Educational Policy and International Assessment: Implications of the IEA Surveys of Achievement (Berkeley: McCutchan, 1975).
