I. Performance Over Time
Long-Term Trends in Science, Mathematics, and Reading
Measuring students' academic performance has been the purpose of the National Assessment of Educational Progress (NAEP) since its inception in 1969. Students in both public and nonpublic schools have been assessed in various subject areas on a regular basis. In addition, NAEP collects information about relevant background variables to provide an important context for interpreting the assessment results and to document the extent to which education reform has been implemented.
NAEP enables us to monitor trends in academic achievement in core curriculum areas over an extended period of time. To do so, NAEP readministers materials and replicates procedures from assessment to assessment, always testing students in the same age groups (9, 13, and 17). In this manner, the NAEP long-term trend assessments provide valuable information about progress in academic achievement and about the ability of the United States to achieve its national education goals.
To provide a numeric summary of students' performance on the assessment questions and tasks, NAEP uses a 0 to 500 scale for each subject area. Comparisons of average scale scores are provided across the years in which the NAEP long-term trend assessments have been administered and among subpopulations of students. These results chart trends from the first year in which each NAEP assessment was given: 1969/70 in science; 1971 in reading; 1973 in mathematics; and 1984 in writing.
Trends in average performance over these time periods are discussed for students at ages 9, 13, and 17 for science, mathematics, and reading. In general, the NAEP long-term trends in science and mathematics show a pattern of early declines or relative stability followed by improved performance; in reading, minimal changes have occurred over the assessment period.
Science. The overall pattern of performance in science for 9-, 13-, and 17-year-olds is one of early declines followed by a period of improvement (Figure A). For 9-year-olds, the overall trend shows improvement; in 1996, the average score for these students was higher than in 1970. The overall trend for 13-year-olds was also positive, but there was no significant difference between the average science scores in 1970 and those in 1996. The average science score of 17-year-olds in 1996 was lower than the average score in 1969. Science scores have risen at all ages tested since 1982 and the publication of A Nation at Risk. Average scores at all three ages were higher in 1996 than in 1982 (at age 17, scores increased 13 points; at age 13, scores increased 6 points; and at age 9, scores increased 9 points).
Mathematics. The pattern of mathematics achievement for 9-, 13-, and 17-year-olds shows overall improvement, with early declines or relative stability followed by increased performance (Figure B). Further, the scores of 9- and 13-year-olds were significantly higher in 1996 than in 1973. As with science, mathematics scores have shown an upward trend at all ages since 1982 and the publication of A Nation at Risk. On average, the scores of 17-year-olds increased 8 points; those of 13-year-olds increased 5 points; and those of 9-year-olds increased 12 points.
Reading. The overall trend pattern in reading achievement is one of minimal changes across the assessment years (Figure C). The performance of 9-year-olds improved from 1971 to 1980, but has declined slightly since that time. However, in 1996, the average reading score for these students was higher than it was in 1971. Thirteen-year-olds showed moderate gains in reading achievement; in 1996, their average reading score was higher than that in 1971. There was an overall pattern of increase in reading scores for 17-year-olds, but the 1996 average score was not significantly different from that in 1971. Reading scores remained fairly stable between 1984 and 1996, the time period immediately following the release of A Nation at Risk. No significant changes at any age occurred during this time period.
Subgroup Performance on NAEP
Analyses of NAEP assessment data by race show how achievement gaps have been changing over time. In mathematics and reading, score gaps between white and black students aged 13 and 17 narrowed during the 1970s and the 1980s. Although there was some evidence of widening gaps during the late 1980s and 1990s, the score gaps in 1996 were smaller than those in the first assessment year for 13- and 17-year-olds in mathematics and for 17-year-olds in reading. Among 9-year-olds, score gaps in mathematics and reading have generally decreased across the assessment years, resulting in smaller gaps in 1996 compared to those in the first assessment year.
Since A Nation at Risk, performance in science has been increasing for white, black, and Hispanic students at ages 9, 13, and 17. At age 17, for example, average scores of white students increased 14 points from 1982 to 1996; for black students the increase was 25 points; and Hispanic students improved by 20 points. As a result of these increases, the gap between white and black students closed significantly (although it is still 47 points); the gap between white and Hispanic students also narrowed, though the change was not statistically significant (the gap in 1996 was 38 points).
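The gap arithmetic in the preceding paragraph can be checked with a short calculation. The sketch below is illustrative only. It assumes the gains and gaps quoted in the text are exact whole numbers (published scale scores are rounded, so any implied value could be off by a point), and the function name `implied_gap_change` is a hypothetical helper, not part of any NAEP tool.

```python
def implied_gap_change(reference_gain, comparison_gain):
    """Change in the score gap (reference group minus comparison group).

    A negative result means the gap narrowed over the period.
    """
    return reference_gain - comparison_gain


# Science gains for 17-year-olds, 1982 to 1996, as quoted in the text.
white_gain, black_gain, hispanic_gain = 14, 25, 20

# 1996 gaps relative to white students, as quoted in the text.
gap_1996 = {"black": 47, "hispanic": 38}

change = {
    "black": implied_gap_change(white_gain, black_gain),
    "hispanic": implied_gap_change(white_gain, hispanic_gain),
}

# Implied 1982 gap = 1996 gap minus the change over the period.
# These are derived values, subject to rounding in the published figures.
gap_1982 = {group: gap_1996[group] - change[group] for group in gap_1996}

for group in gap_1996:
    print(f"white-{group} gap: change {change[group]:+d}, "
          f"implied 1982 gap {gap_1982[group]}, 1996 gap {gap_1996[group]}")
```

Under these assumptions, the white-black gap narrowed by about 11 points and the white-Hispanic gap by about 6 points over the period.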
Average mathematics scores of white, black, and Hispanic students have also increased since 1982. For 17-year-olds, for example, white students improved 9 points; black students improved 14 points; and Hispanic students improved 15 points. The gap between white and black students narrowed between 1982 and 1990 but widened again through the 1990s, to 27 points in 1996. The gap between white and Hispanic students has narrowed somewhat since 1982, though the change was not statistically significant, and the gap stood at 21 points in 1996.
Changes in reading were minimal for white, black, and Hispanic students at all ages during the years 1982 to 1996. As a result, the gaps between white and black students remained about the same (in 1996 the gap at age 17 was 29 points). The gap between white and Hispanic students also changed little (in 1996 the gap at age 17 was 30 points).
In looking at subgroup performance in NAEP, it is particularly interesting to examine how gains made by subgroups over time can be masked by simple averages. Whenever the demographic balance among subgroups shifts, it can result in what is sometimes termed "Simpson's paradox" - which is illustrated by the NAEP long-term reading gains of 9-year-old whites, blacks, and Hispanics compared to the overall average gains (Figure D). Between 1971 and 1996, 9-year-old students' average performance in reading rose by 4 points on a 500-point scale. Yet average score increases for each of the subgroups - blacks, Hispanics, and whites - exceeded the overall average increase. Why? Blacks and Hispanics, the lowest-scoring subgroups, represented a greater share of the total population in 1996 than in 1971, which had the paradoxical effect of lowering overall gains even as each group's performance improved.
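The masking effect described above follows from how an overall average is weighted by each subgroup's share of the population. The sketch below uses made-up numbers, chosen only to show the mechanism; they are not NAEP results, and the group labels and the `weighted_mean` helper are hypothetical.

```python
def weighted_mean(groups):
    """Overall mean from {name: (population_share, mean_score)}."""
    total_share = sum(share for share, _ in groups.values())
    return sum(share * score for share, score in groups.values()) / total_share


# HYPOTHETICAL numbers chosen only to illustrate the effect; not NAEP data.
# Each entry is (share of the population in percent, average score).
earlier = {"higher_scoring": (85, 215.0), "lower_scoring": (15, 170.0)}
later = {"higher_scoring": (70, 221.0), "lower_scoring": (30, 180.0)}

# Gain within each subgroup between the two years.
subgroup_gains = {name: later[name][1] - earlier[name][1] for name in earlier}

# Gain in the population-weighted overall average.
overall_gain = weighted_mean(later) - weighted_mean(earlier)

print("subgroup gains:", subgroup_gains)
print("overall gain:", round(overall_gain, 2))
```

In this made-up case the two subgroups gain 6 and 10 points, yet the overall average rises by less than half a point, because the lower-scoring subgroup doubles its share of the population.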
Framework-based Assessments in Mathematics, Reading, and Science
In addition to, and separate from, the long-term trend assessments, NAEP also provides cross-sectional data based on grade-level student samples. These reports, called "The Nation's Report Card," involve more recently developed testing instruments. Instead of repeatedly using the same sets of questions and tasks necessary to generate trend data, the Nation's Report Card assessments are framework-based; that is, they reflect the best current thinking about what all children should know and be able to do. Each of these framework-based assessments is based on different sets of questions or tasks; therefore, the results from each cannot be directly compared.
Mathematics. The NAEP 1996 mathematics assessment continues the commitment to evaluate and report the educational progress of students at grades 4, 8, and 12. Like previous NAEP mathematics assessments in 1990 and 1992, the 1996 assessment uses a framework influenced by the Curriculum and Evaluation Standards for School Mathematics of the National Council of Teachers of Mathematics (NCTM). The 1996 framework was updated to more adequately reflect recent curricular emphases and objectives.
The framework characterizes the mathematics domain in terms of five content strands -- number sense, properties, and operations; measurement; geometry and spatial sense; data analysis, statistics, and probability; and algebra and functions. Across the five content strands, the assessment examines mathematical abilities (conceptual understanding, procedural knowledge, and problem solving) and mathematical power (reasoning, connections, and communication). The positive news is that national data from the 1996 mathematics assessment showed progress in students' mathematics performance on a broad front, as compared with both the 1990 and 1992 assessments.