Punchback: Answering Critics

The Myopia of Testing Basic Skills Alone

by Richard Rothstein

Americans want students with proficiency in math and reading. But schools also should teach citizenship, social responsibility, cooperative problem solving, a work ethic and appreciation of the arts. And youth should leave school in good physical and emotional health, with habits and knowledge to sustain that health into adulthood.


Yet we now evaluate schools by test scores of basic skills alone. If schools are accountable only for these measures, they will focus greater effort on raising them, giving short shrift to other goals of a balanced education.

Some say it is too difficult to assess performance in anything other than basic academic skills.

This turns out to be a poor excuse. The nation once measured performance in a broad range of knowledge and skills. The effort was abandoned, not because it was too difficult but because, in the 1970s, Congress and the president considered it too expensive.

A Broader Examination
In the early 1960s, the federal government asked testing experts to create the National Assessment of Educational Progress, also known as the “Nation’s Report Card.” Led by curriculum reformer Ralph Tyler, the experts vowed to avoid distortions that would inevitably result from testing basic skills alone.

Tyler, who chaired the Carnegie Corporation-funded committee that developed NAEP, earlier had written that although exams can assess some educational objectives, others are “more easily and validly appraised through observations of children under conditions in which social relations are involved.” Assessment also should include a collection of actual student productions, such as paintings or essays. If a school’s reading program aimed to develop mature interests, evaluators should examine what books students checked out of libraries and inquire whether they read newspapers.

NAEP was designed to test only a representative sample of students because a sample can disclose how students generally perform. Because NAEP aimed to assess many broad domains, no student was given many exercises. National (and later, state-level) results would be based on combining many students’ answers on different items.

NAEP began testing in 1969. Its sampling philosophy is still in place today, but its ambitious topic coverage has been forgotten.

For example, to gauge whether students were taught to cooperate in small groups, early NAEP sent trained observers to work with 9-year-olds. In teams of four, students were offered a prize of crayons or yo-yos to guess what object was hidden in a box. The students could ask yes-or-no questions; two teams competed to identify the object first. Team members had to agree on each question asked; the role of posing questions publicly was rotated.

NAEP rated students on whether they suggested questions to ask, gave reasons for their viewpoints or otherwise helped teams to succeed. When the government then reported on the percentage of 9-year-olds who performed satisfactorily, the public gained understanding of whether schools were teaching cooperative problem solving.

For 13-year-olds, NAEP presented groups of eight with a dozen issues about which teenagers typically had opinions, such as whether they should have strict bedtime limits or be allowed to watch adult movies. NAEP asked students to reach consensus and write a recommendation to resolve two such issues. Assessors observed these groups, rating whether students took clear positions, gave reasons for points of view, helped organize internal procedures, defended the right to hold contrary viewpoints and consistently remained on task. NAEP then reported only 4 percent of 13-year-olds defended the right of another to hold a different opinion, while only 6 percent were willing to defend their own viewpoints in the face of opposition. This was a clear warning to educators that they were falling short in teaching such traits.

Misdirected Spending
NAEP also used paper-and-pencil tests to assess character. In the 1970s, when racial segregation was still common, NAEP assessed students’ citizenship traits by asking adolescents what they would do if they saw other children being barred from a park because of their race. Correct answers included reporting the incident to parents, teachers or civil rights organizations, or writing a letter to newspapers. In this case, 82 percent of 13-year-olds gave acceptable illustrations of actions they might take.

Early NAEP was filled with such examples, efforts of the Nation’s Report Card to assess school results in all curricular areas. It was expensive; it took time and money to train personnel to observe student behavior and ensure different assessors gave similar ratings to similar performances.

We actually spend a lot more on NAEP now, in constant dollars, than we did 30 years ago. New money has gone into ever more sophisticated testing of basic skills. None has been returned to NAEP’s early effort to assess a broad range of cognitive and noncognitive goals.

Our failure today to assess student performance for all the knowledge and skills we expect schools to teach does not result from a shortage of funds or ignorance of technique. It results only from a failure of our vision.

Richard Rothstein is a research associate of the Economic Policy Institute in Washington, D.C. The column is adapted from his book Grading Education: Getting Accountability Right (Teachers College Press, 2008), co-authored with Rebecca Jacobsen and Tamara Wilder. E-mail: riroth@epi.org