Expecting Too Much of Performance Pay?

A study of four school districts uncovers the messy reality of adding merit bonuses to the salary scale

By SUSAN MOORE JOHNSON AND JOHN P. PAPAY

It makes sense to pay people for how well they do their work, to separate the strivers from the slackers and those who deliver from those who don’t. Many school reformers who share this view are convinced that well-placed dollars will attract ambitious recruits, motivate current teachers, reward those who succeed and reassure the public that their education dollars are being well spent. Carefully designed and implemented pay reforms may well improve public education, but it remains an open question whether the incremental strategies that reformers are promoting today can have such far-reaching benefits.

Susan Moore Johnson is the Jerome T. Murphy professor in education at the Harvard Graduate School of Education in Cambridge, Mass.

Pay for performance is not a new idea, and reformers should not ignore the dismal record of merit pay over the past century. Initially adopted with a flourish of expectations during several waves of popularity in the past, every plan eventually fell into disuse. These plans proved to be unexpectedly costly and cumbersome to run. They often generated cynicism and unproductive competition among teachers.

In part, these approaches also fell short because school administrators were responsible for choosing the best teachers. When the winners were announced, many teachers who considered themselves more deserving complained about favoritism and rejected the process as rigged. Teachers also didn’t know what they would have to do to earn an award. Because of these issues, the plans failed to produce their intended results.

Recent developments, though, may change the prospects of using pay to reward teachers’ performance. First, widespread standardized testing and new value-added methods for analyzing test-score data promise to provide more objective evidence for school officials seeking to reward teachers for the quality of their work.

Second, some school districts have adopted standards-based approaches to staff evaluation, which specify the components of effective teaching practice. These standards-based instruments guide evaluators as they visit classes and assess teachers’ work. In districts that provide adequate training and time, standards-based evaluation has increasingly gained acceptance as a meaningful and even-handed process.

Finally, over the past decade, a new generation of teachers has brought into schools a view of pay that differs from the one held by their veteran colleagues. These new teachers appreciate the certainty and stability of the standardized salary scale but also want to be recognized and rewarded financially for what they accomplish in the classroom.

Plan Designs
Given these developments in methods and attitudes, we can ask whether today’s pay reform plans will fare better than those of the past. To better understand the experiments under way, we reviewed programs from across the country that pay teachers for how well they teach and what their students achieve. We studied four school districts’ programs in more detail — Houston, Minneapolis, Charlotte-Mecklenburg, N.C., and Hillsborough County, Fla.

These four school districts’ pay plans, and indeed virtually all performance pay plans currently in use, attach performance bonuses to the single salary scale rather than replacing it. They still primarily reward teachers in a steps-and-lanes scale for loyalty to the school district and for pursuing advanced courses and degrees. However, these plans also offer opportunities for teachers to earn more money based on their performance. They differ in how they assess teachers’ performance and how they award the bonuses.

•  Test scores or classroom observations. All the school districts we studied used test scores as one way to compare teachers’ performance. Houston, which makes more than $10,000 in bonuses available, rewards teachers only for their students’ test scores. These approaches, which use value-added methods to compare teachers’ success, are attractive because they directly measure student learning, which all the interested parties care about.

However, these approaches have some shortcomings. They measure only a part of what we expect teachers to do. That is, they can tell us whether a teacher succeeded in teaching ratios, but not whether the teacher’s students became better citizens. As a result, using test scores to award bonuses may divert teachers’ attention from important, though untested, responsibilities. Further, it may encourage teachers to “teach to the test,” raising students’ test scores without actually increasing their learning.

Also, because standardized test scores are available only for some teachers in some grades and subjects (generally elementary teachers of math and English language arts), different selection methods must be used if the rest of the district’s teachers are to be included in the pay plan. Moreover, school districts must have sufficient capacity to manage and analyze the data, which can be a major undertaking.

Another shortcoming stems from evidence that, although they are statistically complex, value-added methods do not account sufficiently for the fact that students are deliberately, not randomly, assigned to classes. Some teachers may get the students who make the most progress and, thus, win the awards while their peers, assigned students who progress more slowly, do not.

Early experiences with value-added approaches suggest that teacher ratings tend to be unstable over time, so a teacher deemed highly successful one year may appear to be below average the next. In Houston, one teacher reported that, without changing his teaching practice, he earned no bonus one year and a $7,590 bonus the next. As a result of such inconsistencies, teachers called several programs we studied “lotteries.” Although they played them in the hope of winning extra pay, the teachers didn’t accept them as true assessments of their classroom performance. Importantly, teachers said they did not know what changes they had to make in their teaching to earn a bonus.

An alternative to using value-added methods to analyze test-score data is to assess teachers’ work in the classroom with standards-based evaluation instruments, which can provide richer, more detailed assessments of their work. Hillsborough County combines test-score measures with ratings on teacher evaluations to determine who earns their awards. Importantly, observations with detailed feedback can help teachers understand what they need to improve, which value-added methods never can.

However, standards-based evaluations are expensive, requiring money and time to train evaluators. Moreover, even with training, many principals fail to complete evaluations, whatever the approach. Thus, if done poorly, classroom evaluations, even those that are standards-based, will yield little information about teachers’ performance. 

•  Competitive or standards-based awards. Some programs rated teachers or schools competitively and rewarded the top performers. Hillsborough County ranked all teachers in their grade and subject areas and in 2006-07 paid a $2,100 bonus to the top 37 percent of teachers. Programs that set quotas on the number of bonuses they award promote competition, which some see as a good thing and others say is unhealthy because schools depend on teachers’ helping each other.

Alternatively, awards can be made to all teachers or schools that achieve a certain level of success. Everyone can win with this approach, and collaboration is not compromised. However, the plan must set a high performance standard if it is to change teachers’ behavior. Furthermore, whereas the expense of a competitive ranking plan is predictable for district administrators, standards-based plans are open-ended and costs are difficult to estimate in advance. To cope with this problem, Charlotte-Mecklenburg has limited its financial risk in a standards-based plan by dividing a fixed award budget among all schools that meet the standard.

•  Individual or group awards. Planners also must decide whether to grant awards to individuals or to groups of teachers (the grade level, department, cluster or school). From a practical standpoint, individual awards are more targeted and provide more powerful incentives for change, while group awards are more diffuse, allowing some teachers to ride freely on the hard work and accomplishments of others. The public, who know from personal experience that some teachers are much better than others, tend to favor individual awards. Many reformers promote pay for performance because they expect weak or lazy teachers to get discouraged and leave when they fail to win awards.

Despite the attractiveness of individual awards, group awards more accurately match the reality of schooling. Because students move from class to class and grade to grade, it’s virtually impossible to attribute a student’s achievement in any subject at any time to a single teacher. Even in many elementary schools, teachers switch students during the day, teaching the topics they know best. One might teach science, another reading, thus allowing students within a grade to get the best of both teachers’ knowledge and expertise. A plan geared to the group recognizes and rewards these teachers’ collaborative efforts and achievements, whereas an individualized plan would not. Group awards also prevent competition among teachers that could result from individual awards.
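The budgeting difference between the competitive and standards-based designs described above can be made concrete with a small sketch. It is an illustration only, not any district's actual formula. The function names, the sample scores, the rounding-down of the quota and the equal split of the fixed pool are all assumptions; the article says only that Hillsborough County paid the top 37 percent within grade and subject, and that Charlotte-Mecklenburg divided a fixed award budget among schools meeting the standard.

```python
from collections import defaultdict
from math import floor


def competitive_awards(teachers, top_share, bonus):
    """Quota design: rank teachers within each (grade, subject) group and
    pay a flat bonus to the top share of each group.

    teachers: list of (name, grade, subject, score) tuples (hypothetical data).
    Rounding the quota down is an assumption made for this sketch.
    """
    groups = defaultdict(list)
    for name, grade, subject, score in teachers:
        groups[(grade, subject)].append((name, score))

    awards = {}
    for members in groups.values():
        ranked = sorted(members, key=lambda m: m[1], reverse=True)
        n_winners = floor(len(ranked) * top_share)
        for name, _score in ranked[:n_winners]:
            awards[name] = bonus
    return awards


def fixed_pool_awards(school_scores, standard, budget):
    """Standards-based design with capped cost: every school that meets the
    standard wins, and a fixed budget is divided among the winners.

    An equal split is assumed here; the actual division rule is not
    described in the article.
    """
    qualifying = [s for s, score in school_scores.items() if score >= standard]
    if not qualifying:
        return {}
    share = budget / len(qualifying)
    return {school: share for school in qualifying}


if __name__ == "__main__":
    # Ten hypothetical fourth-grade math teachers with made-up scores.
    teachers = [(f"T{i}", 4, "math", 50 + i) for i in range(10)]
    winners = competitive_awards(teachers, top_share=0.37, bonus=2100)
    print(sorted(winners), sum(winners.values()))

    # Three hypothetical schools; two meet the standard and split the pool.
    schools = {"A": 0.80, "B": 0.60, "C": 0.90}
    print(fixed_pool_awards(schools, standard=0.75, budget=100_000))
```

Under the quota design, the district's cost is known once the number of teachers is known, however well they perform. Under the fixed-pool design, total cost is also capped, but each winner's award shrinks as more schools meet the standard, which may weaken the incentive the standard was meant to create.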

Adopting Reforms
No sure answers exist to these design decisions. Each choice presents opportunities and challenges. As a result, the process of creating and adopting pay reforms is harder than many expect.

Pay reform in K-12 education today is being driven by many pressures, both internal and external, and planners should be clear about what they hope to achieve. A district that is primarily concerned with increasing public approval might choose a plan that makes competitive awards to individuals based on their test scores. However, if a district hopes to build instructional capacity in all schools, it might better adopt a standards-based plan, using both evaluations and test scores to make team-based awards.

However, a district’s resources and current capacity also matter. If funds are short and awards are small, teachers may be insulted by the pay initiative and reduce their effort rather than work harder. If districts lack evaluations for all teachers or if those evaluations are conducted haphazardly, then districts cannot fairly use them as the basis for awards. Conversely, if districts lack the data or expertise they need to generate value-added results, it would be unwise if not impossible to reward individual teachers for improving students’ test scores.

Furthermore, if the reforms are truly to influence what teachers do, they must win wide acceptance. Union leaders in these districts tended to welcome the additional income for their members. Although they sometimes doubted that the plans could improve the schools, they were willing to work with management and give pay-for-performance initiatives a try. The Minneapolis program, like the ProComp program in Denver, benefited because teachers were involved in the program’s design. Teachers reportedly appreciated the additional income that the plans provided, even if they, as individuals, did not receive it.

In fact, the reality we found in these districts was messier than we expected, suggesting they were not relying on a single, comprehensive strategy. Most districts sponsored several programs with multiple goals and approaches. They had seized on funding opportunities as they became available. However, each source of funding came with different priorities and requirements, which were folded into the district’s pay options. When Florida announced its Merit Award Program, the Hillsborough County teachers’ union and district administration saw it as a way to increase teachers’ salaries. Because the state required it, they developed a program that included student test scores as a major factor in rating a teacher’s performance. Having multiple plans also suggested the districts were hedging their bets, investing in several portfolios in the hope that at least one would pay off. For example, Minneapolis provides teachers with professional growth credits that can lead to permanent salary increases for a wide range of measures, including individual performance evaluations, student test scores and school-based measures.

These school districts gave teachers several chances to win, perhaps hoping to keep the peace and retain their commitment. In addition to awarding bonuses to a few outstanding individuals, some districts grant group awards to a larger number of teachers. It is hard to assess the effects of these mixed plans. Combined, differing strategies might pay off or they might work at odds with one another, in the end achieving little.

We do not know whether these programs improved teachers’ practice or students’ performance, but we did learn some things about how teachers responded. Teachers’ annual pay often included both base salary and several other awards for different individual or group actions and accomplishments. In some cases, the additional pay was substantial, as much as $10,000. With so many options, each having different odds and payoffs, teachers might focus their attention on how to maximize their pay overall, rather than how best to teach their students or improve their school.

Unrealistic Expectations
The school districts we studied often faced external political pressures or had funding opportunities that required them to move quickly. As a result, these plans were not as successful as they might have been had the districts defined their purposes more clearly, tailored coherent plans with incentives aligned to achieve those purposes and built support through a collaborative design process. More time and less pressure would have helped.

However, pay reformers also continue to have unrealistic expectations about what incremental bonuses can do. Like their predecessors in earlier decades, most reformers conceive of pay in a narrow, mechanistic way, as a single lever with extraordinary powers to attract able teachers, extract effort, improve skills and promote growth. Unless the awards are substantial and teachers have a clear understanding of what they need to do to earn an award, today’s bonuses are unlikely to have such effects.

Instead, the money spent on payoffs for individual successes might better be invested in developing teachers and schools. Funds might be used to improve teachers’ knowledge and skills or build increased instructional capacity at the school level. Experience tells us that neither of these is easy to do well, and that large sums of money are wasted on irrelevant courses and misused planning time for teachers. However, aiming for more comprehensive reform is likely to have greater payoff for students than continuing to develop isolated bonuses that reward individual teachers, however attractive those approaches seem to be.

Susan Moore Johnson is the Jerome T. Murphy professor in education at the Harvard Graduate School of Education in Cambridge, Mass. E-mail: susan_moore_johnson@gse.harvard.edu. John Papay is a research assistant with the Project on the Next Generation of Teachers at Harvard. Their article is drawn from their co-authored book Redesigning Teacher Pay: A System for the Next Generation of Educators (Economic Policy Institute).