Mixing Technology and Testing

Computer-based assessments lend flexibility, quick turnaround and lower costs, supporters say

By Alexander Russo

It didn’t take long for Catherine McCaslin, the evaluation and assessment specialist for the Beaufort County, S.C., public schools, to realize that moving to a computer-based testing program was going to have tremendous benefits.

During an early demonstration of the school district’s innovative touch-screen, audio-enabled testing program at a local school nearly six years ago, McCaslin saw firsthand that increased student motivation, among many other benefits, was one immediate result of mixing technology and testing.

“The response of these children and some parents who were viewing the testing to the [computer] test told me then and there that this was the only way to go,” McCaslin recalls. “Why not make testing fun? Whoever said that a test had to be boring and quiet and black and white to be a valid classroom assessment tool?”

While still far from widespread, computer-based testing is the choice of a small but enthusiastic band of districts like Beaufort County that are trying to improve cumbersome paper-and-pencil testing programs and integrate computers more completely into classroom curriculum and instruction.

Computer-based testing can provide flexibility, instant feedback, individualized assessment and eventually lower costs than traditional paper examinations. Computerized results create opportunities for teaching and assessment to be integrated more than ever before and allow for retesting students, measuring growth and linking assessment to instruction.

So far, district experiences have been mostly positive, and proponents predict that computerized testing will be widespread within just a few years. Some districts are designing and implementing their own programs focused on individualized diagnosis and improved instruction, while others are piloting state-led efforts to bring state assessments online. Others, such as the Meridian, Idaho, School District (see related story), are contracting with outside testing firms.

Gaining a Foothold

Like many other still-unfolding technology initiatives, computer-based testing takes several forms and falls under several different names, including online assessment, computerized testing, electronic testing and computer adaptive testing.

For several years now, computer testing has been creeping into other aspects of public life. Some states administer driver’s license exams electronically. Some employers use computer testing to screen job applicants. Already, millions of online tests are administered each year in the military, the private sector and postsecondary education and by professional certification groups.

Computer versions of college placement tests and graduate school exams such as the GRE, GMAT and the Test of English as a Foreign Language are all available via computer, as is the Educational Testing Service’s Praxis I for new teachers. An estimated 150 companies provide computerized testing programs of some kind, though few have a proven track record in K-12 education.

No single delivery form of computer-based testing exists. Sometimes the tests are housed on local servers or on the Internet, sometimes on diskettes or a hard drive. The source of test questions varies, too: some are classroom teachers’ own creations, while others are drawn from banks of state, national or proprietary test items. Some computer tests involve written responses that require keyboard use. Some give every student the same set of questions, while others adapt to each student’s responses, serving harder or easier questions as the test proceeds.
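The adaptive approach described above can be sketched as a simple loop: answer correctly and the next question comes from a harder pool, answer incorrectly and it comes from an easier one. This is only a minimal illustration, not any vendor’s actual algorithm; the item pool, difficulty scale and scoring here are all hypothetical.

```python
# Minimal sketch of a computer adaptive test: after each answer, the next
# question is drawn from a harder or easier difficulty level.
# The item pool and difficulty levels below are illustrative only.

import random

# Hypothetical item pool: difficulty level -> list of (question, answer) pairs
ITEM_POOL = {
    1: [("2 + 2 = ?", 4), ("3 + 1 = ?", 4)],
    2: [("6 x 7 = ?", 42), ("9 x 8 = ?", 72)],
    3: [("12 x 12 = ?", 144), ("15 x 15 = ?", 225)],
}

def adaptive_test(respond, num_items=4):
    """Run a short adaptive test; `respond` maps a question to the
    test taker's answer. Returns the final difficulty level reached."""
    level = 2                                        # start mid-scale
    for _ in range(num_items):
        question, correct = random.choice(ITEM_POOL[level])
        if respond(question) == correct:
            level = min(level + 1, max(ITEM_POOL))   # move to harder items
        else:
            level = max(level - 1, min(ITEM_POOL))   # move to easier items
    return level

# A simulated strong student who answers every question correctly:
answer_key = {q: a for items in ITEM_POOL.values() for q, a in items}
print(adaptive_test(lambda q: answer_key[q]))  # climbs to the top level: 3
```

Because each answer steers the next question, an adaptive test can home in on a student’s ability level with far fewer items than a fixed-form exam, which is the effect North Carolina reports later in this article.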

No large school district or state system has yet moved completely into computer assessment. But there are systemic efforts linked to state standards—much more than just putting practice tests online or an end-of-unit test. In some cases, these tests are linked to performance and accountability systems that affect how schools are rated.

Rapid Turnaround

What nearly all computer-based tests have in common is that scores and detailed reports are often available within hours or days, if not immediately at the end of the test session. This short turnaround time, perhaps more than anything else, is a compelling feature of computer testing. Nearly immediate test results allow teachers to adjust instruction and enable administrators to adjust courses and groupings of students.

Computer testing also addresses many drawbacks of current testing practices, which include scoring errors, lost mail, postage and handling expenses, diminishing classroom instruction time and the high costs of human scorers for written exams. Like a Polaroid camera whose prints develop in minutes, computer-based testing programs can provide near-instant gratification to superintendents, board members, teachers and parents who have grown increasingly frustrated at long waits for test results.

For teachers and board members, getting test scores back almost immediately is one of the most worthwhile features of computer-based testing, says Judy Jones, director of assessment in Cobb County, Ga. “When we are able to talk to them about the immediacy of the results, that really makes them say, ‘OK.’”

But speed is far from the only appeal. Other potential benefits include individualized reporting, flexible scheduling, shorter administration times, better student motivation and the hope of lower costs. Just imagine: No more test booklets to mail, no more mass interruptions of the school day to test whole grades at a time. No need to test students well before the end of a unit or semester.

Computer-based testing offers the promise of a highly flexible system with a minimum of classroom disruptions. In theory, computer tests can be administered anywhere there’s a computer, at any time of day. Special education accommodations such as audio, large type or extra time become simpler to provide. Once limited by cost and their cumbersome nature, digital portfolios, lab simulations and problem-solving activities become easy options. Even essays and open-ended questions can be scored by computer (see related story). A decrease in scoring errors and an increased speed in test-score reporting are two other key advantages.

Statewide Exams

Educators are using computer-based testing for various purposes, divided into two main camps. Some educators, mostly at the state level, look to computer-based testing as a faster way to deliver and score pencil-and-paper assessments. Many at the district level see it as a way to improve diagnostic testing and to explore skills and abilities not readily addressed on most existing tests.

The most straightforward use of computer testing is for traditional performance and accountability functions. Thus far, a small but increasing number of states, including Georgia, North Carolina, Oregon, Pennsylvania and Virginia, have taken steps toward implementing computer-based testing systems for this purpose. Next on board could be Florida, Kansas, Kentucky, Maryland, Michigan and Utah.

In January, Idaho announced plans to administer fall and spring exams electronically in mathematics, reading and language arts. Immediate results for students, teachers and administrators were key to the state’s decision. Still to be resolved, though, is whether the innovative testing plan would meet federal testing requirements, which are linked to millions of dollars in federal aid.

Perhaps the most widely publicized example is taking place in Oregon, which already has begun migrating its statewide exams in grades 5, 8 and 10 to computer-based testing. First piloted in 2000-2001, the linear (or “fixed-form”) multiple-choice exam was administered by computer in 300 districts this year, and 700 more will take the computer version next year.

In Virginia, 10 districts participated last spring in the state’s first-ever computer administration of the statewide accountability exam. By 2003, 100 percent of its Standards of Learning tests will be online. For its younger students, Virginia also is developing a computer adaptive measure of algebra readiness that will be unveiled this spring and soon administered to 200,000 students.

In Georgia, roughly 10 percent of students in grades 1-8 will take the state’s criterion-referenced tests online this spring. Considered by some to have one of the most advanced computer-based testing (CBT) models, Georgia is also planning to put at least eight high school tests online within the next few years and may include adaptive components in the future.

Since 2000, North Carolina has implemented an adaptive version of its state reading and mathematics assessments for special education students who would otherwise not have participated in the program and is considering opening the CBT program to mainstream students. Administered more than 22,000 times last year, the tests span a wide range of ability levels and need only 25 questions to produce a score, instead of the usual 80.

Many of these programs are the result of leadership from the top and a comprehensive view of technology in education, says Randy Bennett, a computer testing expert at the Educational Testing Service. “Testing is a piece in most of these cases of a larger vision that has to do with giving students and teachers a way of using technology. … And, by the way, we can do testing in addition.”

Increasing state and federal requirements for annual testing are also one of the main motivators for computer-based testing, says Adam Newman, director of research for the Boston-based research firm Eduventures. “States are going to have to begin to move toward much more frequent testing. That will accelerate peoples’ interest in or willingness to consider CBT.”

Annual testing in reading and math in grades 3-8 is required for all states under the new federal education law, “No Child Left Behind.” In addition, the law mandates that states provide disaggregated performance data and promptly report state, local and school performance data. To help pay for the increased costs, Congress has appropriated $387 million for state testing costs next year. The new law also requires state participation in the National Assessment of Educational Progress.

ETS’s Randy Bennett agrees that the new law will push educators further toward CBT. “Paper tests can't be scored quickly or inexpensively when they consist even partly of the types of tasks kids do in the classroom (i.e., essays and other sorts of open-ended problems),” he says. “CBT can (in principle) help with those problems, making the scoring of open-ended tasks (as well as multiple-choice items) faster and cheaper.”

Improving Diagnosis

Other states and a handful of school systems are using computer-based testing for diagnostic and curricular purposes at the classroom and district levels. Many districts already have limited experience with older forms of this type of computer-based testing, either through standalone tests and simulations or through assessments embedded in larger computer-aided instruction packages. The main difference here is that the tests are more closely linked to district or state standards.

Over the past six years, the 17,300-student Beaufort County, S.C., district has developed its own home-grown computer-testing program, featuring touch-screen technology and audio elements for students still learning to read. The Beaufort program tests students in English, math and science, and at its height was being used by nearly every teacher in every school.

To help make the computer testing program part of teachers’ classroom programs, the district has recruited teachers to help write test questions and ensured teachers had easy access to the testing data that they generated. “This is a classroom assessment tool, not a district-level accountability system,” says McCaslin, Beaufort’s assistant superintendent.

Under the leadership of Superintendent Frank Holman, the 2,300-student Arkadelphia, Ark., Public Schools has implemented an adaptive computer testing program using several funding sources, including Title I and funds generated through a $7 million bond campaign. Using early test results, Arkadelphia is targeting struggling students for additional instruction and afterschool tutoring.

“I saw the [computer-generated] reports at a session and knew that this was the answer for our teachers,” remembers Holman. “A way to have the classroom professional access data for decision making for all children.”

The Bloomfield, N.M., district, with its 3,300 students, has been using frequent online testing for more than two years. By testing throughout the year and merging assessment with professional development, “I know through the year how these kids are doing and how I need to change my practice,” says Superintendent Harry Hayes. Given a highly mobile student enrollment, timely results are all the more important, he adds.

While teachers are given a choice, “the lab is so much more efficient,” says Hayes. “You bring the students down there, use ‘x’ minutes and then it’s completed. The results are immediately available.”

More than 375 school districts and the state of Idaho have adopted CBT programs developed by the Northwest Evaluation Association. More than half of the districts NWEA works with, along with more than 75,000 students in 100 Edison schools, participate in a monthly online testing program. South Dakota, one of the most wired states, and Georgia are implementing CBT programs with strong diagnostic functions. In 2000-2001, South Dakota administered almost 20,000 online tests.

District Experiences

Most districts are enthusiastic about their CBT programs, though getting them to work has not been without its challenges.

In Beaufort County, for example, the struggle has been to maintain the program as technology shifts from standalone computers to the Internet.

“When our first computer-based pilot was launched, we had not yet gone to networks,” says McCaslin. “The testing system was used on standalone computers. We actually trucked 100 full workstations (computer, touch-screen monitors, peripherals, wiring) to groups of schools for a week at a time, had every K-2 child tested in reading and then again in math, load the equipment at the end of the week, move on to the next group of schools. We did this for four weeks.”

Beaufort has struggled to increase teacher use of the system because of various workstation incompatibility problems associated with its desktop design.

In Pennsylvania’s first two pilot tests of a computer-based writing exam, participating districts had problems with outdated browsers, network and bandwidth limitations and technological inexperience of teachers and local computer support staff.

Georgia districts that participated in early field trials last year encountered a number of glitches, including mislabeled items in the bank of test questions developed by the state for teachers to use in classroom assessments, according to Cobb County’s Jones, whose county sent three schools to participate in the pilot and has used a computerized diagnostic reading program for several years. “At that point, it really wasn’t operational,” she says.

During its first year, the North Carolina program also ran into a series of problems, even though it has become popular with teachers as a way to include special education students in the testing. Measurement items on the math test were problematic because of variations among computer monitors across the state, says Mildred Bazemore, one of the architects of the plan. Server and firewall problems made for a frustrating experience for some students, adds Lou Fabrizio, the state’s accountability director.

But not all field tests go poorly. At first, students in Ypsilanti, Mich., breezed through the computerized tests without even reading the questions, says Noni Miller, director of educational services for the district. But then when they saw their scores at the end they wanted to retake the test. “If you’ve ever seen a child take an online assessment, they’re 10 inches from the screen. They’re into it,” she says. “Many kids take tests but they don’t own the tests. Now, the kids are starting to take our tests a little bit more seriously than before.”

In Hanover County, Va., there were relatively few bugs. “As it turns out, the contractor that has received the statewide contract was the one who was present in our school,” says Stewart Roberson, superintendent. “It makes a great difference if you have a sensitive tech contractor.”

And in Bloomfield, N.M., superintendent Hayes credits computer-based testing for recent improvements in student achievement in his district as well as Blue Ribbon School status for an elementary school. “Because the computer-based test is a frequent and easy means of checking learning and guiding teaching, we see more improved results and efficient use of time.”

Clarity of Purpose

For all of its obvious appeal and potential, the further spread of computer-based testing faces some difficult, though not insurmountable, challenges. Most experienced administrators and national experts agree that, if you’re thinking about bringing in a CBT program or piloting one being adopted by your state, you need to consider a few things first.

“The first thing that anybody has to think about is the purpose,” says John Olson of the Council of Chief State School Officers, who coordinates a large-scale assessment task force. Major differences exist between programs set up for diagnostic purposes and those used for accountability, as well as between linear and adaptive tests. Some mixing and matching is possible, but, says Olson, “You need to be really clear about how you’re using it.”

Another obvious factor to consider before embarking on an online testing program is how sensitive the whole issue of testing might be in your community—and how online testing might exacerbate concerns about fairness, access, accuracy, privacy and validity. Some research suggests that CBT gives an advantage to some students more than others. Comparability is an issue with any new test, but especially so with adaptive tests in which no two students are given the same set of questions.

With a sufficiently large item pool, adaptive tests can be administered frequently without the security concerns of a linear or paper-and-pencil test, explains David Harmon, director of research, evaluation and testing for the Georgia Department of Education. “The downside is that you’re going to have to explain to parents that their students all had different tests,” he says. This may also be an issue at the federal level, given the new law’s requirement that students in each grade take the same test.

Like sampling districts instead of counting every person on the census, adaptive testing violates the tradition of giving every student a chance at every question.

These are some of the reasons that the National Assessment of Educational Progress, the ACT and SAT, Advanced Placement and other high-stakes tests are not yet available online. Testing is “such a politically sensitive topic with such dramatic ramifications for students, administrators, districts—everyone involved,” says Newman, research director with Eduventures. “CBT raises the anxiety level for everyone to another level.”

Test security and privacy issues are a particular concern when it comes to mixing exams with computers and the Internet, though concerns about Internet security are exaggerated in most cases, according to ETS’s Bennett. While no major incidents have taken place in K-12, perhaps the best-publicized cautionary tale came in the mid-1990s, when the online adaptive version of the Graduate Record Exam had to be suspended temporarily after it was revealed that too small a pool of test items had allowed test takers to memorize enough questions to affect their results.

It’s also important to consider whether the money spent to bring in computer-based testing is a top priority. While some states provide funds or equipment to districts to help implement CBT, a certain amount of the cost is going to be borne at the local level. “I’m not sure everybody realizes what it costs to do a complete system, from soup to nuts,” says CCSSO’s Olson, adding that the costs are certain to exceed those of a paper-based system. “Over the long run, the cost savings will be realized and you’ll be better off,” he says. “But the startup costs are huge.”

Superintendent Harry Hayes in Bloomfield, N.M., says it’s important to realize “computer-based testing is only a means to an end.”

Last but not least, nearly everyone agrees that moving to computer-based testing is a thing best done in small steps. “Everybody’s learning a lot when they pilot,” Olson says. “They’re learning about the capabilities and lack of capabilities of the system—from the phone lines to the fiber optics to the wiring in the schools or the types of computers.”

“As sure as night follows day, some states and districts will get into trouble because they didn't plan carefully enough, they moved too quickly or they started with the wrong type of test,” says Bennett.

Last year’s pilot in Virginia was quite modest in scope, according to Hanover County’s Roberson, usually including just one high school in each pilot district. It enabled faculty to learn about the program while the state auditioned candidates for the statewide contract. The tests did not count formally in terms of student promotion or school performance. Eventually they will make a tremendous difference, Roberson believes.

“Children and their parents must have real-time results, enabling them to at least retake a test in the immediate timeframe if necessary or turn on a dime to make other curricular decisions about their high school journey in a timely fashion.”

Future Directions

Some observers expect use of computer-based testing to rise steeply over the next few years.

Already on the horizon are computer programs that can electronically score essays and short-answer responses, a function currently in use with the Graduate Management Admission Test and being piloted in states like Pennsylvania.

Infrared communication and handheld personal digital assistants are another cutting-edge approach being tested in some districts. Cobb County, Ga., is looking at piloting an infrared cart and handheld PDAs at a couple of schools later this year, says Judy Jones, director of assessment for the district. Some experts predict that the PDA-based approach to computer testing is extremely promising, especially with recent advances in handwriting recognition software.

At some point, other tests may be available in computer form. NAEP already has completed its first field test of a CBT in mathematics and is planning two others in reading and problem solving. Early last year, 13 high schools nationwide piloted a computerized version of the SAT. An adaptive version of the SAT already is used to give more than 8,000 talent search tests a year. And according to at least one source, online versions of other commercial examinations are in development.

Perhaps most importantly, more information is on the way to help administrators figure out what others are doing and how to proceed wisely. Thus far, most of those involved in CBT have been relying on vendors or anecdotal information to guide their decisions. Virginia is credited with developing one of the first and most thoughtful requests for proposals before it began its CBT program. And sometime this spring, NAEP is planning to issue its first report on the online math test it studied last year.

The Education Commission of the States recently posted a large amount of helpful information on its Web site. CCSSO soon will release its survey on computer-based testing in the states. And the Appalachia Educational Laboratory, a federally sponsored research organization that hosted a conference on CBT in late 2000, has recently developed a set of critical questions to guide decisions and development related to computer-based testing.

However, other observers suggest that a wait-and-see attitude will prevail, and they question just how strong interest in computer-based testing really is—especially in a large-scale, high-stakes context. One concern is that while testing requirements will likely continue to increase, funding to pay for frequent testing and a migration to CBT may not be sufficient. Another concern is that educators may want simpler, lower-stakes forms of computer tests—just as consumers wanted the simple PalmPilot instead of fancier models—rather than going straight to large-scale, high-stakes programs.

“The question becomes: Are some of these technologies and ideas a little early, a little too far ahead of market demand?” says Eduventures’ Newman, who is one of many who predict that smaller, lower-stakes efforts are likely to dominate in the near future.

But for many of those like Hanover County’s Roberson, who participated in the Virginia pilot study, the verdict is already clear. “We would do it again in a heartbeat.”

Alexander Russo is a Chicago-based education writer. E-mail: AlexanderRusso@aol.com