WASHINGTON — In the first large-scale analysis of new systems that evaluate teachers based partly on student test scores, two researchers found little or no correlation between quality teaching and the appraisals that teachers received.

The study, published Tuesday in Educational Evaluation and Policy Analysis, a peer-reviewed journal of the American Educational Research Association, is the latest in a growing body of research that has cast doubt on whether it is possible for states to use empirical data in identifying good and bad teachers.

“The concern is that these state tests and these measures of evaluating teachers don’t really seem to be associated with the things we think of as defining good teaching,” said Morgan Polikoff, an assistant professor of education at the Rossier School of Education at the University of Southern California. He worked on the analysis with Andrew Porter, dean and professor of education at the Graduate School of Education at the University of Pennsylvania.

The number of states using teacher-evaluation systems based in part on student test scores has surged during the past five years. Many states and school districts use the evaluation systems to make personnel decisions about hiring, firing and compensation.

Thirty-five states and the District of Columbia require student achievement to be a “significant” or the “most significant” factor in teacher evaluations. Most states use “value-added models,” or VAMs, which are statistical algorithms designed to estimate how much teachers contribute to their students’ learning while holding constant factors such as demographics.

Polikoff and Porter analyzed a subsample of 327 fourth- and eighth-grade mathematics and English-language-arts teachers across six school districts: New York; Dallas; Denver; Charlotte-Mecklenburg, North Carolina; Memphis, Tennessee; and Florida’s Hillsborough County.

The researchers found that some teachers who were well-regarded based on student surveys, classroom observations by principals and other indicators of quality had students who scored poorly on tests. The opposite also was true.

“We need to slow down or ease off completely (on teacher evaluation models) … so we can get a sense of what do these things measure, what does it mean,” Polikoff said. “We’re moving these systems forward way ahead of the science in terms of the quality of the measures.”