BOOTHBAY HARBOR – Evaluating teachers by testing their students is an appealing idea on rational grounds. It makes good sense to rate teachers by how much their students learn, as several recent letters and columns have argued. In practice, however, it is fraught with pitfalls, for two reasons:

• First, there are few, if any, tests adequate to the task. What we want to measure is not only the students’ in-depth knowledge of the subject matter and the skills involved, but also — and more important — their devotion to the subject and their desire to continue learning it for the rest of their lives.

If in studying mathematics, for example, a student learns only to hate it, a great disservice has been done no matter how high his or her test scores are. I have always shunned the study of languages, and even foreign travel, as a result of a sadistic Latin teacher I had early in high school.

But after 36 years of teaching and directing research at Educational Testing Service in Princeton, N.J., I don’t know of any objective tests of such outcomes that are feasible for classroom administration.

Interviews with students and evaluation of portfolios assembled by students provide useful information about the students’ attainment. Few schools, however, can afford such procedures on a large scale.

• Second, discerning how much impact a teacher has had on a classroom of students requires measurement at both the beginning and the end of their association, typically in September and again the following May or June.

Judging a teacher on the basis of one set of scores obtained early in the year is grossly unfair. These student scores are the product of a tsunami of variables, including previous schooling, genetic factors, and — especially — home background, as pointed out in an excellent recent letter.

A classic study — a reprint of which is lost in my files — found that the largest correlate of students’ test scores was their mothers’ educational level. Waiting until the end of the year is better, but how do you know whether high scores reflect what the students started the school year with or the result of a year of good teaching?

This suggests using gain scores, which show how much students improved from the beginning of the year to the end. This is better than one-shot testing, although a caveat is necessary: For statistical reasons, gain scores tend to be unreliable and are difficult to interpret.
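
One way to see the statistical problem is a standard result from classical test theory, sketched here under the simplifying assumption that the fall and spring tests have equal score variances. The function name and the example numbers are illustrative assumptions, not figures from any particular study. When the two tests correlate highly with each other, as they usually do, the gain score can be far less reliable than either test alone.

```python
def gain_score_reliability(rel_pre, rel_post, corr_pre_post):
    """Reliability of the difference (post - pre) under classical test theory.

    Assumes the pre-test and post-test have equal score variances.
    rel_pre, rel_post: reliabilities of the two tests (0 to 1).
    corr_pre_post: observed correlation between the two sets of scores.
    """
    if corr_pre_post >= 1.0:
        raise ValueError("correlation must be below 1 for gains to vary at all")
    mean_reliability = (rel_pre + rel_post) / 2.0
    return (mean_reliability - corr_pre_post) / (1.0 - corr_pre_post)


# Hypothetical values: two fairly reliable tests whose scores correlate 0.80.
example = gain_score_reliability(0.90, 0.90, 0.80)
print(round(example, 2))
```

With these hypothetical values, two tests that are each quite reliable yield a gain score whose reliability is only about 0.5.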

Which is better: a large gain for a marginal student with low scores initially, or a small gain for a gifted student whose scores hit the ceiling of the test? Complex statistical procedures can mitigate some of these problems, although they are beyond the means of most schools.

Does this mean that testing makes no contribution to teaching? Absolutely not. Test scores tell teachers which students need help and where help is needed. And they also can tell school boards which schools need a bigger budget. Or a new principal.

But in evaluating a teacher, priority should be given to expert judgment. Principals and department heads worthy of their position know which teachers care about their students and know the strengths and needs of each one, which teachers are dedicated to what they teach and have advanced knowledge in the field, and which teachers painstakingly plan their lessons.

They also know which teachers despair of their profession and are counting the days till retirement. And if they don’t know this, they should be encouraged to seek employment elsewhere.

In conclusion, evaluating teachers is a complicated process with no shortcuts. Test scores can help, but only if prudently used.

One final point: Several times here, what I have said or implied ought to be done is presently beyond the means of most schools.

To which the obvious counter is to increase these means, a step with which I have no argument.

– Special to the Press Herald