The TNTP Blog started an important conversation recently: are principals inflating teacher evaluation results? Indeed, a New York Times story earlier this year flagged a similar concern: some new evaluation systems, which were designed to counter the widget effect, are still rating very few teachers ineffective. As someone who has been evaluated by one of these innovative evaluation systems, I hope to offer a few potential—and admittedly anecdotal—solutions to this challenge.

During my first year teaching in Harrison School District 2 in Colorado Springs, the district had just implemented a teacher evaluation system that was a reformer’s dream. Then-superintendent Mike Miles, who now heads Dallas schools, touted the system in a policy paper for The Fordham Institute. Half of a teacher’s evaluation is based on his or her principal’s rating, including, for newer teachers, two sustained “formal” observations and at least sixteen short “spot” observations per year. The other half is based on student test scores.

Principals were heavily trained on the art of evaluation, so one would think and hope that the subjective assessment that teachers received would be fairly accurate. Not so, in many cases, as I soon found out.

I had significant behavior management problems during my first year. The good news for me, but bad news for a fair evaluation system, was that when the notoriously strict principal walked into my room to observe, wouldn’t you know it, the behavior problems magically disappeared. Students who had been off-task were suddenly paying rapt attention to the most boring topics. It would not be an exaggeration to say that my best lessons of the year were the ones during which I was evaluated.

This observer effect was not limited to students. When a principal was in the room, we teachers knew how to step up our game. This was doubly so when such observations were pre-planned, as were our formal observations, the most significant part of our evaluation. Teachers who regularly passed out worksheets and sat at their desks instead crafted elaborate lessons for these principal visits. I don’t think this says something bad about teachers so much as something about people in general: anyone does better work when the boss is watching. Solving this problem might be difficult, but it should start with scrapping pre-planned observations; the conversation should also include using video cameras, much like the Measures of Effective Teaching project did.

A more subtle explanation for ratings inflation is the possibility that principals who conduct such reviews are not evaluating teachers in the abstract; instead they may be rating educators compared to the quality of a potential replacement teacher. Think about it this way: if a principal knows that an “ineffective” rating will lead to the teacher’s dismissal, the principal should, and likely will, give that rating only if the principal also believes a better replacement can be hired.

There were some teachers at my school who I did not think were particularly good, and whom I may not have wanted to teach my own (theoretical) children. (For what it’s worth, I didn’t count myself as all that good a teacher, particularly during my first year.) Yet, if I were the principal, I would not have dismissed many of the more pedestrian staff members for the sad reason that it might have been difficult to hire anyone better.

This suggests that to achieve their goals, reformers must pair high-quality evaluations with meaningful ideas on how to improve the quality of teacher preparation, as well as the talent pipeline. Thankfully, many are trying to do just that, but we must keep moving forward, while remembering the connection between different facets of reform.

As Matt DiCarlo at the Shanker Blog rightly points out, we don’t really know what percentage of teachers are “good” or “bad.” It’s certainly possible that a very high percentage of current teachers are quite effective. But to be certain, we need to ensure that evaluations are free of observer effects, and that principals have the opportunity to hire top-quality new teachers to replace those who are not truly effective.

Matt Barnum is a guest blogger. He previously taught eighth-grade language arts in Colorado, and writes regularly about education.


