Updated 10/11/12. School District 65 administrators have prepared a revised “Teacher Professional Appraisal” policy that will be used to evaluate teachers starting this year. The District’s evaluation system will continue to use the “Danielson Model” to evaluate a teacher on subjective factors such as lesson planning, preparation, classroom environment, instruction, and professional responsibilities. The policy, however, substantially changes how teachers are evaluated using student growth.
“We’re trying to come up with a system that’s fair to teachers, but at the same time recognizes our responsibility to improve student learning outcomes,” Superintendent Hardy Murphy told the RoundTable. “That’s ultimately what we’re after here. At the end of the day this is about the students.”
Jean Luft, president of the District Educators Council (DEC, the teachers union), said members of DEC saw the draft of the new plan in early August and expressed some “serious concerns” about the plan, some of which have been addressed and some not. “Our teachers are not objecting to using student growth in the appraisal plan,” she said. “They just want an understandable plan that is consistent, reliable and valid and accurately monitors student achievement.”
Measuring Student Growth
Measuring student growth under the revised policy is much more complex than under the prior policy. The prior policy essentially compared the percentage of students in a class scoring above the 50th percentile at the beginning of the school year with the percentage scoring above the 50th percentile at the end of the school year.
The Achievement Categories: The revised policy measures whether there is normal growth during a school year at four different achievement levels (“achievement categories”) and for the class overall: 1) college and career readiness, 2) grade level, or above the 50th percentile, 3) below grade level, between the 26th and the 49th percentiles, 4) lowest quartile, below the 25th percentile, and 5) the class overall.
These achievement categories align with the categories specified in the Board’s goals, Dr. Murphy told the RoundTable. Those goals call for increasing the percentage of students who are on track to college and career readiness and who are at grade, and decreasing the percentage in the bottom quartile.
By selecting achievement categories at different points along the distribution scale, “The growth of students across the distribution matters and is accounted for,” Dr. Murphy said. “We’re trying to distribute the energy of instruction across the distribution and make sure we’re taking care of kids at the top too.”
Use of Standardized and Other Tests: Student growth is measured using a standardized test, the Measures of Academic Progress (MAP), for math and reading in grades 3-5, and for math, reading and science in grades 6-8. Student growth is determined using projected growth targets established by MAP.
MAP defines its “growth targets” as the average growth students at a given point in the distribution scale achieve between the beginning and end of a school year. It is essentially the normal, expected growth during a school year.
For subjects not tested by MAP, the District will use a standardized test, ISEL, to measure reading growth in grades K-2, and it will use locally developed District assessments to measure growth for subjects including social studies, fine arts and foreign language. Student growth targets will be determined “using District norms.”
The new policy also allows for teacher-selected data, such as a portfolio of student work, publisher assessments, and teacher assessments, to be included in the mix if certain criteria are met.
Teacher Ratings: Teacher ratings are determined by comparing a) the percentage of students in a class who met/exceeded their projected growth targets, with b) a District-wide percentage range. This comparison is done for each achievement category and the class as a whole.
The District-wide percentage range is computed by determining the percentage of students District-wide who met/exceeded their projected growth targets in the previous year and then expanding that percentage to include a range of percentages using a “confidence interval,” equal to the standard error of measurement.
In practical terms, if 65% of students District-wide met/exceeded their projected growth targets in the previous year, the District-wide range would be 62% to 68%.
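The arithmetic behind that example can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the 3-point margin is an assumption inferred from the 65% example above, not a figure taken from the policy.

```python
def district_range(pct_met, margin=3.0):
    """Return the (low, high) District-wide range in percentage points.

    `margin` is an assumed stand-in for the standard error of measurement;
    3.0 is inferred from the 65% -> 62%-68% example, not from the policy.
    """
    low = max(0.0, pct_met - margin)     # a percentage cannot drop below 0
    high = min(100.0, pct_met + margin)  # or rise above 100
    return low, high


print(district_range(65))  # (62.0, 68.0)
```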
Teachers are then rated as excellent, proficient, needs improvement, or unsatisfactory as follows:
• Excellent – if the percentage of students in a class who met or exceeded their projected growth targets in “most” student achievement categories exceeds the District-wide range, and the percentage is not lower in any category than the District-wide range.
• Proficient – if there is an increase in one achievement category, and a decline in no more than one category.
• Needs Improvement – if there is an increase in no achievement category, and most categories do not show a decline; or there’s a decline in one, but not most of the categories.
• Unsatisfactory – if there is a decline in most achievement categories.
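One way to read these four rules is sketched below in Python. It is not the District's procedure. It assumes that an "increase" means a class percentage above the District-wide range, that a "decline" means one below it, that "most" means more than half of the categories, and that a single range applies to every category. Combinations the rules do not clearly cover fall through to "needs improvement." All names and numbers are hypothetical.

```python
def compare_to_range(class_pct, low, high):
    """Classify one category's percentage against the District-wide range."""
    if class_pct > high:
        return "above"
    if class_pct < low:
        return "below"
    return "within"


def growth_rating(class_pcts, low, high):
    """Assumed reading of the rating rules; 'most' is taken as more than half."""
    results = [compare_to_range(p, low, high) for p in class_pcts.values()]
    above = results.count("above")
    below = results.count("below")
    half = len(results) / 2

    if below > half:
        return "unsatisfactory"
    if above > half and below == 0:
        return "excellent"
    if above >= 1 and below <= 1:
        return "proficient"
    # Assumption: anything the rules leave open lands here.
    return "needs improvement"


# Hypothetical class: percentage of students meeting growth targets per category.
example_class = {
    "college and career readiness": 75,
    "grade level": 70,
    "below grade level": 72,
    "lowest quartile": 65,
    "class overall": 71,
}
print(growth_rating(example_class, 62, 68))
```

In this made-up class, four of the five categories sit above the 62% to 68% range and none falls below it, so the sketch returns "excellent."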
The policy provides that a “rule-of-threes” will be used. If the presence of fewer than three students in an achievement category negatively impacts a teacher’s rating, that growth category will not be used in the evaluation. This adds “tolerance” to ratings, Dr. Murphy told the RoundTable.
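A minimal sketch of such a rule, in Python, might look like the following. It assumes that a category "negatively impacts" a rating when its percentage falls below the low end of the District-wide range, which is an interpretation rather than the policy's wording. The function name, data layout and numbers are hypothetical.

```python
def apply_small_group_rule(categories, low, min_students=3):
    """Drop categories that are both too small and below the range.

    `categories` maps a category name to (number_of_students, pct_met).
    `low` is the low end of the District-wide range.
    """
    kept = {}
    for name, (n_students, pct_met) in categories.items():
        too_small = n_students < min_students
        hurts_rating = pct_met < low  # assumed meaning of "negatively impacts"
        if too_small and hurts_rating:
            continue  # excluded from the evaluation
        kept[name] = (n_students, pct_met)
    return kept


# Hypothetical class data.
sample = {
    "lowest quartile": (2, 50.0),       # small and below range: dropped
    "below grade level": (2, 100.0),    # small but not harmful: kept
    "grade level": (10, 70.0),          # large enough: kept
}
print(sorted(apply_small_group_rule(sample, low=62)))
```

Raising `min_students` in this sketch widens the exclusion to larger groups.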
Summative Ratings of Teachers: A summative rating is then determined by combining a teacher’s rating using the Danielson Model and the student growth rating.
DEC’s Concerns/District Responses
Ms. Luft told the RoundTable that teachers “still have many questions and concerns and a high level of confusion about the system.” She identified some of the concerns.
Because teachers will be evaluated based on the percentage of students who are meeting growth targets in four achievement categories, the evaluations will be based on relatively small groups of students. The rule-of-threes addresses this concern if there are fewer than three students in an achievement category. Ms. Luft said, though, that teachers are concerned that if one student in a group of three, four, five or six does poorly on a test because of sickness, sloppiness, distractions or racing through to finish the test, it will skew the results.
In addition, if an achievement category has five students, 80% of the students will need to meet their normal growth target to meet a District-wide percentage range of 62% to 68%. This may also skew the results.
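The small-group arithmetic can be checked with a short Python sketch. It computes how many students in a group must meet their targets for the group's percentage to reach the low end of the range. The function name is hypothetical, and the 62% figure is the example range used above.

```python
import math


def min_students_needed(group_size, low_pct):
    """Fewest students who must meet targets so the group reaches low_pct."""
    return math.ceil(group_size * low_pct / 100)


for size in (3, 4, 5, 6):
    needed = min_students_needed(size, 62)
    share = round(100 * needed / size)
    print(f"group of {size}: {needed} students needed ({share}%)")
```

For a group of five, the sketch gives four students, or 80%, because three of five (60%) falls just short of 62%.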
Lora Taira, chief information officer for District 65, told the RoundTable that if a teacher thought there was an inconsistency between a student’s score and his or her real growth, the teacher could present teacher-selected data or the student’s portfolio of work to show a year’s growth.
Dr. Murphy said another way this issue could be addressed is to expand the rule-of-threes to a rule-of-fours. This would build additional flexibility into the system, he said.
Ms. Luft said there is also a concern about whether District-created assessments (used to assess teachers of subjects not covered by MAP or ISEL) will be valid and reliable indicators to measure student growth, because they were not prepared by skilled test-makers or psychometricians. This accounts for about 30% of the teacher appraisals, she said.
Ms. Taira said most of the District-created assessments were created by teachers four years ago, were used in the prior teacher appraisal system, and have been tweaked “to make sure we have as valid an assessment as we can.” She said growth targets would be created using the same methodology MAP used in developing its growth targets.
Another concern is that teachers may be rated as “needs improvement” or “unsatisfactory” based on data from one school year. Ms. Luft said the Value-Added Research Center at the University of Wisconsin suggests using multiple years. A rating of “needs improvement” could be a career-ending decision for a teacher, she said.
Dr. Murphy said that student growth is one part of a teacher’s summative rating, and that subjective factors under Danielson would also be considered.
DEC also suggested that the model be implemented as a “pilot” or as a “shadow system,” said Ms. Luft. A shadow system would “ensure that the teachers and principals fully understand the system throughout the year before it becomes a requirement.”
“There’s no reason to shadow,” said Dr. Murphy. “The responsibility for us is to come up with a model that’s workable and that respects and acknowledges teacher efforts, and especially teacher excellence, but at the same time it’s got to be a model that responds to the need for improvement in the percentage of students meeting and exceeding their growth targets.
“There’s not going to be a perfect system,” said Dr. Murphy. “At the end of the day what you’re trying to do is build enough tolerances in the system so when you have a marginal difference that makes a substantive change in a rating, that you have procedures built into your system to resolve that to make sure that your summative rating is a fair and accurate one.”
Administrators and DEC have each been running their own numbers to project how many teachers will be rated excellent, proficient, needs improvement and unsatisfactory under the new system.
A joint committee of administrators and teachers met yesterday to discuss the new teacher appraisal system. Dr. Murphy told the RoundTable he committed to “create new tolerances in the system,” including by expanding the rule-of-threes to a rule-of-fives. Thus, if the presence of fewer than five students in an achievement category negatively impacts a teacher’s rating, that growth category will not be used in the evaluation.
DEC continues to have many serious concerns, Ms. Luft told the RoundTable.