For example, different statistical models (all based on reasonable assumptions) yield different effectiveness scores.
DelPlano, however, conceded that the idea of placing heavy weight on test scores to measure her effectiveness "makes me nervous" because of the variables that differ from classroom to classroom across New Jersey.
This kind of analysis is similar to what is being demanded to assess teacher effectiveness at the city, state, and federal levels: comparing test scores on two different dates to see change over time.
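To see what that two-date comparison amounts to at its simplest, here is a minimal sketch in Python; the students, scores, and bare average-gain calculation are all hypothetical stand-ins, and actual value-added models layer statistical controls for student background and measurement error on top of this.

```python
# Minimal sketch of the two-date comparison described above.
# All names and scores are hypothetical; real value-added models
# also adjust for student demographics, prior achievement, and
# measurement error rather than using raw gains.

fall_scores   = {"student_a": 210, "student_b": 195, "student_c": 230}
spring_scores = {"student_a": 232, "student_b": 205, "student_c": 241}

# Gain per student: score on the later date minus the earlier one.
gains = {s: spring_scores[s] - fall_scores[s] for s in fall_scores}

# The crudest possible "effectiveness" indicator: the average gain.
average_gain = sum(gains.values()) / len(gains)

print(gains)                  # {'student_a': 22, 'student_b': 10, 'student_c': 11}
print(round(average_gain, 1)) # 14.3
```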
In DC, IMPACT asks evaluators to consider a teacher's expertise in a number of different domains, using test scores as one indicator of teacher effectiveness.
This is particularly important as illustrated in the prior post (Footnote 8 of the full piece, to be exact), because "Teacher effectiveness ratings were based on, in order of importance by the proportion of weight assigned to each indicator [including first and foremost]: (1) scores derived via [this] district-created and purportedly 'rigorous' (Dee & Wyckoff, 2013, p. 5) yet invalid (i.e., not having been validated) observational instrument with which teachers are observed five times per year by different folks, but about which no psychometric data were made available (e.g., Kappa statistics to test for inter-rater consistencies among scores)."
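For readers who have not met the Kappa statistics that footnote refers to, the sketch below computes Cohen's kappa, a chance-corrected measure of agreement between two raters, for an invented set of observation scores; real psychometric work would use far more observations and often a weighted variant.

```python
from collections import Counter

def cohen_kappa(rater1, rater2):
    """Cohen's kappa = (p_o - p_e) / (1 - p_e), where p_o is the
    observed rate of agreement and p_e is the agreement expected
    by chance given each rater's score distribution."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum((c1[k] / n) * (c2[k] / n) for k in set(rater1) | set(rater2))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings (1-4 rubric) of the same ten lessons by two observers.
observer_1 = [3, 4, 2, 3, 3, 4, 1, 2, 3, 4]
observer_2 = [3, 3, 2, 3, 4, 4, 1, 2, 2, 4]

# Prints 0.58 for these data; values near 1.0 would indicate strong
# inter-rater consistency, values near 0 agreement no better than chance.
print(round(cohen_kappa(observer_1, observer_2), 2))
```

Reporting a statistic like this is exactly the kind of psychometric evidence the quoted passage says the district never made available.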
Thus we have good reason to suspect that school effectiveness biases comparisons of the value-added scores of teachers working in different schools.
If value-added scores from different tests lead to different conclusions about a teacher, then we may worry that value-added from any single test provides an incomplete picture of a teacher's effectiveness, and that using it to make decisions about teachers may be inefficient or, for some teachers, unfair.
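One empirical check on that worry, sketched here with invented numbers, is to correlate the value-added scores the same teachers earn on two different tests: a modest correlation means the tests are telling materially different stories about who is effective.

```python
import statistics  # statistics.correlation requires Python 3.10+

# Hypothetical value-added scores for the same six teachers on two tests.
vam_test_a = [0.30, 0.10, -0.20, 0.05, -0.10, 0.25]
vam_test_b = [0.05, 0.20, 0.10, -0.15, -0.05, 0.30]

# Pearson correlation between the two score sets (about 0.36 here).
print(round(statistics.correlation(vam_test_a, vam_test_b), 2))

def ranks(scores):
    """Rank teachers from 1 (highest score) downward."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

# The same teacher is 1st on one test and 4th on the other: the
# "different conclusions" the passage warns about.
print(ranks(vam_test_a))  # [1, 3, 6, 4, 5, 2]
print(ranks(vam_test_b))  # [4, 2, 3, 6, 5, 1]
```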