But if we all agree that it's insane to measure teachers based on test scores alone, why should we keep doing that for schools?
This is the basic measure of value-added assessment in use today; teachers in many states across the country are evaluated (and sometimes compensated or fired) based on similar measures.
The outcomes were measured by a global hyperactivity aggregate (GHA), scores based on parent and teacher observations, and, for 8- and 9-year-olds, a computerized attention test.
Low family income during early childhood has been linked to comparatively less secure attachment,4 higher levels of negative moods and inattention,5 as well as lower levels of prosocial behaviour in children.2 The link between low family income and young children's problem behaviour has been replicated across several datasets with different outcome measures, including parental reports of externalizing and internalizing behaviours,1-3,7-9,11-12 teacher reports of preschool behavioural problems,10 and assessments of children based on clinical diagnostic interviews.7
Republicans in the state Senate have introduced a "same as" bill that would decouple state-based standardized examinations from teacher and principal evaluations, suggesting the measure, strongly backed by the state's teachers union, stands a strong chance of advancing in Albany.
Lawmakers last year agreed to link Common Core-based testing to the results of teacher performance evaluations, a measure that was sought by Gov. Andrew Cuomo and linked to a boost in school aid.
Teachers wouldn't be evaluated based on their students' standardized test scores any longer under a measure approved by the New York State Assembly.
Cuomo wants to change the way teacher evaluations are measured; he'd like to see at least half of a teacher's grade be based on standardized tests associated with the Common Core curriculum.
Another Marcellino-backed measure would reform Common Core-based tests by providing test answers and questions to teachers.
Charter school leader Deborah Kenny's op-ed in today's The New York Times argues against the move by many states toward teacher evaluations based on multiple measures, including both student progress on achievement tests and the reviews of principals.
The Yonkers Democrat vowed to back an increase in education spending and touted her conference's opposition to education policy measures that sought to link teacher evaluations to Common Core-based test results.
They implemented a rigorous teacher evaluation system based on multiple measures of performance.
Teachers welcome evaluation, he said, "but those evaluations should be fair and meaningful and help them improve, not a 'gotcha' system that's based on unreliable, invalid and inaccurate measures."
Adding to a system that includes ELA and Math tests from 3rd to 8th grade, the New York State Report Card and AYP ratings (Adequate Yearly Progress), New York State is incorporating the new Annual Professional Performance Review, or "APPR," which measures teacher performance based, in part, on standardized state tests.
Under the proposal, teacher evaluations would be based on both objective measures, like student performance on state tests, and subjective measures, like "rigorous" classroom observation.
Whatever the parties negotiate or King decides, the evaluation system will be based 20 percent on standardized test scores when applicable, 20 percent on other evidence of student learning, and 60 percent on classroom observation and other measures of teacher effectiveness, in keeping with the 2010 state law on teacher evaluation.
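The 20/20/60 split above is just a weighted average of component scores. A minimal sketch of that arithmetic (the component names and the 0-100 scale are illustrative assumptions, not the state's actual rubric):

```python
# Hypothetical 20/20/60 composite; names and 0-100 scale are assumed.
WEIGHTS = {
    "state_test_scores": 0.20,       # standardized test scores, when applicable
    "other_student_learning": 0.20,  # other evidence of student learning
    "observation_and_other": 0.60,   # classroom observation and other measures
}

def composite_score(components: dict) -> float:
    """Weighted average of component scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

score = composite_score({
    "state_test_scores": 70.0,
    "other_student_learning": 80.0,
    "observation_and_other": 90.0,
})
print(score)  # 0.2*70 + 0.2*80 + 0.6*90 = 84.0
```

Because observation carries 60 percent of the weight, a teacher's composite moves three times as much per point of observation score as per point of test score.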
The new evaluation system will provide clear standards and significant guidance to local school districts for implementation of teacher evaluations based on multiple measures of performance, including student achievement and rigorous classroom observations.
BOX 14, I-1-4; 30188578 / 734260 Slides Plus Audiotape - SAPA II, Orientation Filmstrips, AAAS, "The Integrated Process", Filmstrip 4, 1974 SAPA II, Orientation Filmstrips, AAAS, "Measuring", Filmstrip 3, 1974 Plus Audiotape - SAPA II, Orientation Filmstrips, AAAS, "Teaching Strategies", Filmstrip 3, 1974 Plus Transcript of orientation tape - SAPA II, Orientation Filmstrips, AAAS, "The Basic Processes of Science", Filmstrip 2, 1974 "Laboratory Exercises for Use in a College Science Course for Non-Science Majors" - by James Wallace Cox, 1970 "A Process Approach to Learning, Supplementary Manual", based on SAPA developed by AAAS, by Ruth M. White, 1970 "Science Process Instrument, Experimental Edition", COSE, 1970 "Preservice Science Education of Elementary School Teachers - Guidelines, Standards and Recommendations for Research and Development" report, Feb. 1969 (4 Folders) "Preservice Science Education of Elementary School Teachers - Preliminary Report", Feb. 1969 "An Evaluation of Elementary Science Study as SAPA" by Robert B. Nicodemus, Sept. 1968 "SAPA - Purposes, Accomplishments, Expectations", COSE, AAAS (Brochure reported in Nov. 1968, 1970), 1967 (3 Folders) "The Psychological Bases of SAPA", COSE, 1965 "Guidelines and Standards for the Education of Secondary School Teachers of Science and Mathematics" booklet, AAAS and the National Association of State Directors of Teacher Education and Certification "Career Opportunities in the Sciences" brochure, compiled by the Office of Opportunities in Science Slides and documentation - "Animal Eyes" and "Meteorological Instruments", Fernbank Science Center, "An Integral Part of the DeKalb County School System" Slides and documentation - "Building Terrariums" and "What is my Age?"
The assessment materials, used to measure students' understanding of the sciences in middle and early high school, will be amplified by Naiku, a Minnesota-based company whose assessment platform reaches teachers around the country, and a Canadian consortium that includes McGill University in Montreal.
BOX 4, Q-1-3 SAPA Experimental Edition, Part 3 (prepared for testing in the early grades), 1963 Experimental Edition, Part 4 (prepared for testing in the early grades), 1963 2nd Experimental Edition, Part 2 (prepared for testing in elementary schools), 1964 2nd Experimental Edition, Part 3 (prepared for testing in elementary schools), 1964 3rd Experimental Edition (prepared for testing in elementary schools), 1965 Parts 1 and 1B Part 2 Part 3 Part 4A Competency Measures, Parts 3 and 4, 1965 The Psychological Bases of SAPA, COSE, 1965 Commentary for Teachers (prepared for testing in elementary schools), 1965 3rd Experimental Edition (prepared for testing in elementary schools), 1965 Parts 5A and 5B Part 6B Part 7A Part 7B 3rd Experimental Edition, 1st Revision (prepared for testing in elementary schools), 1966 Part 5 Part 6 Test Sheets, Parts 5-7, 1966 4th Experimental Edition, Part 6 (prepared for testing in elementary schools), 1967 4th Experimental Edition, Part 7 (prepared for testing in elementary schools), 1967 Guide for Inservice Instruction, Response Sheets, 1967 3rd Experimental Edition, Commentary for Teachers, 1968 Guide to Inservice Instruction, Supplement, 1969
BOX 23, A-15-4; 30219212 / 734979 SAPA Requests for Translations of SAPA materials, 1966-1968 Prerequisites for SAPA The Psychological Basis of SAPA, 1965 Requests for SAPA to be Used in Canada, 1966-1968 Requests for Assistance with Inservice Programs, 1967-1968 Schools Using SAPA, 1966-1968 Speakers on SAPA for NSTA and Other Meetings, 1968 Suggestions for Revisions of Part 4, 1967-1968 Suggestions for Revisions of the Commentary, 1967-1968 Summer Institutes for SAPA, Locations, 1968 Summer Institutes for SAPA, Announcement Forms, 1968 Inservice Programs, 1968-1969 Consultant Recommendations, 1967-1968 Inquiries About Films, 1968 Inquiries About Kits, 1967-1968 Inquiries About Evaluations, 1968 Tryout Teacher List, 1967-1968 Tryout Centers, 1967-1968 Tryout Feedback Forms, 1967-1968 Tryout Center Coordinators, 1967-1968 Cancelled Tryout Centers, 1967-1968 Volunteer Teachers for Parts F & G, 1967-1968 List of Teachers for Tryout Centers, 1963-1966 Tucson, AZ, Dr. Ed McCullough, 1964-1968 Tallahassee, FL, Mr. VanPierce, 1964-1968 Chicago, IL, University of Chicago, Miss Illa Podendorf, 1965-1969 Monmouth, IL, Professor David Allison, 1964-1968 Overland Park, KS, Mr. R. Scott Irwin and Mrs. John Muller, 1964-1968 Baltimore, MD, Mr. Daniel Rochowiak, 1964-1968 Kern County, CA, Mr. Dale Easter and Mr. Edward Price, 1964-1967 Philadelphia, PA, Mrs. Margaret Efraemson, 1968 Austin, TX, Dr. David Butts, 1968 Seattle, WA, Mrs. Louisa Crook, 1968 Oshkosh, WI, Dr. Robert White, 1968 John R. Mayer, personal correspondence, 1966-1969 Teacher Response Sheets, 1966-1967 Overland, KS Oshkosh, WI Monmouth, IL Baltimore, MD Teacher Response Checklist SAPA Feedback, 1965-1966 Using Time Space Relations Communicating Observing Formulating Models Defining Operationally Interpreting Data Classifying (2 Folders) Measuring Inferring Predicting Formulating Hypotheses Controlling Variables Experimenting Using Numbers SAPA Response Sheets for Competency Measures, 1966
Most of us would like to think we are doing the best to stay healthy as individuals, but some of the most effective preventative measures are initiated at a national level by government, based on the best available evidence and research, and need to be taken up by all sectors of society, including teachers, employers, designers, and businesses.
Sometimes, researchers measured teacher success based on the observation of classroom supervisors.
Education took center stage in Iowa's 2006 legislative session, resulting in measures to boost teacher salaries, start a pilot program that bases teacher pay on student achievement, expand preschool, and establish statewide graduation requirements.
It would seem that the ongoing discussions about "teacher effectiveness" and the creation of evaluation systems focused on measuring a teacher's capacity (increasingly based on test scores) often do very little to actually develop that capacity.
But, as numerous studies have shown, having a master's degree is generally not correlated with measures of teacher effectiveness based on student test scores.
Opting out adds noise to the data, which increases the amount of variability in the teacher performance measures, because each teacher's score is based on fewer students.
My colleague Katharine Lindquist and I used statewide data from North Carolina to simulate the impact of opt-out on test-score-based measures of teacher performance.
Given what we have learned, one wonders whether there would have been more consensus by now on the appropriate use of test-based measures in teacher evaluation if the debate had not started out so polarized.
For a number of reasons (limited reliability, the potential for abuse, and the recent evidence that teachers have effects on student earnings and college-going that are largely not captured by test-based measures), it would not make sense to attach 100 percent of the weight to test-based measures (or any of the available measures, including classroom observations, for that matter).
I do not disagree with the message about our importance; what I disagree with is the ability to quantitatively measure that impact based solely on a teacher's performance.
A teacher's contribution to a school's community, as assessed by the principal, was worth 10 percent of the overall evaluation score, while the final 5 percent was based on a measure of the value added to student achievement for the school as a whole.
After extensive research on teacher evaluation procedures, the Measures of Effective Teaching Project mentions three different measures to provide teachers with feedback for growth: (1) classroom observations by peer colleagues using validated scales such as the Framework for Teaching or the Classroom Assessment Scoring System, further described in Gathering Feedback for Teaching (PDF) and Learning About Teaching (PDF); (2) student evaluations using the Tripod survey developed by Ron Ferguson from Harvard, which measures students' perceptions of teachers' ability to care, control, clarify, challenge, captivate, confer, and consolidate; and (3) growth in student learning based on standardized test scores over multiple years.
Under IMPACT, all teachers receive a single score ranging from 100 to 400 points at the end of each school year, based on classroom observations, measures of student learning, and commitment to the school community.
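A single 100-400 score like IMPACT's can be produced by averaging rubric-scored components and rescaling. The sketch below is a hypothetical reconstruction for illustration; the weights, component names, and 1.0-4.0 rubric scale are assumptions, not DCPS's actual formula:

```python
# Hypothetical sketch: weighted rubric average (1.0-4.0) rescaled to a
# 100-400 point score. Weights and component names are assumed.
def impact_style_score(components: dict, weights: dict) -> float:
    """Combine rubric components (1.0-4.0 each) into a 100-400 score."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9  # weights must sum to 1
    rubric_avg = sum(weights[k] * components[k] for k in weights)
    return rubric_avg * 100.0

weights = {"observations": 0.5, "student_learning": 0.4, "community": 0.1}
score = impact_style_score(
    {"observations": 3.0, "student_learning": 3.5, "community": 4.0},
    weights,
)
print(score)  # 0.5*3.0 + 0.4*3.5 + 0.1*4.0 = 3.3 -> 330.0
```

The rescaling keeps the floor at 100 (all components at 1.0) and the ceiling at 400 (all components at 4.0), matching the stated score range.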
In response to the criticism that teacher impacts on student test scores are inconsistent over time, the authors show that "although VA measures fluctuate across years, they are sufficiently stable" that selecting teachers even based on a few years of data would have substantial impacts on student outcomes, such as earnings.
That system is based on a variety of measures: results from teacher-certification tests; graduates' ratings of their satisfaction with their programs; and the ratings of graduates' mentor teachers on the quality of the programs in preparing novices according to state standards for teachers.
As importantly, it appears that existing survey-based measures of non-cognitive skills, although perhaps useful for making comparisons among students within the same educational environment, are inadequate to gauge the effectiveness of schools, teachers, or interventions in cultivating the development of those skills.
To the extent the program involves student achievement, it bases awards on "student learning objectives" as "created by individual teachers, with the approval of site-based administrators"; these objectives "will be measured by a combination of existing assessment instruments and teacher-designed tools," as well as by state standardized tests.
The database does not include a direct measure of a teacher's seniority in the current district, so we estimate seniority based on how many years the teacher has been employed by the same district.
On the basis of these survey results, we created three measures: (1) the principal's overall assessment of the teacher's effectiveness, which is a single item from the survey; (2) the teacher's ability to improve student academic performance, which is a simple average of the organization, classroom management, reading achievement, and math achievement survey items; and (3) the teacher's ability to increase student satisfaction, which is a simple average of the role model and student satisfaction survey items.
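The three measures described above are straightforward to construct from item-level survey responses. A minimal sketch (the item names follow the text; the rating scale and dictionary layout are assumptions for illustration):

```python
# Sketch of the three survey-based measures: (1) a single overall item,
# (2) and (3) simple averages of the named survey items. The 1-5 rating
# scale and dictionary keys are assumed for illustration.
import statistics

def principal_survey_measures(items: dict) -> dict:
    return {
        "overall_effectiveness": items["overall"],  # single survey item
        "academic_improvement": statistics.fmean(
            items[k] for k in ("organization", "classroom_management",
                               "reading_achievement", "math_achievement")
        ),
        "student_satisfaction": statistics.fmean(
            items[k] for k in ("role_model", "student_satisfaction")
        ),
    }

m = principal_survey_measures({
    "overall": 4.0, "organization": 3.0, "classroom_management": 4.0,
    "reading_achievement": 5.0, "math_achievement": 4.0,
    "role_model": 5.0, "student_satisfaction": 4.0,
})
print(m)  # academic_improvement = (3+4+5+4)/4 = 4.0; satisfaction = 4.5
```

An unweighted ("simple") average like this treats every survey item as equally informative, which matches the construction the text describes.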
In 2002 and 2003 no single choice received more than half of the responses, but the fact that fewer than half of the teachers surveyed selected the first choice, none, is remarkable; it means that for two years' running more than half of the union members surveyed believe that some portion of their pay should be based on accurately measured student growth.
Importantly, those teachers whose scores were determined (at least in part) on the basis of empirical measures of student growth had more score variation (54 percent receiving "Exceeds") than those assessed via growth goals based on professional standards (69 percent receiving "Exceeds").
Mostly based on "value added," a statistical measure of the contribution the teachers make to student achievement on standardized tests.
The same stance characterized the Gates Foundation's Measures of Effective Teaching report last winter, with its effort to gauge the utility of various teacher evaluation strategies (student feedback, observation, etc.) based upon how closely they approximated value-added measures.
By way of comparison, the authors note that the impact of being assigned to a teacher in the top quartile rather than one in the bottom quartile, in terms of their total effect on student achievement as measured by student-test-based measures of teacher effectiveness, is seven percentile points in reading and six points in math.
Finally and most significantly, Tennessee's RTTT package requires that measured student achievement comprise at least 50% (35% based on TVAAS gains, where available) of teacher and principal performance assessments.
Even if we accept the dubious proposition that 200,000 studies provide a scientific basis for the authors' 13 nebulous standards of good teacher practice, we can't be sure that the ways in which the authors have chosen to measure these standards necessarily replicate those of the underlying studies.
In Table 1 of the technical report (on which Jay bases his critique), the MET team uses evaluation measures from 2009-10 to test their ability to "post-dict" teachers' effectiveness the previous year.
Those who want to reward teachers on the basis of measured performance should consider whether it is worth the trouble and expense to implement value-added assessment if the only outcome is to reward small numbers of teachers.