Teacher quality
Full Abstract
There is increased policy interest in extending test-based evaluations in K-12 education to include student achievement in high school. High school achievement is typically measured by performance on end-of-course exams (EOCs), which test course-specific standards in a variety of subjects. However, unlike standardized tests in the early grades, students take EOCs at different points in their schooling careers. The timing of the test is a choice variable presumably determined by input from administrators, students and parents. Recent research indicates that school and district policies that determine when students take particular courses can have important consequences for achievement and subsequent outcomes like advanced course taking. We develop an approach for modeling EOC test performance that disentangles the influence of school and district policies regarding the timing of course taking from other factors. After separating out the timing issue, better measures of the quality of instruction provided by districts, schools and teachers can be obtained. Our approach also offers diagnostic value because it separates out the influence of school and district course-timing policies from other factors that determine student achievement.
Citation: Eric Parsons, Cory Koedel, Michael Podgursky, Mark Ehlert, P. Brett Xiang (2015). Incorporating End-of-Course Exam Timing into Educational Performance Evaluations. CALDER Working Paper No. 137
Full Abstract
Teacher and principal evaluation systems now emerging in response to federal, state and/or local policy initiatives typically require that a component of teacher evaluation be based on multiple performance metrics, which must be combined to produce summative ratings of teacher effectiveness. Districts have utilized three common approaches to combine these multiple performance measures, all of which introduce bias and/or additional prediction error that was not present in the original performance measures. This paper investigates whether the bias and error introduced by these approaches erode the ability of evaluation systems to reliably identify high- and low-performing teachers. The analysis compares the expected differences in long-term teacher value-added among teachers identified as high- or low-performing under these three approaches, using simulated data based on estimated inter-correlations and reliability of measures in the Gates Foundation’s Measures of Effective Teaching project. Based on the results of our simulation exercise, we conclude these approaches can undermine the evaluation system’s objectives in some contexts. Depending on the way these performance measures are actually combined to categorize teacher performance, the additional error and bias can be large enough to undermine the district’s objectives.
Citation: Michael Hansen, Mariann Lemke, Nicholas Sorensen (2014). Combining Multiple Performance Measures: Do Common Approaches Undermine Districts’ Personnel Evaluation Systems?. CALDER Working Paper No. 118
Full Abstract
Teacher pension systems target retirements within a narrow range of the career cycle by penalizing individuals who separate too soon or remain employed too long. The penalties result in the retention of some teachers who would otherwise choose to leave, and the premature exit of some teachers who would otherwise choose to stay. We examine how the effects of teachers' pension incentives on workforce composition influence teacher quality. Teachers who are held in by the "pull" incentives in the pension systems are not more effective, on average, than the typical teacher. Teachers who are encouraged to exit by the "push" incentives are more effective on average. We conclude that the net effect of teachers' pension incentives on workforce quality is small, but negative. Given the substantial and growing costs of current systems, and the lack of evidence regarding their efficacy, experimentation by traditional and charter schools with alternative retirement benefit structures would be useful.
Citation: Cory Koedel, Michael Podgursky (2014). Teacher Pension Systems, the Composition of the Teaching Workforce, and Teacher Quality. CALDER Working Paper No. 72
Full Abstract
This paper examines the value of strategically assigning disproportionately larger classes to the strongest teachers in order to optimize student learning in the face of differential teacher effectiveness. The rationale is straightforward: Larger classes for the best teachers benefit the pupils who are reassigned to them; they also help the less effective teachers improve their instruction by enabling them to concentrate on fewer students. But just how much of a difference could manipulating class sizes in this way make for overall student learning and access to effective teaching? This study performs a simulation based on North Carolina data to estimate plausible student outcomes under this approach. In the North Carolina data, I find there is a very slight tendency to place more students in the classes of effective teachers; but still only about 25 percent of students are taught by the top 25 percent of teachers. Intensively reallocating eighth-grade students—so that the most effective teachers have up to twelve more pupils than the average classroom—may produce gains equivalent to adding roughly two-and-a-half extra weeks of school. Even adding a handful of students to the most effective eighth-grade teachers (up to six more than the school’s average) produces gains in math and science akin to extending the school year by nearly two weeks or, equivalently, to removing the lowest 5 percent of teachers from the classroom. The potential impacts on learning are more modest in fifth grade, where the large majority of teachers are in self-contained classrooms. Results show that this strategy shows an overall improvement in student access to effective teaching, yet gaps in access for economically disadvantaged students persist. For instance, disadvantaged eighth-grade students are about 8 percent less likely than non-disadvantaged peers to be assigned to a teacher in the top 25 percent of performance. 
This gap in access changes little even though the policy puts more students in front of effective teachers, because the pool of available teachers in high-poverty schools does not change under this strategy. Thus, this policy alone shows little promise in reducing achievement gaps.
Citation: Michael Hansen (2014). Right-Sizing the Classroom: Making the Most of Great Teachers. CALDER Working Paper No. 110
Full Abstract
We examine the efficiency implications of imposing proportionality in teacher evaluation systems. Proportional evaluations force comparisons to be between equally-circumstanced teachers. We contrast proportional evaluations with global evaluations, which compare teachers to each other regardless of teaching circumstance. We consider a policy where administrators use the ratings from the evaluation system to help shape the teaching workforce, and define efficiency in terms of student achievement. Our analysis indicates that proportionality can be imposed in teacher evaluation systems without efficiency costs under a wide range of evaluation and estimation conditions. Proportionality is efficiency-enhancing in some cases. These findings are notable given that proportional teacher evaluations offer a number of other policy benefits.
Citation: Cory Koedel, Jiaxi Li (2014). The Efficiency Implications of Using Proportional Evaluations to Shape the Teaching Workforce. CALDER Working Paper No. 106
Full Abstract
Measures of teachers’ “value added” to student achievement play an increasingly central role in K-12 teacher policy and practice, in part because they have been shown to predict teachers’ long-term impacts on students’ life outcomes. However, little research has examined variation in the long-term effects of teachers with similar value-added performance. In this study, we investigate variation in the persistence of teachers’ value-added effects on student achievement in New York City. We separate persistent effects into general effects, which improve both the subject taught (math or English language arts (ELA)) and the other area of measured achievement, and subject-specific effects, which improve only the subject taught. Two findings emerge. First, a teacher’s value-added to ELA achievement has substantial crossover effects on long-term math performance. That is, having a better ELA teacher affects both math and ELA performance in a future year. Conversely, math teachers have only minimal long-term effects on ELA performance; their effects are far more subject-specific. Second, we identify substantial heterogeneity in the persistence of ELA teachers’ effects across observable student, teacher, and school characteristics. In particular, teachers in schools serving more poor, minority, and previously low-scoring students have less persistence than other teachers with the same value-added scores. Moreover, ELA teachers with stronger academic backgrounds have more persistent effects on student achievement, as do schools staffed with a higher proportion of such teachers. The results indicate that teachers’ effects on students’ long-term skills can vary as a function of instructional content and quality in ways that are not fully captured by value-added measures of teacher effectiveness.
Citation: Ben Master, Susanna Loeb, James Wyckoff (2014). Learning that Lasts: Unpacking Variation in Teachers’ Effects on Students’ Long-Term Knowledge. CALDER Working Paper No. 104
Full Abstract
We use a unique longitudinal sample of student teachers (“interns”) from six Washington state teacher training institutions to investigate patterns of entry into the teaching workforce. Specifically, we estimate split population models that simultaneously estimate the impact of individual characteristics and student teaching experiences on the timing and probability of initial hiring as a public school teacher. Not surprisingly, we find that interns endorsed to teach in “difficult-to-staff” areas are more likely to be hired as teachers than interns endorsed in other areas. Younger interns, white interns, and interns who did their student teaching in suburban schools are also more likely to find a teaching job. Prospective teachers who do their internships at schools that have more teacher turnover are more likely to find employment, often at those schools. Finally, interns with higher credential exam scores are more likely to be hired by the school where they did their student teaching. Contrary to expectations, few of the measures of the quality or the experience of an intern’s cooperating teacher are predictive of workforce entry in the expected direction.
Citation: Dan Goldhaber, John Krieg, Roddy Theobald (2013). Knocking on the Door to the Teaching Profession? Modeling the Entry of Prospective Teachers into the Workforce. CALDER Working Paper No. 105
Full Abstract
Teachers in the United States are compensated largely on the basis of fixed schedules that reward experience and credentials. However, there is a growing interest in whether performance-based incentives based on rigorous teacher evaluations can improve teacher retention and performance. The evidence available to date has been mixed at best. This study presents novel evidence on this topic based on IMPACT, the controversial teacher-evaluation system introduced in the District of Columbia Public Schools by then-Chancellor Michelle Rhee. IMPACT implemented uniquely high-powered incentives linked to multiple measures of teacher performance (i.e., several structured observational measures as well as test performance). We present regression-discontinuity (RD) estimates that compare the retention and performance outcomes among low-performing teachers whose ratings placed them near the threshold that implied a strong dismissal threat. We also compare outcomes among high-performing teachers whose rating placed them near a threshold that implied an unusually large financial incentive. Our RD results indicate that dismissal threats increased the voluntary attrition of low-performing teachers by 11 percentage points (i.e., more than 50 percent) and improved the performance of teachers who remained by 0.27 of a teacher-level standard deviation. We also find evidence that financial incentives further improved the performance of high-performing teachers (effect size = 0.24).
Citation: Thomas Dee, James Wyckoff (2013). Incentives, Selection, and Teacher Performance: Evidence from IMPACT. CALDER Working Paper No. 102
Full Abstract
This study explores whether teachers' performance trajectories over time differ by school poverty setting. Focusing on elementary school mathematics teachers in North Carolina and Florida, we find no systematic relationship between school student poverty rates and teacher performance trajectories. In both high-poverty (>=60% FRL) and lower-poverty (<60% FRL) schools, teacher performance improves the fastest in the first five years and then flattens out in years five to ten. Teacher performance growth resumes between years ten and fifteen in North Carolina but remains flat in Florida. In both school poverty settings, there is significant variation in teacher performance trajectories. At all career stages, the fastest-growing teachers (75th percentile) improve by .02-.04 standard deviations more in student gain scores annually than slower teachers (25th percentile). Our findings suggest that the lack of productivity “return” to experience in high-poverty schools reported in the literature is unlikely to be the result of differential teacher learning in high and lower-poverty schools.
Citation: Zeyu Xu, Umut Özek, Michael Hansen (2013). Teacher Performance Trajectories in High and Lower-Poverty Schools. CALDER Working Paper No. 101
Full Abstract
The specifics of how growth models should be constructed and used to evaluate schools and teachers are a topic of lively policy debate in states and school districts nationwide. In this paper we take up the question of model choice and examine three competing approaches. The first approach, reflected in the popular student growth percentiles (SGPs) framework, eschews all controls for student covariates and schooling environments. The second approach, typically associated with value-added models (VAMs), controls for student background characteristics and under some conditions can be used to identify the causal effects of schools and teachers. The third approach, also VAM-based, fully levels the playing field so that the correlation between school- and teacher-level growth measures and student demographics is essentially zero. We argue that the third approach is the most desirable for use in educational evaluation systems. Our case rests on personnel economics, incentive-design theory, and the potential role that growth measures can play in improving instruction in K-12 schools.
Citation: Mark Ehlert, Cory Koedel, Eric Parsons, Michael Podgursky (2013). Selecting Growth Models for School and Teacher Evaluations: Should Proportionality Matter?. CALDER Working Paper No. 80
Full Abstract
There is increasing agreement among researchers and policymakers that teachers vary widely in their ability to improve student achievement, and the difference between effective and ineffective teachers has substantial effects on standardized test outcomes as well as later life outcomes. However, there is not similar agreement about how to improve teacher effectiveness. Several research studies confirm that on average novice teachers show remarkable improvement in effectiveness over the first five years of their careers. In this paper we employ rich data from New York City to explore the variation among teachers in early career returns to experience. Our goal is to better understand the extent to which measures of teacher effectiveness during the first two years reliably predict future performance. Our findings suggest that early career returns to experience may provide useful insights regarding future performance and offer opportunities to better understand how to improve teacher effectiveness. We present evidence not only about the predictive power of early value-added scores, but also on the limitations and imprecision of those predictions.
Citation: Allison Atteberry, Susanna Loeb, James Wyckoff (2013). Do First Impressions Matter? Improvement in Early Career Teacher Effectiveness. CALDER Working Paper No. 90
Full Abstract
Redistributing highly effective teachers from low- to high-need schools is an education policy tool that is at the center of several major current policy initiatives. The underlying assumption is that teacher productivity is portable across different school settings. Using elementary and secondary school data from North Carolina and Florida, this paper investigates the validity of this assumption. Among teachers who switched between schools with substantially different poverty levels or academic performance levels, we find no change in those teachers’ measured effectiveness before and after a school change. This pattern holds regardless of the direction of the school change. We also find that high-performing teachers’ value-added dropped and low-performing teachers’ value-added gained in the post-move years, primarily as a result of regression to the within-teacher mean and unrelated to school setting changes. Despite such shrinkages, high-performing teachers in the pre-move years still outperformed low-performing teachers after moving to schools with different settings.
Citation: Zeyu Xu, Umut Özek, Matthew Corritore (2012). Portability of Teacher Effectiveness Across Schools. CALDER Working Paper No. 77
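The regression-to-the-mean pattern described in the abstract above can be illustrated with a minimal simulation. This is only a sketch on synthetic data, not the authors' model or data: the function name, the normal distributions, and the equal signal and noise variances are all assumptions made for illustration. Each simulated teacher has a fixed true effect that is measured with independent noise before and after a move.

```python
import random
import statistics


def simulate_regression_to_mean(n_teachers=5000, true_sd=1.0, noise_sd=1.0, seed=0):
    """Hypothetical illustration: stable true effects measured twice with noise."""
    rng = random.Random(seed)
    # Each teacher's true effect is fixed, i.e. fully "portable" across schools.
    true_effects = [rng.gauss(0.0, true_sd) for _ in range(n_teachers)]
    # Measured value-added before and after the move: truth plus independent noise.
    pre = [t + rng.gauss(0.0, noise_sd) for t in true_effects]
    post = [t + rng.gauss(0.0, noise_sd) for t in true_effects]

    # Classify teachers by their pre-move measured value-added.
    order = sorted(range(n_teachers), key=lambda i: pre[i])
    quarter = n_teachers // 4
    low, high = order[:quarter], order[-quarter:]

    def mean_of(values, idx):
        return statistics.fmean(values[i] for i in idx)

    return {
        "high_pre": mean_of(pre, high),
        "high_post": mean_of(post, high),
        "low_pre": mean_of(pre, low),
        "low_post": mean_of(post, low),
    }


result = simulate_regression_to_mean()
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, the top group's measured value-added falls and the bottom group's rises after the move even though no true effect changed, and the top group still outperforms the bottom group afterwards.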
Full Abstract
In this paper we report on work estimating the stability of value-added estimates of teacher effects, an important area of investigation given public interest in workforce policies that implicitly assume effectiveness is a stable attribute within teachers. The results strongly reject the hypothesis that teacher performance is completely stable within teachers over long periods of time, but estimates suggest that a component of performance appears to persist within teachers, even over a ten-year panel. We also find that little of the change in teacher effectiveness estimates within teachers can be explained by observable characteristics.
Citation: Dan Goldhaber, Michael Hansen (2012). Is it Just a Bad Class? Assessing the Long-term Stability of Estimated Teacher Performance. CALDER Working Paper No. 73
Full Abstract
In this paper we consider the challenges involved in evaluating teacher preparation programs when controlling for school contextual bias. Including school fixed effects in the achievement models used to estimate preparation program effects controls for school environment by relying on differences among student outcomes within the same schools to identify the program effects. However, identification of preparation program effects using school fixed effects requires teachers from different programs to teach in the same school. Even if program effects are identified, the precision of the estimated effects will depend on the degree to which graduates from different programs overlap across schools. In addition, if the connections between preparation programs result from the overlap of atypical graduates or from graduates teaching in atypical school environments, use of school effects could produce bias. Using statewide data from Florida, we show that teachers tend to teach in schools near the programs in which they received their training, but there is still sufficient overlap across schools to identify preparation program effects. We show that the ranking of preparation programs varies significantly depending on whether or not school environment is taken into account via school fixed effects. We find that schools and teachers that are integral to connecting preparation programs are atypical, with disproportionately high percentages of Hispanic teachers and students compared to the state averages. Finally, we find significant variance inflation in the estimated program effects when controlling for school fixed effects, and that the size of the variance inflation factor depends crucially on the length of the window used to compare graduates teaching in the same schools.
Citation: Kata Mihaly, Daniel McCaffrey, Tim Sass, J.R. Lockwood (2012). Where You Come From or Where You Go? Distinguishing Between School Quality and the Effectiveness of Teacher Preparation Program Graduates. CALDER Working Paper No. 63
Full Abstract
This study seeks to identify the characteristics and training experiences of teachers who are differentially effective at promoting academic achievement among English language learners (ELLs). Our analyses indicate that general skills such as those reflected by scores on teacher certification exams and experience teaching non-ELL students are less predictive of achievement for ELL students than for other students. However, specific experience teaching ELL students is more important for predicting effectiveness with future ELL students than non-ELL students as is both in-service and pre-service training focused on ELL-specific instructional strategies.
Citation: Ben Master, Susanna Loeb, Camille Whitney, James Wyckoff (2012). Different Skills: Identifying Differentially Effective Teachers of English Language Learners. CALDER Working Paper No. 68
Full Abstract
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value-added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM estimates of current teacher contributions to student learning. More precisely, the falsification test is designed to identify whether or not students are effectively randomly assigned conditional on the covariates included in the model. Rothstein's finding is significant because there is considerable interest in using VAM teacher effect estimates for high-stakes teacher personnel policies, and the results of the Rothstein test cast considerable doubt on the notion that VAMs can be used fairly for this purpose. However, in this paper, we illustrate—theoretically and through simulations—plausible conditions under which the Rothstein falsification test rejects VAMs even when students are randomly assigned, conditional on the covariates in the model, and even when there is no bias in estimated teacher effects.
Citation: Dan Goldhaber, Duncan Chaplin (2012). Assessing the “Rothstein Test”: Does it Really Show Teacher Value-Added Models are Biased?. CALDER Working Paper No. 71
Full Abstract
While prior research has documented differences in the distribution of teacher characteristics across schools serving different student populations, few studies have examined how teacher sorting occurs within schools. Comparing teachers who teach in the same grade and school in a given year, the authors find that less experienced, minority, and female teachers are assigned students with lower average prior achievement, more prior behavioral problems, and lower prior attendance rates than their more experienced, white and male colleagues. Though more effective (higher value-added) teachers and those with advanced degrees are also assigned less difficult classes, controlling for these factors does not eliminate the association between experience, race, gender, and assignments. These patterns have negative implications for teacher retention given the importance of working conditions for teachers' career decisions.
Citation: Demetra Kalogrides, Susanna Loeb, Tara Beteille (2011). Power Play? Teacher Characteristics and Class Assignments. CALDER Working Paper No. 59
Full Abstract
This paper assesses the determinants of teacher job change and the impact of such mobility on the distribution of teacher quality. High and low-quality teachers are more likely to leave than those in the middle of the distribution. In contrast, the relationship between teacher productivity and inter-school mobility is relatively weak. Teachers who rank above their faculty colleagues are more likely to transfer to a new school within a district and exit teaching. As the share of peer teachers with more experience, advanced degrees or professional certification increases, the likelihood of moving within district decreases. There is also evidence of assortative matching among teachers. The most effective teachers who transfer tend to go to schools whose faculties are in the top quartile of teacher quality. Teacher mobility exacerbates differences in teacher quality across schools.
Citation: Li Feng, Tim Sass (2011). Teacher Quality and Teacher Mobility. CALDER Working Paper No. 57
Full Abstract
Research on teacher productivity, and recently developed accountability systems for teachers, rely on value-added models to estimate the impact of teachers on student performance. The authors test many of the central assumptions required to derive value-added models from an underlying structural cumulative achievement model and reject nearly all of them. Moreover, they find that teacher value added and other key parameter estimates are highly sensitive to model specification. While estimates from commonly employed value-added models cannot be interpreted as causal teacher effects, employing richer models that impose fewer restrictions may reduce the bias in estimates of teacher productivity.
Citation: Douglas Harris, Tim Sass, Anastasia Semykina (2010). Value-Added Models and the Measurement of Teacher Productivity. CALDER Working Paper No. 54
Full Abstract
Most analyses of teacher quality end without any assessment of the economic value of altered teacher quality. This paper begins with an overview of what is known about the relationship between teacher quality and student achievement. Alternative valuation methods are based on the impact of increased achievement on individual earnings and on the impact of low teacher effectiveness on economic growth through aggregate achievement. A teacher one standard deviation above the mean effectiveness annually generates marginal gains of over $400,000 in present value of student future earnings with a class size of 20 and proportionately higher with larger class sizes. Replacing the bottom 5-8 percent of teachers with average teachers could move the U.S. near the top of international math and science rankings with a present value of $100 trillion.
Citation: Eric Hanushek (2010). The Economic Value of Higher Teacher Quality. CALDER Working Paper No. 56
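The proportional scaling stated in the abstract above (gains of over $400,000 at a class size of 20, "proportionately higher with larger class sizes") can be worked through with a short calculation. This is a sketch only: the function name is hypothetical, and $400,000 is used as a round lower bound because the abstract says "over" that amount. It does not reproduce the paper's underlying earnings model.

```python
def present_value_gain(class_size, gain_at_baseline=400_000, baseline_class_size=20):
    """Scale the annual present-value earnings gain linearly with class size.

    Assumes the abstract's lower-bound figure of $400,000 for a teacher one
    standard deviation above the mean with 20 students.
    """
    return gain_at_baseline * class_size / baseline_class_size


for size in (20, 25, 30):
    print(f"class size {size}: at least ${present_value_gain(size):,.0f}")
```

Under this linear assumption, a class of 30 implies a gain of at least $600,000 per year for the same teacher.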