High school graduation rates in the United States are at an all-time high, yet many of these graduates are deemed not ready for postsecondary coursework when they enter college. This study examines the short-, medium-, and long-term effects of remedial courses in middle school using a regression discontinuity design. While the short-term test score benefits of taking a remedial course in English language arts in middle school fade quickly, I find significant positive effects on the likelihood of taking college credit-bearing courses in high school, college enrollment, enrolling in more selective colleges, persistence in college, and degree attainment.
Citation: Umut Özek (2021). The Effects of Middle School Remediation on Postsecondary Success: Regression Discontinuity Evidence from Florida. CALDER Working Paper No. 258-0921
We use statewide administrative data from Missouri to document the prevalence of Industry Recognized Credential (IRC) programs in public high schools and understand the characteristics of students who complete IRCs. We show that 9 percent of Missouri students complete an IRC during their senior year of high school. IRC completers have lower achievement and are more likely to be disadvantaged along several measurable dimensions relative to their peers who complete analog college-ready programs, on average. Noting these average relationships, there is substantial heterogeneity among individual IRCs in terms of the types of students served: some IRCs attract students with high test scores who mostly go on to attend college, whereas others serve low-scoring students who mostly forego college. There is strong gender segregation across individual IRCs that aligns with gender segregation across occupations in the labor market.
Citation: Joshua Eagan, Cory Koedel (2021). Career Readiness in Public High Schools: An Exploratory Analysis of Industry Recognized Credentials. CALDER Working Paper No. 257-0921
What does it mean for students to be in a gifted program? While about 7% of students nationally participate in gifted programs, relatively little is known about the experiences of students in these programs or how they vary across districts. Combining administrative and survey data, we describe the structure of gifted programs across nearly 300 school districts in Washington State. Using covariate adjustments and student fixed effects, we find that participation in gifted programs increases access to advanced courses, high-achieving peers, smaller classrooms, and more qualified teachers. These effects are largely concentrated in larger urban and suburban school districts that frequently run large, self-contained gifted programs. Effects of participation are much smaller for small school districts, rural or town school districts, and districts with small gifted programs. While gifted participation changes the educational environment for the average student in the state, the median school district program effect is near zero across the measures of educational environments we consider. This divergence is driven by a pattern of large school districts, high-income school districts, and urban and suburban school districts having programs with significantly larger effects on learning environments. Finally, we find that gifted program effects are larger for some student subgroups, but this is entirely due to district treatment effect heterogeneity, not differential effects on subgroups within districts.
Citation: Benjamin Backes, James Cowan, Dan Goldhaber (2021). What Makes for a "Gifted" Education? Exploring How Participation in Gifted Programs Affects Students' Learning Environments. CALDER Working Paper No. 256-0821
There is empirical evidence of substantial heterogeneity in economic mobility across geographic areas and the efficacy of schools has been suggested as an explanatory factor. Using administrative microdata from seven states covering nearly 3 million students, we explore the potential role of schools in promoting economic mobility by estimating cross-district variation in “academic mobility”—a term we use to describe the extent to which students’ ranks in the distribution of academic performance change during their schooling careers. We show that there exists considerable heterogeneity in academic mobility across school districts. However, after aggregating our district-level measures of academic mobility to the commuting-zone level and merging them with geographically matched external estimates of economic mobility, we find little scope for geographic differences in academic mobility to meaningfully account for differences in economic mobility.
This is an updated version of the paper originally titled "Where are Initially Low-performing Students the Most Likely to Succeed? A Multi-state Analysis of Academic Mobility (Preliminary Draft)", released in February 2020.
Citation: Wes Austin, David Figlio, Dan Goldhaber, Eric Hanushek, Tara Kilbride, Cory Koedel, Jaeseok Sean Lee, Jin Lou, Umut Özek, Eric Parsons, Steven Rivkin, Tim Sass, Katharine Strunk (2021). Academic Mobility in U.S. Public Schools: Evidence from Nearly 3 Million Students. CALDER Working Paper No. 227-0821-2
Testing students and using test information to hold schools and, in some cases, teachers accountable for student achievement has arguably been the primary national strategy for school improvement over the past decade and a half. Tests are also used for diagnostic purposes, such as to predict which students are at risk of dropping out of high school. But there is policy debate about the efficacy of this usage, in part because of disagreements about whether tests are an important schooling outcome. We use panel data from three states – North Carolina, Massachusetts and Washington State – to investigate how accurate early test scores are in predicting later high school outcomes: 10th grade test achievement, the probability of taking advanced math courses in high school, and graduation. We find 3rd grade tests predict all of these outcomes with a high degree of accuracy and relatively little diminishment from using 8th grade tests. We also find evidence that using a two-stage model estimated on separate cohorts (one predicting 8th grade information using 3rd grade information, and another predicting high school outcomes with 8th grade information) only slightly diminishes forecast accuracy. Finally, the use of machine learning techniques increases accuracy of predictions over widely used linear models, but only marginally.
Working Paper 235-0520 was originally released in May 2020. This is an updated version, released August 2021.
Citation: Dan Goldhaber, Malcolm Wolff, Timothy Daly (2021). Assessing the Accuracy of Elementary School Test Scores as Predictors of Students’ High School Outcomes. CALDER Working Paper No. 235-0821-2
In this paper we use data from two states—Michigan and Washington—on COVID case rates at the county level linked to information on the type of instructional modality offered by local public school districts to assess the relationship between modality and COVID outcomes. We focus primarily on COVID case rates, but also provide estimates for hospitalizations (in Washington only) and deaths. Our preferred district and month fixed effects models exploit within district (over time) variation in instructional modality and account for time-invariant district factors. In both states, we find evidence that instructional modality does lead to increases in COVID spread in communities with moderate to high levels of pre-existing COVID cases, although the causal effect is small in magnitude.
Working Paper No. 247-1220 was originally released in December 2020 and has since been updated to Working Paper No. 247-1220-3, released in July 2021.
Citation: Dan Goldhaber, Scott A. Imberman, Katharine Strunk, Bryant Hopkins, Nate Brown, Erica Harbatkin, Tara Kilbride (2021). To What Extent Does In-Person Schooling Contribute to the Spread of COVID-19? Evidence from Michigan and Washington. CALDER Working Paper No. 247-0721-3
We used survey and administrative data from Washington State to assess the degree to which special education teacher preparation, district literacy instructional practices, and the alignment between preparation and practice were associated with the reading test score gains of students with high-incidence disabilities taught by early-career special education teachers in grades 4-8. These students tended to have larger reading gains when their district emphasized evidence-based literacy decoding practices (e.g., phonological awareness, phonics, and reading fluency) and when their special education teacher graduated from a teacher education program that also emphasized these practices. Students with high-incidence disabilities in districts that emphasized balanced literacy practices tended to have lower reading gains. Finally, students with high-incidence disabilities taught by early-career special education teachers tended to have larger reading gains when their teacher’s student teaching placement was supervised by a more experienced cooperating teacher.
Citation: Roddy Theobald, Dan Goldhaber, Kristian Holden, Marcy Stein (2021). Special Education Teacher Preparation, Literacy Instructional Alignment, and Reading Achievement for Students with High-Incidence Disabilities. CALDER Working Paper No. 253-0621
Free and reduced-price meal eligibility (FRM) is commonly used in education research and policy applications as an indicator of student poverty. However, using multiple data sources external to the school system, we show that FRM status is a poor proxy for poverty, with eligibility rates far exceeding what would be expected based on stated income thresholds for program participation. This is true even without accounting for community eligibility for free meals, although community eligibility has exacerbated the problem in recent years. Over the course of showing the limitations of using FRM data to measure poverty, we provide promising validity evidence for a new, publicly-available measure of school poverty based on local-area family incomes.
This Working Paper was originally posted under the title "Free and Reduced-Price Meal Eligibility Does Not Measure Student Poverty".
Citation: Ishtiaque Fazlul, Cory Koedel, Eric Parsons (2021). Free and Reduced-Price Meal Eligibility Does Not Measure Student Poverty: Evidence and Policy Significance. CALDER Working Paper No. 252-0521
We study the effect of exposure to immigrants on the educational outcomes of US-born students, using a unique dataset combining population-level birth and school records from Florida. This research question is complicated by substantial school selection of US-born students, especially among White and comparatively affluent students, in response to the presence of immigrant students in the school. We propose a new identification strategy to partial out the unobserved non-random selection into schools, and find that the presence of immigrant students has a positive effect on the academic achievement of US-born students, especially for students from disadvantaged backgrounds. Moreover, the presence of immigrants does not negatively affect the performance of affluent US-born students, who typically show higher academic achievement than immigrant students. We provide suggestive evidence on potential channels.
Citation: David Figlio, Paola Giuliano, Riccardo Marchingiglio, Umut Özek, Paola Sapienza (2021). Diversity in Schools: Immigrants and the Educational Performance of U.S. Born Students. CALDER Working Paper No. 250-0321
States are responsible for setting and evaluating the standards that teacher preparation programs (TPPs) must meet for accreditation. Despite the considerable investment that states make in this process, no prior research has linked the ratings of TPPs generated by program reviews to inservice teacher performance. In this paper, we describe analyses of program review ratings from Massachusetts and their relationship to formal inservice teacher evaluation ratings and the value-added effectiveness of teachers. When comparisons are made across all schools and districts in the state, we find that a TPP’s review scores are positively predictive of both inservice teacher evaluations and value added of TPP graduates, particularly when scores are aggregated within specific categories like partnerships and field-based practices. These relationships, however, become more modest for teacher evaluations and statistically insignificant for value added when the relationships are identified based on comparisons between TPP graduates who are teaching in the same schools and districts. It is not possible to separate whether these differences are due to the TPPs, the schools and districts themselves, or the connections between them, so future work is necessary to further validate TPP review scores in this setting and others.
Citation: Meagan Comb, James Cowan, Dan Goldhaber, Zeyu Jin, Roddy Theobald (2021). State Ratings of Educator Preparation Programs: Connecting Program Review to Teacher Effectiveness. CALDER Working Paper No. 249-0321
We evaluate the feasibility of estimating test-score growth for schools and districts with a gap year in test data. Our research design uses a simulated gap year in testing when a true test gap did not occur, which facilitates comparisons of district- and school-level growth estimates with and without a gap year. We find that growth estimates based on the full data and gap year data are generally similar, establishing that useful growth measures can be constructed with a gap year in test data. Our findings apply most directly to testing disruptions that occur in the absence of other disruptions to the school system. They also provide insights about the test stoppage induced by COVID-19, although our work is just a first step toward producing informative school- and district-level growth measures from the pandemic period.
This is an updated version of the paper originally titled "Estimating Test-Score Growth with a Gap Year in the Data", released in January 2021.
This paper has been published in AERA Open (August 2021) and can be viewed here.
Citation: Ishtiaque Fazlul, Cory Koedel, Eric Parsons, Cheng Qian (2021). Estimating Test-Score Growth for Schools and Districts with a Gap Year in the Data. CALDER Working Paper No. 248-0121-2
How much do teachers value compensation deferred for retirement (CDR)? This question is important because the vast majority of public school teachers are covered by defined benefit (DB) pension plans that “backload” a large share of compensation to retirement relative to the compensation structure in the private sector, and there is scant evidence about whether pension structures are consistent with teacher preferences for current compensation versus CDR. This study examines a unique setting in Washington State, where teachers are enrolled in a hybrid pension system that has both DB and defined contribution (DC) components. We exploit the fact that teachers have choices over their DC contribution rate to infer their revealed preferences for current versus CDR. We find that teachers on average contribute 7.23 percent of salary income toward retirement; 62 percent in fact elect to contribute more than the minimally required contribution of 5 percent. This suggests that teachers value CDR far more than suggested by prior evidence.
Working paper 242-0920 was originally released in September 2020 under the title "How Much do Teachers Value Deferred Compensation? Evidence from Defined Contribution Rate Choices". This is an updated version, released April 2021.
Citation: Dan Goldhaber, Kristian Holden (2020). How Much do Teachers Value Compensation Deferred for Retirement? Evidence from Defined Contribution Rate Choices. CALDER Working Paper No. 242-0920-2
Prior work has documented a substantial penalty associated with taking the Partnership for Assessment of Readiness for College and Careers (PARCC) online relative to on paper (Backes & Cowan, 2019). However, this penalty does not necessarily make online tests less useful. For example, it could be the case that computer literacy skills are correlated with students’ future ability to navigate high school coursework, and thus more predictive of later outcomes. Using a statewide implementation of PARCC in Massachusetts, we test the relative predictive validity of online and paper tests. We are unable to detect a difference between the two and in most cases can rule out even modest differences. Finally, we estimate mode effects for the new Massachusetts statewide assessment. In contrast to the first years of PARCC implementation, we find very small mode effects, showing that it is possible to implement online assessments at scale without large online penalties.
Citation: Benjamin Backes, James Cowan (2020). Is Online a Better Baseline? Comparing the Predictive Validity of Computer- and Paper-Based Tests. CALDER Working Paper No. 241-0820
Many states enhanced benefits in teacher retirement plans during the 1990s. This paper examines the school staffing effects of one such enhancement in a major urban school district with mostly high poverty schools. Pension rule changes in 1999 for St. Louis public school teachers resulted in very large increases in pension wealth for active teachers, as well as a powerful increase in “push” incentives for earlier retirement. Simple descriptive statistics on retirement patterns before and after the enhancements suggest much earlier retirement resulted. Shorter teaching spells imply a steady state with more teaching vacancies and a larger share of novice teachers in classrooms. To better understand the long run effects of these changes and alternative policies, the authors estimate a structural model of teacher retirement. Simulations of retirement behavior for a representative senior teacher point to shorter completed teaching spells and earlier retirement age as a result of the enhancements. By contrast, moving from the post-1999 plan to a DC-type plan would extend the teaching career of a representative senior teacher by roughly three years. Simulations of voluntary DC conversion plans suggest that many senior teachers would enroll, thereby reducing workforce turnover and overall pension costs.
Citation: Shawn Ni, Michael Podgursky, Xiqian Wang (2020). Teacher Pension Enhancements and Staffing in an Urban School District. CALDER Working Paper No. 240-0620
The clinical teaching experience is one of the most important components of teacher preparation. Prior observational research has found that more effective mentors and schools with better professional climates are associated with better preparation for teacher candidates. We test these findings using an experimental assignment of teacher candidates to placement sites in two states. Candidates who were randomly assigned to higher quality placement sites experienced larger improvements in performance over the course of the clinical experience, as evaluated by university instructors. The findings suggest that improving clinical placement procedures can improve the teaching quality of candidates.
Citation: Dan Goldhaber, Matt Ronfeldt, James Cowan, Trevor Gratz, Emanuele Bardelli, Matt Truwit, Hannah Mullman (2020). Room for Improvement? Mentor Teachers and the Evolution of Teacher Preservice Clinical Evaluations. CALDER Working Paper No. 239-0620
Defined benefit (DB) pension plans incentivize “salary spiking,” where sharp increases in pay are leveraged into significantly higher levels of retirement compensation. While egregious instances of salary spiking occasionally make headlines, the prevalence of salary spiking is poorly understood. Moreover, there is little guidance on the definition of salary spiking behavior and how to identify it. This paper develops an empirical method to quantify the prevalence of salary spiking by identifying cases where end-of-career compensation deviates from the expected level of compensation. We apply this method to teacher pension systems in Illinois to assess the prevalence of salary spiking before and after the implementation of a reform designed to dissuade salary spiking.
Working paper 238-0620 was originally released in June 2020 under the title "A Method for Identifying Salary Spiking: An Assessment of Pensionable Compensation and Reform in Illinois". This is an updated version, released April 2021.
Citation: Dan Goldhaber, Cyrus Grout, Kristian Holden (2020). Identifying Teacher Salary Spiking and Assessing the Impact of Pensionable Compensation Reforms in Illinois. CALDER Working Paper No. 238-0620-2
There is growing interest in using measures of teacher applicant quality to improve hiring decisions, but the statistical properties of such measures are poorly understood. We present evidence on structured ratings solicited from teacher applicants’ references. We find that the reference ratings capture only one underlying dimension of applicant quality, which may indicate a need to broaden the range of questions posed to professional references. Point estimates of inter-rater reliability range between 0.23 and 0.31 and are significantly lower for novice applicants. It is difficult to judge whether these levels of reliability are high or low in the current context given so little evidence on comparable applicant assessment tools.
This paper was published in Economics of Education Review in August 2021 and can be found here.
Citation: Dan Goldhaber, Cyrus Grout, Malcolm Wolff, Patricia Martinkova (2020). Evidence on the Dimensionality and Reliability of Professional References’ Ratings of Teacher Applicants. CALDER Working Paper No. 237-0620
We use a novel database of student teaching placements in Washington State to investigate teachers’ transitions from student teaching classrooms to first job classrooms and the implications for student achievement. We find that first-year teachers are more effective when they are teaching in the same grade, in the same school level, or in a classroom with student demographics similar to their student teaching classroom. We also document that only 27% of first-year teachers are teaching the same grade they student taught, and that first-year teachers tend to begin their careers in higher-poverty classrooms than their student teaching placements. This suggests that better aligning student teacher placements with first-year teacher hiring could be a policy lever for improving early-career teacher effectiveness.
This paper was published in Educational Evaluation and Policy Analysis in July 2021 and can be found here.
Citation: John Krieg, Dan Goldhaber, Roddy Theobald (2020). Disconnected Development? The Importance of Specific Human Capital in the Transition from Student Teaching to the Classroom. CALDER Working Paper No. 236-0520
The Community Eligibility Provision (CEP) is a policy change to the federally-administered National School Lunch Program that allows schools serving low-income populations to classify all students as eligible for free meals, regardless of individual circumstances. This has implications for the use of free and reduced-price meal (FRM) data to proxy for student disadvantage in education research and policy applications, which is a common practice. We document empirically how the CEP has affected the value of FRM eligibility as a proxy for student disadvantage. At the individual student level, we show that there is essentially no effect of the CEP. However, the CEP does meaningfully change the information conveyed by the share of FRM-eligible students in a school. It is this latter measure that is most relevant for policy uses of FRM data.
This paper was published in Educational Evaluation and Policy Analysis in November 2020 and can be found here.
Note: Portions of this paper were previously circulated under the title “Using Free Meal and Direct Certification Data to Proxy for Student Disadvantage in the Era of the Community Eligibility Provision.” We have since split the original paper into two parts. This is the first part.
WP 234-0320 was originally released in March 2020. This is the third update, WP 234-0320-3, released in September 2020.
Citation: Cory Koedel, Eric Parsons (2020). The Effect of the Community Eligibility Provision on the Ability of Free and Reduced-Price Meal Data to Identify Disadvantaged Students. CALDER Working Paper No. 234-0320-3
This study examines the effects of internal migration driven by severe natural disasters on host communities, and the mechanisms behind these effects, using the large influx of migrants into Florida public schools after Hurricane Maria. I find adverse effects of the influx in the first year on existing student test scores, disciplinary problems, and student mobility among high-performing students in middle and high school that also persist in the second year. I also find evidence that compensatory resource allocation within schools is an important factor driving the adverse effects of large, unexpected migrant flows on incumbent students in the short run.
This paper was published in The Journal of Human Resources in January 2021 and can be found here.
WP 233-0320 was originally released in March 2020. This updated version, WP 233-0320-2, was released in January 2021.
Citation: Umut Özek (2020). Examining the Educational Spillover Effects of Severe Natural Disasters: The Case of Hurricane Maria. CALDER Working Paper No. 233-0320-2