- Alex Coppock, Alan S. Gerber, Donald P. Green, and Holger L. Kern. Forthcoming. "Combining double sampling and bounds to address non-ignorable missing outcomes in randomized experiments." Political Analysis.
Missing outcome data plague many randomized experiments. Common solutions rely on ignorability assumptions that may not be credible in all applications. We propose a method for confronting missing outcome data that makes fairly weak assumptions but can still yield informative bounds on the average treatment effect. Our approach is based on a combination of the double sampling design and non-parametric worst-case bounds. We derive a worst-case bounds estimator under double sampling and provide analytic expressions for variance estimators and confidence intervals. We also propose a method for covariate adjustment using post-stratification and a sensitivity analysis for non-ignorable missingness. Finally, we illustrate the utility of our approach using Monte Carlo simulations and a placebo-controlled randomized field experiment on the effects of persuasion on social attitudes with survey-based outcome measures.
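The logic of combining double sampling with worst-case bounds can be illustrated with a minimal sketch. It is not the estimator derived in the paper: it assumes the follow-up sample is a simple random sample of the initial non-respondents and that outcomes are known to lie in `[y_min, y_max]`, and it omits the variance estimators, confidence intervals, and covariate adjustment. The function names and arguments are hypothetical.

```python
def mean_bounds(y_initial, y_followup, n_assigned, n_followup_attempted,
                y_min, y_max):
    """Worst-case bounds on the mean outcome in one experimental arm.

    y_initial: outcomes observed in the first data collection round.
    y_followup: outcomes recovered from the intensive follow-up sample.
    n_assigned: number of units assigned to this arm.
    n_followup_attempted: initial non-respondents sampled for follow-up
        (assumed to be a random sample of all initial non-respondents).
    """
    n_initial = len(y_initial)
    p_respond = n_initial / n_assigned
    mean_initial = sum(y_initial) / n_initial if n_initial else 0.0
    if n_initial == n_assigned:
        return mean_initial, mean_initial  # nothing is missing

    # Share of follow-up attempts that produced an outcome.
    if n_followup_attempted:
        p_followup = len(y_followup) / n_followup_attempted
    else:
        p_followup = 0.0
    mean_followup = sum(y_followup) / len(y_followup) if y_followup else 0.0

    # Among initial non-respondents, outcomes that stay missing even after
    # follow-up are replaced by the logical extremes of the outcome scale.
    low_nonresp = p_followup * mean_followup + (1 - p_followup) * y_min
    high_nonresp = p_followup * mean_followup + (1 - p_followup) * y_max

    low = p_respond * mean_initial + (1 - p_respond) * low_nonresp
    high = p_respond * mean_initial + (1 - p_respond) * high_nonresp
    return low, high


def ate_bounds(treat_bounds, control_bounds):
    """Combine arm-level bounds into bounds on the average treatment effect."""
    return (treat_bounds[0] - control_bounds[1],
            treat_bounds[1] - control_bounds[0])


# Made-up illustration with a binary outcome.
treat = mean_bounds([1, 1, 0, 1], [1], n_assigned=8,
                    n_followup_attempted=2, y_min=0, y_max=1)
control = mean_bounds([0, 1, 0, 0], [0], n_assigned=8,
                      n_followup_attempted=2, y_min=0, y_max=1)
print(ate_bounds(treat, control))
```

The follow-up round narrows the bounds because only the units that remain missing after follow-up are assigned the extreme values, and they are weighted by the (smaller) share of the arm they represent.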
Randomized experiments are considered the gold standard for causal inference because they can provide unbiased estimates of treatment effects for the experimental participants. However, researchers and policymakers are often interested in using a specific experiment to inform decisions about other target populations. In education research, increasing attention is being paid to the potential lack of generalizability of randomized experiments because the experimental participants may be unrepresentative of the target population of interest. This article examines whether generalization may be assisted by statistical methods that adjust for observed differences between the experimental participants and members of a target population. The methods examined include approaches that reweight the experimental data so that participants more closely resemble the target population and methods that utilize models of the outcome. Two simulation studies and one empirical analysis investigate and compare the methods’ performance. One simulation uses purely simulated data while the other utilizes data from an evaluation of a school-based dropout prevention program. Our simulations suggest that machine learning methods outperform regression-based methods when the required structural (ignorability) assumptions are satisfied. When these assumptions are violated, all of the methods examined perform poorly. Our empirical analysis uses data from a multisite experiment to assess how well results from a given site predict impacts in other sites. Using a variety of extrapolation methods, predicted effects for each site are compared to actual benchmarks. Flexible modeling approaches perform best, although linear regression is not far behind. Taken together, these results suggest that flexible modeling techniques can aid generalization while underscoring the fact that even state-of-the-art statistical techniques still rely on strong assumptions.
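As a rough illustration of the reweighting idea, the sketch below post-stratifies experimental estimates to a target population. This is the simplest form of reweighting and is not necessarily one of the methods compared in the article; it assumes a single discrete covariate, known target-population shares, and that treatment effects are ignorable given the stratum. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def poststratified_ate(units, target_shares):
    """Reweight stratum-level effects to match a target population.

    units: iterable of (stratum, treated, outcome) for experimental subjects.
    target_shares: dict mapping stratum -> share of the target population.
    """
    # cells[stratum][treated] holds [sum of outcomes, number of units].
    cells = defaultdict(lambda: {True: [0.0, 0], False: [0.0, 0]})
    for stratum, treated, outcome in units:
        cell = cells[stratum][bool(treated)]
        cell[0] += outcome
        cell[1] += 1

    ate = 0.0
    for stratum, share in target_shares.items():
        t_sum, t_n = cells[stratum][True]
        c_sum, c_n = cells[stratum][False]
        if t_n == 0 or c_n == 0:
            # No overlap: the experiment cannot speak to this stratum.
            raise ValueError(f"stratum {stratum!r} lacks treated or control units")
        ate += share * (t_sum / t_n - c_sum / c_n)
    return ate


# Made-up data: the experiment over-represents urban schools.
sample = [
    ("urban", 1, 3.0), ("urban", 1, 5.0), ("urban", 0, 1.0), ("urban", 0, 1.0),
    ("rural", 1, 2.0), ("rural", 0, 2.0),
]
print(poststratified_ate(sample, {"urban": 0.25, "rural": 0.75}))
```

The `ValueError` branch reflects the point made in the abstract: when a stratum of the target population has no counterpart in the experiment, no amount of reweighting can recover its effect without further assumptions.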
Formal models of revolutionary collective action suggest that ‘informational cascades’ play a crucial role in overcoming collective action problems. These models highlight how information about the aggregate level of participation in collective action conveys information about others’ political preferences, and how such informational cues allow potential participants to update their beliefs about the value of participating in antiregime collective action. In authoritarian regimes, foreign mass media are often the only credible source of information about antiregime protests. However, limited robust evidence exists on whether foreign media can indeed serve as a coordination device for collective action. This article makes use of a detailed dataset on protest events during the 1989 East German revolution and exploits the fact that West German television broadcasts could be received in most but not all parts of East Germany. Across a wide range of Cox proportional hazards models and conditional on a rich set of observables, it finds that the availability of West German television did not affect the probability of protest events occurring. The evidence presented here does not support the widely accepted ‘fact’ that West German television served as a coordination device for antiregime protests during the East German revolution. More broadly, it also calls into question strong claims about the effects of communication technology on revolutionary collective action.
Survey experimenters routinely test for systematically varying treatment effects by using interaction terms between the treatment indicator and covariates. Parametric models, such as linear or logistic regression, are currently used to search for systematic treatment effect heterogeneity but suffer from several shortcomings; in particular, the potential for bias due to model misspecification and the large amount of discretion they introduce into the analysis of experimental data. Here, we explicate what we believe to be a better approach. Drawing on the statistical learning literature, we discuss Bayesian Additive Regression Trees (BART), a method for analyzing treatment effect heterogeneity. BART automates the detection of nonlinear relationships and interactions, thereby reducing researchers’ discretion when analyzing experimental data. These features make BART an appealing "off-the-shelf" tool for survey experimenters who want to model systematic treatment effect heterogeneity in a flexible and robust manner. In order to illustrate how BART can be used to detect and model heterogeneous treatment effects, we reanalyze a well-known survey experiment on welfare attitudes from the General Social Survey.
Do foreign media facilitate the diffusion of protest in authoritarian regimes? Apparently for the first time, the author tests this hypothesis using aggregate and survey data from communist East Germany. The aggregate-level analysis takes advantage of the fact that West German television broadcasts could be received in most but not all parts of East Germany. The author exploits this "natural experiment" by conducting a matched analysis in which counties without West German television are matched to a comparison group of counties with West German television. Comparing these two groups of East German counties, the author finds no evidence that West German television affected the speed or depth of protest diffusion during the 1989 East German revolution. He also analyzes a survey of East German college students. Confirming the aggregate-level results, the survey data show that, at least among college students, exposure to West German television did not increase protest participation.
Randomized experiments commonly compare subjects receiving a treatment to subjects receiving a placebo. An alternative design, frequently used in field experimentation, compares subjects assigned to an untreated baseline group to subjects assigned to a treatment group, adjusting statistically for the fact that some members of the treatment group may fail to receive the treatment. This article shows the potential advantages of a three-group design (baseline, placebo, and treatment). We present a maximum likelihood estimator of the treatment effect for this three-group design and illustrate its use with a field experiment that gauges the effect of prerecorded phone calls on voter turnout. The three-group design offers efficiency advantages over two-group designs while at the same time guarding against unanticipated placebo effects (which would undermine the placebo-treatment comparison) and unexpectedly low rates of compliance with the treatment assignment (which would undermine the baseline-treatment comparison).
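The two comparisons that the three-group design combines can be sketched separately. The code below is not the maximum likelihood estimator presented in the article; it only shows the two simple estimators of the effect among subjects who would receive the treatment if assigned to it. It assumes one-sided noncompliance and an exclusion restriction for the baseline comparison, and an inert placebo with the same contact process as the treatment for the placebo comparison. All names and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cace_baseline_design(y_treatment_group, contacted, y_baseline_group):
    """Baseline-treatment comparison: intent-to-treat effect divided by
    the contact rate in the treatment group."""
    itt = mean(y_treatment_group) - mean(y_baseline_group)
    contact_rate = mean(contacted)
    if contact_rate == 0:
        raise ValueError("no one in the treatment group was treated")
    return itt / contact_rate


def cace_placebo_design(y_treatment_contacted, y_placebo_contacted):
    """Placebo-treatment comparison: difference in means among subjects
    who were actually reached in each group."""
    return mean(y_treatment_contacted) - mean(y_placebo_contacted)


# Made-up turnout data (1 = voted).
y_treat = [1, 1, 0, 0]
reached = [1, 1, 0, 0]      # who answered the treatment call
y_base = [1, 0, 0, 0]
print(cace_baseline_design(y_treat, reached, y_base))

y_treat_reached = [1, 1]
y_placebo_reached = [1, 0]  # who answered the placebo call
print(cace_placebo_design(y_treat_reached, y_placebo_reached))
```

A low contact rate inflates the variance of the first estimator, while a placebo that affects the outcome biases the second, which is why having both comparisons available is useful.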
This article uses British Household Panel Survey data to estimate the effects of divorce and widowhood on political attitudes and political behavior. In contrast to previous research, which mostly relied on cross-sectional data, a matched propensity score analysis does not find any effects of transitions out of marriage on policy preferences, party identification, or vote choice. The results also show that divorce (but not widowhood) substantially reduces electoral participation. Some preliminary evidence suggests that this effect of divorce on turnout is partially attributable to the increased residential mobility that accompanies divorce.
Regression discontinuity (RD) designs enable researchers to estimate causal effects using observational data. These causal effects are identified at the point of discontinuity that distinguishes those observations that do or do not receive the treatment. One challenge in applying RD in practice is that data may be sparse in the immediate vicinity of the discontinuity. Expanding the analysis to observations outside this immediate vicinity may improve the statistical precision with which treatment effects are estimated, but including more distant observations also increases the risk of bias. Model specification is another source of uncertainty; as the bandwidth around the cutoff point expands, linear approximations may break down, requiring more flexible functional forms. Using data from a large randomized experiment conducted by Gerber, Green, and Larimer (2008), this study attempts to recover an experimental benchmark using RD and assesses the uncertainty introduced by various aspects of model and bandwidth selection. More generally, we demonstrate how experimental benchmarks can be used to gauge and improve the reliability of RD analyses.
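The bandwidth trade-off described above can be made concrete with a minimal local linear RD sketch. It is not the specification used in the study: it assumes a sharp design in which units at or above the cutoff are treated, uses a uniform kernel, and fits a separate line on each side of the cutoff. The function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("running variable has no variation in this window")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(running, outcome, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff, using only observations
    within `bandwidth` of the cutoff."""
    # Center the running variable so each intercept is the fit at the cutoff.
    left = [(x - cutoff, y) for x, y in zip(running, outcome)
            if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in zip(running, outcome)
             if cutoff <= x <= cutoff + bandwidth]
    if len(left) < 2 or len(right) < 2:
        raise ValueError("too few observations inside the bandwidth")
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Made-up data with a jump of 3 at the cutoff of 0.
xs = [-3, -2, -1, 0, 1, 2]
ys = [1 + 2 * x if x < 0 else 4 + 2 * x for x in xs]
print(rd_estimate(xs, ys, cutoff=0, bandwidth=2))
```

Widening `bandwidth` brings in more observations and reduces variance, but if the true relationship is not linear over the wider window, the fitted intercepts at the cutoff become biased.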
In this case study of the impact of West German television on public support for the East German communist regime, we evaluate the conventional wisdom in the democratization literature that foreign mass media undermine authoritarian rule. We exploit formerly classified survey data and a natural experiment to identify the effect of foreign media exposure using instrumental variable estimators. Contrary to conventional wisdom, East Germans exposed to West German television were more satisfied with life in East Germany and more supportive of the East German regime. To explain this surprising finding, we show that East Germans used West German television primarily as a source of entertainment. Behavioral data on regional patterns in exit visa applications and archival evidence on the reaction of the East German regime to the availability of West German television corroborate this result.
In this paper we demonstrate empirically that incumbency is a source of spillover effects in Germany's mixed electoral system. Using a quasi-experimental research design that allows for causal inferences under a weaker set of assumptions than the regression models commonly used in the electoral systems literature, we find that incumbency causes a gain of 1.4-1.7 percentage points in PR vote shares. We also present simulations of Bundestag seat distributions to show that spillover effects caused by incumbency are sufficiently large to trigger significant shifts in parliamentary majorities.
This paper takes a fresh look at the midterm loss in German elections and argues that government type is a crucial determinant of midterm loss. Using panel regressions on a newly compiled data set covering all state elections during the period 1949–2004, we find that systematic midterm losses occur only when both chambers of the federal legislature (Bundestag and Bundesrat) are controlled by one party or a party coalition. Prior research has failed to discover this important regularity. These findings lend strong support to electoral balancing models while calling into doubt more traditional explanations of midterm loss.
- Charles Crabtree, Chris Fariss, and Holger L. Kern. 2016. "Truth replaced by silence: An internet experiment on private censorship in Russia." Under review.
- Charles Crabtree, Holger L. Kern, and Steven Pfaff. 2016. "Mass media and the diffusion of collective action in authoritarian regimes: The June 1953 East German uprising." Revise & resubmit at International Studies Quarterly.
- Charles Crabtree and Holger L. Kern. 2017. "Using electromagnetic signal propagation models for radio and television broadcasts: An introduction." Under review.
- Steven Pfaff, Charles Crabtree, Holger L. Kern, Chris Fariss, and Jason Jones. "Religious affiliation and discrimination in American public schooling: A field experiment to assess bias in K-12 principals."
- Charles Crabtree, Holger L. Kern, and David Siegel. "Cults of personality, preference falsification, and the dictator's dilemma."
Work in Progress
- Charles Crabtree, Steven Pfaff, Holger L. Kern, Jason Jones, and Chris Fariss. "Experimental evidence on race and class bias in K-12 principals."
- Charles Crabtree and Holger L. Kern. "To print or not to print? Private censorship in Russia."