Wednesday, 10 December 2014

Multilevel modeling when clusters are heterogeneous. A Monte Carlo comparison of mixed effects random intercept and slope models, cluster-corrected OLS, and two-step approaches

Jan Paul Heisig (WZB), Merlin Schaeffer (WZB), and Johannes Giesecke (HU Berlin)

Social scientists generally rely on three broad modelling strategies to test hypotheses about contextual effects: random intercept and slope models (often referred to simply as 'mixed' or 'multilevel' models), pooled OLS with cluster-robust standard errors, and two-step approaches. Textbooks tell us that random intercept and slope models are the most efficient estimators, that two-step approaches buy robustness at the cost of efficiency, and that pooled OLS with cluster-robust standard errors sits somewhere in between. But how do these trade-offs play out in actual research settings? To address this question, we go beyond previous Monte Carlo studies by focusing on more realistic setups with complex data-generating processes.

The main setting that we investigate is that of cross-national comparisons, which are characterized by small numbers of contexts, many observations per context, and considerable heterogeneity of contexts. Our first scenario investigates how the different approaches perform when the effects of (several) control variables vary randomly across contexts, following a normal distribution. We find that all approaches are unbiased. However, 'simplistic' specifications of mixed and pooled OLS models, which treat the effects of control variables as fixed, can be highly inefficient. Two-step estimation and mixed models with the appropriate random effects structure perform much better. A review of recently published research in major sociology journals suggests that simplistic specifications of mixed effects (and pooled OLS) models may be a common problem.
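To make the logic of such a comparison concrete, here is a minimal toy Monte Carlo sketch in Python (standard library only). It is not the authors' simulation design: the data-generating process, all parameter values, and the names `cross_products`, `simulate_once` and `run_monte_carlo` are assumptions chosen for illustration. It has a single control variable `x` whose effect varies normally across countries and a country-level predictor `z` with true effect `gamma`. It compares only the point estimates of `gamma` from a 'simplistic' pooled OLS model and from two-step estimation. Mixed models and cluster-robust standard errors are left out. In this toy setup the efficiency gap depends on `x` having a nonzero mean.

```python
import random
import statistics


def cross_products(a, b):
    """Sum of products of deviations from the mean for two equal-length lists."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def simulate_once(rng, n_countries=25, n_obs=100, gamma=0.5, beta=1.0,
                  sd_slope=0.5, sd_intercept=0.3, mean_x=2.0):
    """Draw one data set; return (pooled OLS, two-step) estimates of gamma."""
    z_all, x_all, y_all = [], [], []
    z_by_country, intercept_by_country = [], []
    for _ in range(n_countries):
        z = rng.gauss(0.0, 1.0)                  # country-level predictor
        slope = beta + rng.gauss(0.0, sd_slope)  # effect of x varies by country
        shift = rng.gauss(0.0, sd_intercept)     # country-level error term
        x = [rng.gauss(mean_x, 1.0) for _ in range(n_obs)]
        y = [gamma * z + shift + slope * xi + rng.gauss(0.0, 1.0) for xi in x]

        # Two-step, step 1: separate regression of y on x within the country.
        slope_hat = cross_products(x, y) / cross_products(x, x)
        intercept_by_country.append(
            statistics.fmean(y) - slope_hat * statistics.fmean(x))
        z_by_country.append(z)

        z_all.extend([z] * n_obs)
        x_all.extend(x)
        y_all.extend(y)

    # Two-step, step 2: regress the estimated country intercepts on z.
    two_step = (cross_products(z_by_country, intercept_by_country)
                / cross_products(z_by_country, z_by_country))

    # 'Simplistic' pooled OLS of y on z and x with one common slope for x
    # (closed-form coefficient on z in a two-regressor model).
    szz, sxx = cross_products(z_all, z_all), cross_products(x_all, x_all)
    szx = cross_products(z_all, x_all)
    szy, sxy = cross_products(z_all, y_all), cross_products(x_all, y_all)
    pooled = (szy * sxx - sxy * szx) / (szz * sxx - szx ** 2)
    return pooled, two_step


def run_monte_carlo(reps=200, seed=2014):
    """Repeat the simulation; summarize mean and spread of both estimators."""
    rng = random.Random(seed)
    draws = [simulate_once(rng) for _ in range(reps)]
    pooled = [d[0] for d in draws]
    two_step = [d[1] for d in draws]
    return {
        "pooled_mean": statistics.fmean(pooled),
        "pooled_sd": statistics.stdev(pooled),
        "two_step_mean": statistics.fmean(two_step),
        "two_step_sd": statistics.stdev(two_step),
    }


if __name__ == "__main__":
    for name, value in run_monte_carlo().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, both estimators should centre on the true `gamma`, while the spread of the pooled estimates across replications should be noticeably larger than that of the two-step estimates. Setting `sd_slope=0.0` in `simulate_once` removes the slope heterogeneity and gives a baseline for comparison.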

Literature:

Heisig, Jan Paul, Schaeffer, Merlin and Giesecke, Johannes (2015): "Multilevel Modeling When the Effects of Lower-Level Variables Vary Across Clusters. A Monte-Carlo Comparison of Mixed-Effects Models, Cluster-Robust Pooled OLS and Two-Step Estimation". Available at SSRN: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2703431

Bryan, Mark L. and Jenkins, Stephen P. (2013): "Regression Analysis of Country Effects Using Multilevel Data: A Cautionary Tale". IZA Discussion Paper No. 7583.