Doctor of Business Administration (DBA)
The Stephenson Department of Entrepreneurship & Information Systems
With the rise of Big Data analytics, the new field of causal inference (Pearl, 2009) has received more attention in business research fields such as Accounting (Lawrence, Minutti-Meza, & Zhang, 2011) and Marketing (Manganaris, Bhasin, Reid, & Hermiz Keith, 2010). Traditional statistics focuses on correlation, which may lead to misleading conclusions because the estimates can be severely biased even when data sets are large. The objective of causal inference is to obtain estimates from observational data that are unbiased and can thus be interpreted as causal. This study provides a systematic comparison of the performance of four causal inference methods: Propensity Score Matching, Standardization, Inverse Probability Weighting, and Orthogonal Arrays. The risk difference, risk ratio, and odds ratio are compared for these estimators. This research uses bootstrapping with different sample sizes to ensure that reliable estimates of bias and mean squared error are obtained. Topics relevant to method selection are discussed, and recommendations for use of the methods are offered.
Additionally, two examples apply the suggestions and recommendations derived from the simulation to demonstrate how causal inference improves estimates. The first example explores the use of causative analytics for improving retention and graduation rates, applying a series of causal inference methods to semester-based information about student performance. The findings reveal that the effects of living on campus and of math preparation on student retention and graduation rates are considerably lower than traditional estimates showed. The second example investigates the relationship between the implementation of the UberX service and fatalities due to drunk driving among different age groups, along with the size of that effect. The findings show that while traditional methods indicate a statistically significant effect of UberX deployment on the number of DWI fatalities among youth ages 17-34 and older adults ages 35-65, the causal estimates are no longer statistically significant.
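The contrast between traditional (crude) and causal estimates described above can be illustrated with a minimal sketch. It is not taken from the dissertation: the counts in `CELLS`, the single binary confounder `L`, and all function names are hypothetical. It computes the risk difference, risk ratio, and odds ratio from crude risks, from Standardization, and from Inverse Probability Weighting, using stratum frequencies in place of a fitted propensity model.

```python
from collections import defaultdict

# Hypothetical counts (not from the dissertation):
# (confounder L, treatment A, outcome Y, number of units)
CELLS = [
    (0, 0, 1, 40), (0, 0, 0, 360),
    (0, 1, 1, 20), (0, 1, 0, 80),
    (1, 0, 1, 40), (1, 0, 0, 60),
    (1, 1, 1, 200), (1, 1, 0, 200),
]


def effect_measures(risk1, risk0):
    """Risk difference, risk ratio, and odds ratio for treated vs. untreated."""
    return {
        "risk_difference": risk1 - risk0,
        "risk_ratio": risk1 / risk0,
        "odds_ratio": (risk1 / (1 - risk1)) / (risk0 / (1 - risk0)),
    }


def _tally(cells):
    """Count units per stratum, units per (stratum, treatment), and events."""
    n_l, n_la, ev_la = defaultdict(int), defaultdict(int), defaultdict(int)
    for l, a, y, count in cells:
        n_l[l] += count
        n_la[(l, a)] += count
        ev_la[(l, a)] += y * count
    return n_l, n_la, ev_la


def crude_risks(cells):
    """Unadjusted risks: events divided by group size, ignoring L."""
    n, events = defaultdict(int), defaultdict(int)
    for _, a, y, count in cells:
        n[a] += count
        events[a] += y * count
    return events[1] / n[1], events[0] / n[0]


def standardized_risks(cells):
    """Standardization: average stratum-specific risks over the distribution of L."""
    n_l, n_la, ev_la = _tally(cells)
    total = sum(n_l.values())
    return tuple(
        sum(n_l[l] / total * ev_la[(l, a)] / n_la[(l, a)] for l in n_l)
        for a in (1, 0)
    )


def ipw_risks(cells):
    """IPW: weight each unit by 1 / P(A = a | L = l), then average over all units."""
    n_l, n_la, _ = _tally(cells)
    total = sum(n_l.values())
    weighted_events = {1: 0.0, 0: 0.0}
    for l, a, y, count in cells:
        p_received = n_la[(l, a)] / n_l[l]  # probability of the treatment actually received
        weighted_events[a] += y * count / p_received
    return weighted_events[1] / total, weighted_events[0] / total


if __name__ == "__main__":
    for name, risks in [
        ("crude", crude_risks(CELLS)),
        ("standardization", standardized_risks(CELLS)),
        ("ipw", ipw_risks(CELLS)),
    ]:
        print(name, effect_measures(*risks))
```

In this constructed data the confounder raises both the chance of treatment and the chance of the outcome, so the crude risk difference (0.28) overstates the adjusted one (0.10). Because the propensity scores here are plain stratum frequencies, Standardization and IPW agree exactly; with fitted models on real data the two estimators generally differ.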
Wang, Xuan, "A Comparison of Causal Inference Methods and Their Application in Big Data Analytics" (2018). LSU Doctoral Dissertations. 4613.