Along with the mapping in Python, I also wrote an R script that pulls data from a repository and runs a regression analysis on about 30 demographic, economic and political variables. The strongest yet most concise model predicts Killed from Population, Employment and Poverty, where:
Killed: Number of residents killed by police per county from 2013 to October 2017
Population: The total number of people living in the county in 2015, US Census
Employment: Percent of the county employed from 2011-2015, American Community Survey, US Census
Poverty: Percent of the county living in poverty from 2011-2015, American Community Survey, US Census
According to a pseudo-R^2, these three independent variables (IVs) explain about 77% of the variation in the data. The model and all IVs were significant at p<0.0001. Since the dependent variable (DV) was a count variable, literally counting the number of victims per county, I used a negative binomial model. The results indicate a slightly elevated risk of being killed by police in counties with larger populations, a decreased risk when employment is high, and an increased risk when poverty is high.
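For readers who want the model written out, here is a sketch of what a negative binomial regression on these three variables usually looks like. The post does not state the link function or the parameterization, so the log link and the quadratic variance form below are assumptions based on what common implementations use by default, and the symbols \beta_j, \mu_i and \theta are introduced here for illustration only.

```latex
% Assumed form: log link, quadratic (NB2) variance. Not confirmed by the post.
\begin{align}
\log \mu_i &= \beta_0 + \beta_1\,\mathrm{Population}_i
            + \beta_2\,\mathrm{Employment}_i
            + \beta_3\,\mathrm{Poverty}_i,
  \qquad \mu_i = \mathbb{E}[\mathrm{Killed}_i] \\
\operatorname{Var}(\mathrm{Killed}_i) &= \mu_i + \frac{\mu_i^2}{\theta}
\end{align}
```

Under this form, exp(\beta_j) is the multiplicative change in the expected count for a one-unit increase in variable j. A positive coefficient corresponds to the "increased" effects described above and a negative one to the "decreased" effect.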
The data from Mapping Police Violence, while an amazing resource, needed extensive cleaning. I found 60 cases where the county did not match the state identified, or where no county was listed. I manually found the correct counties through a Google search. (Edit: I have been informed that all of these issues have been corrected on the original website.)
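The kind of check that surfaces these problems can be sketched in a few lines of Python. This is not the script I used; the function name `find_county_problems`, the `VALID_COUNTIES` lookup and the sample rows are all made up for illustration, and a real run would need a full state-to-county reference table.

```python
# Hypothetical reference table: state abbreviation -> set of valid county names.
# A real check would load a complete list; these entries are illustrative only.
VALID_COUNTIES = {
    "CA": {"Los Angeles", "Alameda"},
    "TX": {"Harris", "Travis"},
}


def find_county_problems(records, valid_counties):
    """Return (row index, reason) pairs for rows with a missing or mismatched county."""
    problems = []
    for i, rec in enumerate(records):
        # Normalize: treat None as empty, trim whitespace, upper-case the state.
        county = (rec.get("county") or "").strip()
        state = (rec.get("state") or "").strip().upper()
        if not county:
            problems.append((i, "missing county"))
        elif county not in valid_counties.get(state, set()):
            problems.append((i, "county not found in state"))
    return problems


# Made-up rows, not taken from the Mapping Police Violence data.
records = [
    {"county": "Los Angeles", "state": "CA"},
    {"county": "Harris", "state": "CA"},   # county belongs to a different state
    {"county": "", "state": "TX"},         # no county listed
    {"county": "Travis", "state": "tx"},   # fine once the state is normalized
]

problems = find_county_problems(records, VALID_COUNTIES)
for index, reason in problems:
    print(index, reason)
```

Rows flagged this way still have to be resolved by hand, which is where the Google searches came in.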