Sociologist in Fall Creek Place: June 2012

Indiana has a Socialist candidate on the ballot this year for State Representative (Hamilton County, District 39), John Strinka. He had the energy to collect 459 qualifying signatures, and if my experience with collecting signatures for ballot access is any guide, he may have collected triple that. The question occurred to me, "What kind of neighborhoods vote for Socialist candidates here in the Midwest?" Ohio recently (2010) had a Socialist make the ballot for U.S. Senator, Dan La Botz. This creates a vast source of data, since there is precinct-level voting data for the entire state, over 11 million people. Theoretically, I could use GIS software to create a map overlay of the precinct data on top of Census and American Community Survey data, which is collected at the block and block-group level.

Unfortunately, I am not that energetic. I could only find precinct-level data county by county (88 counties), so each file would have to be downloaded separately and merged (never as easy as it should be). Merging the shape files would then introduce error at every intersection, because the boundaries of block groups (over 9,000) and precincts (also over 9,000) do not line up. Instead, I found an analysis by Mark Salling (Cleveland State University), who broke down the 2010 candidates at the block level, so I used his estimates for my analysis (although I would be interested to see how he did this...).
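For readers curious what a precinct-to-block-group estimate can involve, here is a minimal sketch of simple area-weighted allocation. It is not a description of Salling's method, which is not documented here. The function name, the IDs, and the toy numbers are all hypothetical, and the sketch assumes the overlap areas between precincts and block groups have already been measured in GIS software.

```python
from collections import defaultdict

def allocate_votes(precinct_votes, overlaps):
    """Spread each precinct's votes over block groups in proportion to area.

    precinct_votes: {precinct_id: vote_count}
    overlaps: list of (precinct_id, block_group_id, overlap_area)
    """
    # Total overlapped area per precinct, used to turn areas into shares.
    precinct_area = defaultdict(float)
    for precinct, _, area in overlaps:
        precinct_area[precinct] += area

    block_group_votes = defaultdict(float)
    for precinct, block_group, area in overlaps:
        share = area / precinct_area[precinct]
        block_group_votes[block_group] += precinct_votes.get(precinct, 0) * share
    return dict(block_group_votes)

# Hypothetical toy data: precinct P1 straddles two block groups.
precinct_votes = {"P1": 10, "P2": 4}
overlaps = [("P1", "BG_A", 3.0), ("P1", "BG_B", 1.0), ("P2", "BG_B", 2.0)]
estimates = allocate_votes(precinct_votes, overlaps)
```

Area weighting assumes voters are spread evenly across each precinct, which is rarely true, so every split boundary adds some error to the estimate.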

For the uninitiated, the American Community Survey (ACS) is an ongoing, random-sampled survey with far more detailed questions than the ten-year Census, although the ACS is part of the Census bureaucracy. When trying to access this much data (10,000 block groups), you need to use an FTP site, not the well-designed GUI access at factfinder2.census.gov. The FTP site contains 118 different *.txt files with "just" data, no labels. You must merge these with separate files that contain the labels, then merge the result with yet another file that contains the geography block codes. A very tedious process, indeed. But the result is access to a wealth of data. Each of the 118 files has several hundred specific questions on different topics, such as educational attainment, income, occupation, length of time it takes you to get to work, etc. The file that contains the labels for the data has almost 24,000 rows, each one being a variable label. LOTS of data. This is *just* the ACS data--the Census data is a completely separate process with another wealth of different variables.
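The three-way merge described above can be sketched roughly as follows. The file layouts, column names (`position`, `label`, `record_id`, `block_group`), and values are made up for illustration and are not the actual ACS file format; the point is only the order of the joins: values, then labels, then geography.

```python
import csv
import io

# Hypothetical, tiny stand-ins for the three kinds of files.
DATA_TXT = "1001,120,45\n1002,300,80\n"  # values only, no header row
LABELS_CSV = (
    "position,label\n"
    "1,Total population 25 and over\n"
    "2,Females 25 and over with a master's degree\n"
)
GEO_CSV = "record_id,block_group\n1001,BG_0001\n1002,BG_0002\n"

def load_labels(text):
    # Map each data column position to its human-readable label.
    return {int(r["position"]): r["label"] for r in csv.DictReader(io.StringIO(text))}

def load_geography(text):
    # Map each record ID to its block-group code.
    return {r["record_id"]: r["block_group"] for r in csv.DictReader(io.StringIO(text))}

def merge(data_text, labels, geography):
    merged = {}
    for row in csv.reader(io.StringIO(data_text)):
        record_id, values = row[0], row[1:]
        # Attach a label to every value, then key the row by block group.
        named = {labels[i]: float(v) for i, v in enumerate(values, start=1)}
        merged[geography[record_id]] = named
    return merged

table = merge(DATA_TXT, load_labels(LABELS_CSV), load_geography(GEO_CSV))
```

With real files the same steps would be repeated for each of the 118 data files, which is where most of the tedium comes from.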

I used a fairly simple analysis method to compare demographic data with Salling's analysis of block-level Socialist votes--data mining several of the files for correlations. I did not test all 24,000 variables, but focused on 10 specific domains I thought would yield the best results. Most correlations were < 0.05, completely insignificant, but some approached 0.20, still not that scientifically astounding, but perhaps interesting to explore. I chose correlations > 0.10, and combined similar high scoring variables to create more robust correlations.
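A minimal sketch of this kind of correlation screen is below. It assumes a plain Pearson correlation and a cutoff applied to the absolute value, so that negative relationships are kept too; the variable names and numbers are invented for illustration and are not the Ohio data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(var_x * var_y)

def screen(vote_share, variables, threshold=0.10):
    """Return (name, r) pairs with |r| above the threshold, strongest first."""
    results = {name: pearson(values, vote_share) for name, values in variables.items()}
    kept = {name: r for name, r in results.items() if abs(r) > threshold}
    return sorted(kept.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy data: one value per block group.
vote_share = [0.1, 0.2, 0.3, 0.4]
variables = {
    "pct_renters": [10, 20, 30, 40],
    "pct_vacant": [4, 3, 2, 1],
    "noise": [1, -1, -1, 1],
}
hits = screen(vote_share, variables)
```

Screening many variables this way will turn up some correlations by chance alone, which is one more reason to treat values near 0.10 as leads rather than findings.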

While there are several limitations at various steps in this process, here are some of the stronger correlations I found between block-group demographics and votes for the Socialist candidate in the 2010 Ohio Senate election (single variables only; I do not include here the compound variables I created). While I list very specific variables below, such as "females over 25 with an MA degree," they represent broader patterns obvious within the data, matched to similar variables. So, for example, this particular variable only has a correlation of .14, but there was a broader pattern of similar correlations with higher education in both men and women. The opposite is true of low levels of education: the matching variable, "females over 25 with only a high school degree," has a correlation of -.08, and several measures of less than college-level education for both men and women have consistently negative correlations. In this analysis, a positive correlation means block groups with more of that trait tend to vote Socialist, while a negative correlation means they tend not to vote Socialist.

Correlation	Variables
0.14	Females over 25 with an MA
-0.08	Females over 25 with only a HS education
0.13	Men who usually worked 15-34 hours per week during the last year
-0.12	Women not in the labor force during the last year
0.15	Male professionals: Management, business, education, law, arts, sciences, media
-0.18	Men in construction, maintenance, production, transportation, natural resources
0.22	Asians (in this particular dataset, no differentiation for ethnicities)
0.19	Renters
-0.19	Those who live in neighborhoods with high home vacancies

Income data was interesting. For example, for "median family income" there was only a correlation of 0.07, representing a fairly poor relationship to voting Socialist. However, the table below of income groupings clearly shows a relationship not picked up by the correlation analysis:

Median family income	Percent who voted Socialist
$2.5k-$30k	0.18%
$31k-$45k	0.15%
$46k-$65k	0.18%
$66k-$100k	0.22%
$100k+	0.28%
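A table like this can be built by grouping block groups into income brackets and pooling their votes, roughly as sketched below. The bracket edges follow the table, but how boundary values are assigned, the pooling of votes within a bracket, and the toy numbers are all assumptions made for illustration.

```python
# Upper edge of each bracket and its label (boundary handling is an assumption).
BRACKETS = [
    (30_000, "$2.5k-$30k"),
    (45_000, "$31k-$45k"),
    (65_000, "$46k-$65k"),
    (100_000, "$66k-$100k"),
    (float("inf"), "$100k+"),
]

def bracket_for(income):
    """Return the label of the first bracket whose upper edge covers income."""
    for upper, label in BRACKETS:
        if income <= upper:
            return label

def percent_by_bracket(block_groups):
    """block_groups: list of (median_family_income, socialist_votes, all_votes)."""
    totals = {}
    for income, socialist_votes, all_votes in block_groups:
        label = bracket_for(income)
        s, a = totals.get(label, (0, 0))
        totals[label] = (s + socialist_votes, a + all_votes)
    # Pool votes within each bracket, then express as a percent.
    return {label: 100.0 * s / a for label, (s, a) in totals.items() if a > 0}

# Hypothetical toy data, not the Ohio results.
toy = [(25_000, 2, 1000), (28_000, 1, 1000), (40_000, 3, 2000), (120_000, 3, 1000)]
rates = percent_by_bracket(toy)
```

Grouping can reveal a pattern that is not a straight line, such as a dip in the middle brackets, which a single correlation coefficient tends to understate because it only measures linear association.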

The income data was surprising in that I expected the higher income levels to vote more conservatively. However, the data showed the opposite, except for the poorest, who voted liberally (Socialist) at higher rates, along with the wealthier groups. The middle class, in this case, tended to vote the most conservatively. Part of this may be a factor of education, since social scientists have long known that liberal voting patterns follow from higher levels of education. The education voting patterns in this Ohio data set match that expectation.

However, another surprise for me was that the "trade" occupations were least likely to vote Socialist--a surprise because it is these jobs that historically have most benefitted from unions, which are rooted historically in Socialism. However, given that Democrats are currently the main defenders of union rights, the trades also have the most to lose if Democrats lose, and voting Socialist may risk handing victory to Republicans--at least this was the way I explained the data to myself.

In terms of employment data, it seems reasonable that men who can only find part-time work may be most likely to be dissatisfied with the status quo and vote for a third party, while the women not in the workforce may represent the more conservative "stay at home mom" group, and so would avoid voting Socialist. I found very little relationship between race and voting Socialist (except for the strong Asian correlation). Perhaps the phenomenon that keeps Blacks and Hispanics from voting Socialist is the same one that keeps trade workers from doing so: they fear they have the most to lose if the third party takes votes from Democrats, handing victory to Republicans.

An updated table of data I have posted earlier. Yellow (state/local consumption/investment) + Blue (federal consumption/investment) = Green (total) as a percent of GDP, data from Federal Reserve, June 2012. Red dotted line is total tax revenue as percent of GDP, data from OMB, June 2012.

Notice that there has been relatively little change in government spending as a % of GDP since the 1950s. Total government spending has remained between 17% and 24%, with the lowest level during that period under Clinton (1998, 17.4%), and the highest under Truman (1953, 23.9%) and Johnson (1967, 23.1%). As of 2011 we are at 20.1%.

Similarly, tax revenues as a % of GDP have been relatively unchanged. The highest level of tax revenue was also under Clinton (2000, 20.6%), and the lowest under Obama (2009, 15.1%).

Saturday, June 30, 2012

Who Votes Socialist Today?

Thursday, June 7, 2012

Government Spending as a Percent of GDP from 1952-2011