The Murder Rate – A Regression with Many Variables
In this post, I want to look at the murder rate, by state. I ran a regression with the state murder rate for 2015 as the dependent variable, and literally threw the kitchen sink at it: demographics, weaponry, income, education, population density, etc. Basically, if it’s something some reasonable percentage of the population believes matters, and I could find data for it, I threw it into the hopper.
I also included variables relating to immigration status. The latter stems from some debate in the comments section of other posts, in which I stated my belief that illegal immigrants drive up the crime rate. Several detractors insisted that illegal immigrants have lower, not higher, crime rates than the rest of the population, and that I am racist to boot. Before presenting results, I will note – I am not too proud to admit the regression results did not fit with my preconceptions. I am also not too proud to admit the regression results did not fit with the preconceptions of my detractors. Finally, while I am always interested in whatever the data has to say, I suspect my detractors will really, really not like the results.
So… without further ado, the output from R:
What does this all mean? Simply put, only two variables are statistically significant at the 5% (or even 10%) level: percent of the population made up of non-Hispanic Whites, and population density. The greater the share of the population made up of non-Hispanic Whites, the lower the murder rate. On the other hand, the greater the population density, the higher the murder rate. To those who don’t use statistics very often, remember – this is taking into account all other variables.
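For readers who want to see the mechanics, here is a minimal sketch of this kind of multiple regression in Python (it is not the R run behind this post). The data are synthetic and the variable names (pct_white, pop_density, noise_var) are made up for illustration. The "true" coefficients are chosen by me so that the first two variables matter and the third does not. A t-statistic beyond roughly ±2 corresponds loosely to significance at the 5% level.

```python
import numpy as np

def ols_summary(X, y, names):
    """Fit y = b0 + X @ b by least squares; return {name: (coef, std_err, t_stat)}."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])          # prepend an intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)  # least-squares coefficients
    resid = y - design @ beta
    dof = n - (k + 1)                                  # residual degrees of freedom
    sigma2 = (resid @ resid) / dof                     # estimated error variance
    cov = sigma2 * np.linalg.inv(design.T @ design)    # coefficient covariance matrix
    se = np.sqrt(np.diag(cov))
    t = beta / se
    labels = ["intercept"] + list(names)
    return {lab: (b, s, tv) for lab, b, s, tv in zip(labels, beta, se, t)}

# Synthetic "states" (NOT the data used in this post).
rng = np.random.default_rng(0)
n_states = 50
pct_white = rng.uniform(30, 95, n_states)      # hypothetical share, in percent
pop_density = rng.uniform(1, 1200, n_states)   # hypothetical people per square mile
noise_var = rng.normal(0, 1, n_states)         # a variable with no real effect
murder_rate = (10 - 0.08 * pct_white + 0.004 * pop_density
               + rng.normal(0, 1, n_states))

X = np.column_stack([pct_white, pop_density, noise_var])
result = ols_summary(X, murder_rate, ["pct_white", "pop_density", "noise_var"])
for name, (coef, se, t) in result.items():
    print(f"{name:12s} coef={coef:+.4f}  se={se:.4f}  t={t:+.2f}")
```

Each coefficient is estimated with the other columns held fixed, which is what "taking into account all other variables" means in practice.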
Now, there are a few variables that come close to being statistically significant at the 10% level. In other words, it is possible (not necessarily likely, just possible) that under other circumstances – with a better defined model, or more precise variables – these variables would prove to be statistically significant as well. These variables are:
1. Percent of the population made up of foreign citizens here legally. That variable would have a negative effect on the murder rate if it were statistically significant.
2. Percent of the population that is Asian. This variable also would have a negative effect on the murder rate if it were statistically significant.
3. Percent of the population age 18 to 64. Obviously, most of the murders are committed by people within a subset of this range – probably around 18 to 30. If I had the data to separate out this cohort, I believe we would find that the more people in this cohort, the greater the murder rate.
So… what doesn’t matter? First, the percentage of the population made up of illegal immigrants. Ditto the percentage of the population made up of naturalized citizens. These did not increase the murder rate nor lower it. If the murder rate parallels the crime rate in general, then the media narrative that illegal immigrants have lower crime rates than the population as a whole is not supported and to some extent contradicted by the data.
Second, race & ethnicity don’t matter, at least once you pull out non-Hispanic Whites and maybe Asians. Holding all other variables (including education and income) constant, it doesn’t appear that the murder rate differs in a statistically significant way from one racial/ethnic group to another among the groups other than non-Hispanic Whites and Asians.
Median income doesn’t matter. Neither does the percentage of the population with an income under 20K. Or the percentage of the population with an income over 100K. Or education level. The murder rate is not affected by these variables.
Another thing that doesn’t matter is the degree to which the population happens to be armed. And Lord knows, there are all sorts of variables here. These include “destructive devices” (think grenades, rockets, missiles, mines, poison gas, explosives, or incendiary devices – apparently all these and more are registered by the ATF), machine guns, silencers, short barreled rifles, short barreled shotguns, or other. The innocuous sounding other group includes your garden variety revolvers and pistols.
So essentially, in summary – accounting for education, income, nativity, and immigration status, the regression suggests that having more non-Hispanic Whites decreases the murder rate, and having a greater population density increases the murder rate. No other variables in this regression are statistically significant.
Anyway, I can babble on about the results. For example, it would be interesting to see immigrants (both legal and illegal) broken up with enough granularity to see if the results of non-Hispanic Whites and Asians apply to immigrants as well.
But enough of my prattling. What are your thoughts?
As always, if you want my spreadsheet, drop me a line. If you contact me within a month of the publication of this post, I will send it to you and possibly make some sort of witty remark. Since I am adorable, I probably will send you my spreadsheet after that date as well, but I reserve the right to have a file crash, lose my computer, acquire dementia, or die if too much has elapsed. My contact info is my first name (mike) and a dot, then my last name (kimel – only one m there) at gmail dot com.
Links and details to the data are in my spreadsheet. But if you want to replicate it yourself (it was a pain in the butt, but who am I to stop you?) the data are listed below. Where possible (which was the case for only a few exceptions, as noted below), I tried to use 2015 data to match the murder rate.
2014 data on firearms came from Exhibit 8 from this document produced by the Bureau of Alcohol, Tobacco and Firearms.
Population from the Census. 2015 data was used for most purposes. 2014 data was used for firearms per capita data.
Population density from 2010 was obtained from the Census.
2015 median hh income came from the Census.
A number of other variables came from the Census CPS Table Creator. This was used for data on race, income, native v. naturalized citizens v. foreigner, educational attainment, age, and gender.
Pew estimates on illegal immigrants, including Mexican v. non-Mexican, were available for 2014.
Finally, the number of 2015 murders originated with the FBI, but was present in this handy dandy file compiled by the Murder Accountability Project.
Update… April 2, 2017 4:01 PM
I forgot to mention a couple corrections to the data:
1. The Pew data on % of illegal aliens that come from Mexico included a few NAs, in each case for states with a very low percentage of the population being made up of illegal immigrants. In those instances, I assigned the national average share (i.e., 52% of the unauthorized aliens are from Mexico).
2. The CPS table information on race and ethnicity had a few examples where no information was given for a given combination of race & ethnicity. In each case, it was possible to determine that the number was very small because the sum total of the other race & ethnicity combinations came close to 100%. In those instances, I simply replaced the NA with a zero.
It makes sense that population density would increase the murder rate. People clustered closer together have more opportunity, if nothing else.
I believe people clustered closer together have more frustrations as well. Traffic, noise, pollution, externalities and competition all increase when people are in close proximity. Plus, if you assume that the proportion of people who would frustrate you is fixed, you will run into more people whom you find bothersome in a high density environment than in a low density environment.
“It makes sense that population density would increase the murder rate.”
I believe the opposite is true in Canada. However, they also have a different distribution of non-Hispanic Whites in Canada.
If you don’t mind my asking, what specifically were your preconceptions, and why or what gave you these preconceptions? Or do you know where you got them, and how?
“I am not too proud to admit the regression results did not fit with my preconceptions.”
So there are two questions:
1. What were your preconceptions?
2. Where (or how) did you acquire these preconceptions?
Less density = higher murder rate in Canada? Seems odd.
You and I just spent a couple of days with me arguing that the SCAAP data show that illegal immigrants commit crimes at a higher rate than citizens, and you arguing that they commit crimes at a lower rate than citizens. A reasonable person would have a pretty good idea of what my preconceptions are, and yours, in this area.
As to how I reached those preconceptions…. well, the media keeps telling me that illegal aliens have a lower crime rate than citizens. But I also remember there are plenty of statistics like this. And I notice that commentary in the media is very biased in one direction. My experience is that when the media is biased in a given direction, it is often collectively wrong.
Since you asked, turnabout is fair play. How did you acquire your preconceptions on this topic that happen to contradict the output of the regression shown in this post?
The rate of violent crime relative to property crime has been roughly constant from 1994 to 2015 (latest Pew data report), at about 15%.
– 1993 rates (per 100k)
Violent Crime = 747.1
Property Crime = 4740
– 2015 rates (per 100k)
Violent Crime = 372.6
Property Crime = 2487
From the FBI’s report: The national murder rate in 2013 was 4.5 (per 100k), or less than 10% of the violent crime rate and thus less than 2% of the property crime rate in the same year.
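As a quick check of the roughly 15% figure, the two pairs of rates listed above can be divided directly. This is only arithmetic on the numbers already quoted in this comment. The label "later" for the second pair is my own placeholder (it appears to be the 2015 pair).

```python
# Rates per 100k as quoted above; "later" is a placeholder label for the second pair.
rates = {
    "1993": {"violent": 747.1, "property": 4740.0},
    "later": {"violent": 372.6, "property": 2487.0},
}

# Violent crime rate as a fraction of the property crime rate.
ratios = {label: r["violent"] / r["property"] for label, r in rates.items()}

for label, ratio in ratios.items():
    print(f"{label}: violent/property = {ratio:.1%}")
```

The ratio works out to about 15.8% for 1993 and about 15.0% for the later pair, so "constant at 15%" is a fair rounding even though both rates fell by roughly half.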
There's a huge difference in murders regionally in the US, so a national regression of murder rates on all the other variables will not show the relationships those variables have to murder rates within regions.
The following is from the above FBI source:
◾There were 4.5 murders per 100,000 people. The murder rate fell 5.1 percent in 2013 compared with the 2012 rate. The murder rate was down from the rates in 2009 (10.5 percent) and 2004 (18.3 percent). (See Tables 1 and 1A.)
◾Of the estimated number of murders in the United States, 43.8 percent were reported in the South, 21.4 percent were reported in the Midwest, 21.0 percent were reported in the West, and 13.8 percent were reported in the Northeast. (See Table 3.)
There were twice as many murders in the South as in the Midwest, but the population density is greater in the South than in the Midwest, so it's not clear without population normalization what the difference in murder rates would be, if any.
The West has the same number of murders as the Midwest even though the West includes metro LA, SF, and Seattle. Not sure whether those metro areas change the murder rates (per 100k) in the West vs. the Midwest.
Not just incidentally, the US murder rate is ~5 times that of Germany and ~2 times that of the UK (2013 data). I wonder if that has something to do with our 2nd Amendment?
I’m asking what your preconception was .. not what reasonable people might assume from your post or other posts. I’m asking your explicit preconception in your own words.
I think this is relevant to your post since in that post you made no mention of having any preconceptions at all.. which I believe misleads readers when they read your post that has in fact no foundation to support the conclusion you made and kept trying to make, and which apparently matched your preconceptions.
As to your reasons for your preconceptions:
You don’t believe the media because you focus on outliers (such as the guy whose kid was killed by a hit-run in LA, who is the only source that says illegals have more hit-runs than legals and citizens).
By your own words you’re looking for outliers who disagree with the mainstream media reports. You even believe that the media is biased in one direction… which you allege is a “bias” when it is just as well simply reporting unbiased information. So you have a preconception that the mainstream media is “wrong” and reporting false information or shifting the data it reports to be opposite what you want it to report.
So that is already a preconception .. and thus you haven’t answered where you get your preconception in the first place.
So I’m still waiting for both questions I posed to be answered.
1. What were your preconceptions?
2. Where (or how) did you acquire these preconceptions?
On your question to me.
I don’t have any preconceptions.. I go by the data and evidence.. I’m an analytic, trained in science and spent a career in engineering. My success in that career was only because I used analytic analysis of data and ignored others’ preconceptions: “that can’t be done”, “that’s not possible”, “that’s pie in the sky”, “you’re dreamin'”. Their preconceptions biased how they viewed objective data.. biasing their interpretations to their preconceptions.
On the topic of illegals and crime .. I’ve lived in CA, NorCal to be more specific. I’ve heard your kind of bullshit about illegals since 1964 and so have been very familiar with the facts since at least that time.
Also as a youth I lived in the CA central valley where the population of our city was ~ 50% illegals. Even though there was rampant and overt discrimination by the whites, segregated school districts, etc., I happened to become well acquainted with the people on the other “side of the tracks” by some random chance event and found much to my surprise that they and their parents, older brothers, uncles, etc. were nothing like I’d been told by everybody else (all the racist whites). That’s when I probably figured out that racism existed, which I hadn’t known existed until then… that racist beliefs were unfounded in fact by my own direct experience… I think I was 11 years old at that time.
In high school (in Europe at an American High School) there was black/white racism among the American kids at school. I wasn’t partial to it though from my experience in central CA as a youngster. I dated a black girl for 6 months. I attended black kids b-day parties, I danced with the black girls. I sat next to blacks on the bus. Not exclusively.. I just made no distinctions by race.
I had several close German friends I spent a lot of time with. I found they were highly racist about blacks (GI’s in Europe). So I introduced my German friends to my black friends and we did a few things together… and lo and behold in less than 6 months of a few interactions the German kids were completely turned about… experience and direct interactions make a huge difference in pre-conceptions and prejudices people hold.
One of my American high school buddies who had grown up in Virginia before being in Europe with blacks on the same bus, and in the same classes and dances, etc. confessed to me years later that it took him 3 years to realize that the blacks were not at all what he had been taught growing up in Virginia. He said until then he was an out-and-out racist, but changed over a course of 3 years of actual direct interactions and experiences with blacks in a social and academic environment.
As far as people are concerned I find they’re all the same irrespective of color, heritage, shape of nose or type of hair or slant of eye…. they are no different than anybody else. There are as many bad-guy whites as there are blacks and Hispanics.. My brother-in-law is Puerto-Rican.. we met when still in college… he came from a poor family whose parents worked at the canneries and lived in the poor hispanic part of town at that time.. with 4 brothers and sisters. He and his brothers and sisters were poor, but otherwise had the same aspirations, attitudes and desires as I did. My brother-in-law and his brothers and sister and I have talked about the racism and reasons for it for years as adults… and we all agree that it’s an unfounded prejudice, lack of interactions, differences in incomes, schools, opportunities not made available by overt prejudices, fears on both sides by lack of familiarity and by word of mouth. The obstacles in my brother-in-law’s path were huge.. persistent and unjust… most people would have given up trying to get through the racial division. His sister became an ultra-wealthy entrepreneur in her own business (multiple homes in Europe, East Coast and West Coast .. mansions, best cars and exotic sports cars, etc.)… and gave that up at 35, sold her business, to go back to school for a law degree.. she’s a judge now in LA. Why? Because she felt strongly that she had to do what she could to right the injustices of racism and that by becoming a lawyer (public defender) she could do far more good than giving her money to charity (which she also did in the millions of dollars).
So perhaps you use subjective preconceptions which come from somewhere (I’m waiting to find out where they come from), but I don’t… if I have them and somebody informs me I’m wrong or not being objective I resort to facts and data .. and I spend time to research.. but I don’t use stuff that comes from people who have an agenda and then select data to support it. I read it and come across it all the time.. but a little bit of knowledge of science and statistics makes it easy to discount the biased selective use of data.
Basically, I think you have preconceived beliefs about “right” and “wrong”.. “good” and “bad” which are based on a foundation which I will describe as partisan… U.S. white christian Anglo centric. I don’t use that foundation since it is an arbitrary one. If I’d been born of Chinese parents in China or black parents in Africa I wouldn’t have a U.S. white christian Anglo centric foundation for beliefs… so why should I think these are the “right” or “good” or “best”? It’s arbitrary.
Violent (non-fatal) Crime highly correlated with income
Per the Bureau of Justice Statistics, for 2008 – 2012
For the period 2008–12—
-Persons in poor households at or below the Federal Poverty Level (FPL) (39.8 per 1,000) had more than double the rate of violent victimization as persons in high-income households (16.9 per 1,000).
-Persons in poor households had a higher rate of violence involving a firearm (3.5 per 1,000) compared to persons above the FPL (0.8–2.5 per 1,000).
-The overall pattern of poor persons having the highest rates of violent victimization was consistent for both whites and blacks. However, the rate of violent victimization for Hispanics did not vary across poverty levels.
-Poor Hispanics (25.3 per 1,000) had lower rates of violence compared to poor whites (46.4 per 1,000) and poor blacks (43.4 per 1,000).
-Poor persons living in urban areas (43.9 per 1,000) had violent victimization rates similar to poor persons living in rural areas (38.8 per 1,000).
-Poor urban blacks (51.3 per 1,000) had rates of violence similar to poor urban whites (56.4 per 1,000).
I only point this out because Mr. Kimel’s regression
1) shows no statistical relation between poverty and Murder rates
2) he implies murder rates reflect crime statistics in general in his statement:
“If the murder rate parallels the crime rate in general, then the media narrative that illegal immigrants have lower crime rates than the population as a whole is not supported and to some extent contradicted by the data.”
Clearly Mr. Kimel’s implication (using the prefix “if”) is an assumption and assertion that’s full of shit… .where full of shit => completely and wholly contradicts academic analysis by the Department of Justice (among many other studies).
BTW, it took me less than one minute (literally) to find the Bureau of Justice report with one google search, “poverty v murder rate”, and another 2-3 minutes to read and comprehend.. It’s the only search I made or tried to see what the relationship was of poverty to crime or murder. I was curious because Mr. Kimel asserted obliquely that the independent variables relate to murder rates in much the same way as they relate to crime rates in general. I doubted this but thought I should find out what the academic studies say about it.
There is a universal general maxim… the greater the distance between actors (lower density), the less the interaction between actors per unit time. Thus, for a constant rate of (some action) per interaction between actors, the lower the density, the lower the rate of (that action), *all other things equal*.
This applies in physics and animals.. all species of non-plant life at least.
To show what you want to show, I would suggest instead of googling for victimization statistics, you google for offender statistics. Yes, most crime is intra and not inter-racial, but that becomes less true for groups that have fewer options about where they live (the urban poor) and the groups most likely to be desirable targets (the urban rich). The suburbs show very different results (Appendix Table 12 of the doc you cited) suggesting that you really, really cannot use victim as proxy for offender.
I left out the words “for this purpose” at the end of the last sentence.
Longtooth, that report looks at victimization rates. That a poor White in the inner city has the same chance of being victimized as does a poor Black means only that the criminals have no racial bias in picking their targets. It says nothing about the criminals who victimize them.
We also cannot jump to the conclusion that poverty causes crime. It is more likely that the same habits, beliefs, and culture that result in bad behavior (crime) also result in bad outcomes (relative poverty).
The literature on determinants of crime is simply ginormous, and I would suggest looking at it before running “kitchen sink” regressions with questionable variables. In any case, one regular finding in many studies over a long period of time (this is not a new topic of study at all) regarding income and crime is that the low income/crime link is stronger for property crime than for other kinds, big surprise.
“I threw it into the hopper.”
Still plenty of stuff you should throw in there. Hit the link with your name; select all; deposit and flush.
How about a preliminary factor analysis or principal component analysis before regression to generate a set of uncorrelated factors? I suspect some strong correlations among the explanatory variables. Finding a subset of uncorrelated factors and then regressing on the factors might show some interesting results. Alternatively you could look at the correlation matrix first and omit one of each pair of variables that are highly correlated. In this way, using several different regressions, you could obtain stronger evidence for your conclusions.
Also by reducing the number of explanatory variables you would save valuable degrees of freedom for your significance tests.
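The correlation-matrix alternative can be sketched in a few lines of Python. This is only an illustration on synthetic, standardized data. The variable names (median_income, pct_under_20k, pop_density) and the 0.8 cutoff are my assumptions, not anything taken from the post's spreadsheet.

```python
import numpy as np

def drop_correlated(X, names, threshold=0.8):
    """Greedily keep a column only if its |correlation| with every
    already-kept column is below the threshold."""
    corr = np.corrcoef(X, rowvar=False)   # columns are variables
    keep = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, i]) < threshold for i in keep):
            keep.append(j)
    return [names[j] for j in keep]

# Synthetic data (hypothetical): two income measures that move together,
# plus an unrelated density measure.
rng = np.random.default_rng(1)
n = 50
median_income = rng.normal(0, 1, n)
pct_under_20k = -0.9 * median_income + 0.1 * rng.normal(0, 1, n)
pop_density = rng.normal(0, 1, n)

X = np.column_stack([median_income, pct_under_20k, pop_density])
names = ["median_income", "pct_under_20k", "pop_density"]
kept = drop_correlated(X, names)
print(kept)
```

Which member of a correlated pair survives depends on column order, so in practice it is worth rerunning the regression with each choice, as the comment suggests.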
Welcome to Angry Bear
This guy reminds me of a freshman taking his or her first statistics class. Collinearity, simultaneity, and thus bias never addressed. As Barkley Rosser says “running ‘kitchen sink’ regressions with questionable variables.”
There, I’m done.
Part of the point of throwing in “questionable variables” is to see that the multiple-regression test shows they do not affect the dependent variable.
The problems here go well beyond mere collinearity and simultaneity. The really unpleasant one is endogeneity, although there is more than that as well. There is a reason why this literature is ginormous.
The Boomers drove up the crime rate in the late 60’s-early 90’s “crime boom” by themselves based on population density increase. The actual “crime wave” was never as impressive as it was made out to be by the media.
This is a blog. I learned a long time ago not to do anything more complex than a regression on the blog. If I built a self-adaptive algorithm to look for patterns, and then said the algorithm found X and Y and Z, nobody would accept it. Sure, it’s the kind of thing they expect from me at the office, but over here I’d have to spend the next six months explaining the whole thing. So, no go.
Now, perhaps I should have said something, but I did check for multicollinearity, and I did run various versions of the equation in the post that avoided such issues. For the most part (there were a few exceptions) they didn’t produce major changes in the results.
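For anyone wondering what such a check can look like, one common option is the variance inflation factor (VIF): regress each explanatory variable on all the others and compute 1 / (1 - R²). I am not claiming this is the exact check used for the post. The sketch below is in Python on synthetic data with made-up variable names, and the rule of thumb that a VIF above about 10 signals trouble is a convention, not a law.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X."""
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        centered = target - target.mean()
        r2 = 1 - (resid @ resid) / (centered @ centered)  # R² of column j on the rest
        factors.append(1.0 / (1.0 - r2))
    return factors

# Synthetic data (hypothetical): the first two columns are nearly collinear.
rng = np.random.default_rng(2)
n = 50
median_income = rng.normal(0, 1, n)
pct_under_20k = -0.9 * median_income + 0.1 * rng.normal(0, 1, n)
pop_density = rng.normal(0, 1, n)

X = np.column_stack([median_income, pct_under_20k, pop_density])
factors = vif(X)
for name, f in zip(["median_income", "pct_under_20k", "pop_density"], factors):
    print(f"{name:15s} VIF = {f:.1f}")
```

In this toy setup the two income columns get very large VIFs while the density column stays close to 1.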
I probably should have mentioned endogeneity, and did run various versions of the model avoiding the issues I could think of in the data. That was more for my own curiosity than anything else, frankly – leaving out Alaska and Hawaii, looking at only specific subsets of the data, killing off variables, etc. For example, the following variables achieve almost the same fit (using adj R2) as the full model:
unauth_pct_of_pop + pct_nonHispanicWhite + pct_nonHispanicBlack + median_income + pop_sq_mi
In this instance, median income is not significant and neither is the unauthorized percentage of the population, although admittedly the latter is not so far off (P value of 17%).
Everything else is significant and in the directions you’d expect if you know the literature. (I don’t, but I have looked at a few papers.)
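A note on why a five-variable model can match the full one on adjusted R2: the adjustment penalizes every extra variable, so a kitchen-sink model has to earn its extra fit. A small Python sketch of the standard formula follows. The R2 values, the variable counts, and n = 50 are made-up numbers for illustration only, not the values from my regressions.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

n_obs = 50  # assumption: one observation per state

# Hypothetical fits: a big model with a higher raw R², a small one with a lower raw R².
full = adjusted_r2(0.80, n_obs, 25)
reduced = adjusted_r2(0.64, n_obs, 5)
print(f"full model:    adj R2 = {full:.3f}")
print(f"reduced model: adj R2 = {reduced:.3f}")
```

With these illustrative inputs the two models land at nearly the same adjusted R2 (about 0.59 and 0.60) despite a large gap in raw R2.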
Now… I get called racist on this very blog more than enough, thank you very much. What, pray tell, do you think the reaction would be at this point to showing a positive and statistically significant (P-value well under 1%) result on the third variable? Well, the commentary would be “education and income matter.” So from the perspective of blogging, I figured there’d be more mileage and getting across more useful information by showing that education and income don’t matter.
Anyway, I do plan on reading a bit more of the literature. But among my many character flaws is the need to check data myself.
I appreciate your comment.
I’ve read something along those lines before, but have not checked the data myself.
1) I’m going to go with the whole academic literature, which finds poverty and crime go together, rather than your approach.
2) I’ll also go with the whole academic literature that finds that education goes with lower crime rates.