School Shooting Data Analysis

I came across an interesting Kaggle dataset on U.S. school shootings from 1990 to 2022 and decided to poke around in it to see if I could find any trends. Since it was amassed from multiple sources, there were some duplicate entries, which I removed. I then filtered out incidents at colleges so that only K-12 data remained, dropped incidents that did not result in any fatalities, and visualized the results.
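The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the exact code behind the post: the field names (`date`, `school`, `city`, `school_level`, `killed`), the duplicate-matching rule, and the placeholder rows are all assumptions, and the real Kaggle columns may be named differently.

```python
# Minimal sketch of the cleaning steps. All field names are hypothetical.

def clean_incidents(rows):
    """Drop duplicate rows, drop college incidents, keep only fatal incidents."""
    seen = set()
    cleaned = []
    for row in rows:
        # Assumed rule: rows sharing date, school, and city are the same incident.
        # A real multi-source dataset may need fuzzier matching than this.
        key = (
            row["date"],
            row["school"].strip().lower(),
            row["city"].strip().lower(),
        )
        if key in seen:
            continue  # duplicate entry from another source
        seen.add(key)
        if row["school_level"] == "College":
            continue  # keep K-12 only
        if row["killed"] < 1:
            continue  # keep only incidents with at least one fatality
        cleaned.append(row)
    return cleaned


# Placeholder rows for illustration only (not real records).
sample = [
    {"date": "2000-01-01", "school": "Example High", "city": "Town A",
     "school_level": "High", "killed": 1},
    {"date": "2000-01-01", "school": "example high ", "city": "Town A",
     "school_level": "High", "killed": 1},   # duplicate of the first row
    {"date": "2001-02-02", "school": "Example College", "city": "Town B",
     "school_level": "College", "killed": 2},  # college, filtered out
    {"date": "2002-03-03", "school": "Example Middle", "city": "Town C",
     "school_level": "Middle", "killed": 0},   # no fatalities, filtered out
]

cleaned = clean_incidents(sample)
print(len(cleaned))
```

Removing duplicates before filtering means a repeated entry is dropped regardless of which source it came from, so the later counts are not inflated.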

Fatalities remain fairly consistent over time: between 2 and 37 per year, with a mean of 13. Thankfully, these numbers are quite small relative to the 49,400,000 U.S. students. Each school shooting is horrifically tragic, of course, but statistically it is a rare occurrence. News media outlets focus heavily on school shootings when they do happen, creating a false sense that they occur at a much higher rate. This is the principle behind Cultivation Theory: because the news focuses disproportionately on negative incidents, people are led to cultivate a disproportionate view of how often those events occur. To put the odds in perspective, I made a chart comparing the likelihood of a U.S. student dying in a school shooting with the odds of being struck by lightning. As you can see, you are more than twice as likely to be struck by lightning.
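The arithmetic behind that comparison can be sketched as below. The student count and the mean of 13 come from this post; the lightning figure is an assumption, taken from the commonly cited National Weather Service estimate of roughly 1 in 1,222,000 per year, and may not be the figure used for the chart. The sketch also treats every fatality as a student, which overstates the risk slightly.

```python
# Back-of-the-envelope odds comparison, per year.

STUDENTS = 49_400_000          # U.S. student count quoted in the post
MEAN_FATALITIES_PER_YEAR = 13  # mean fatalities quoted in the post

# Assumption: commonly cited NWS estimate of the yearly odds of being
# struck by lightning. The post does not state which figure it used.
LIGHTNING_ONE_IN = 1_222_000


def one_in_odds(events, population):
    """Express a rate as 'one in N' for a given population."""
    return population / events


# Simplification: counts every fatality as a student.
school_one_in = one_in_odds(MEAN_FATALITIES_PER_YEAR, STUDENTS)

# How many times more likely a lightning strike is under these assumptions.
ratio = school_one_in / LIGHTNING_ONE_IN

print(f"School shooting fatality: about 1 in {school_one_in:,.0f} per year")
print(f"Lightning strike: about 1 in {LIGHTNING_ONE_IN:,} per year")
print(f"Lightning is about {ratio:.1f} times more likely")
```

Under these assumptions the yearly odds come out to about 1 in 3.8 million, which is consistent with the "more than twice as likely" comparison.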

While I was at it, I also broke the data down by state and city. Texas and California have had the most fatalities, with 53 and 52, respectively. In Texas, cities near the southern border have been hit the hardest, with the remainder concentrated around the Dallas area. California also has pockets of increased incidents near Hollywood, Belmont, and Sacramento. The city with the largest number of fatalities, however, is Newtown, Connecticut, with 28. Newtown is an outlier, though, as this is where the infamous Sandy Hook Elementary School shooting took place, and that single incident is responsible for all 28 fatalities.
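A breakdown like this one amounts to summing fatalities per group. Here is a minimal sketch using `collections.Counter`, assuming hypothetical `state`, `city`, and `killed` fields and made-up placeholder rows rather than the actual dataset.

```python
from collections import Counter


def fatalities_by(rows, *fields):
    """Sum the 'killed' field for each combination of the given fields."""
    totals = Counter()
    for row in rows:
        key = tuple(row[field] for field in fields)
        totals[key] += row["killed"]
    return totals


# Placeholder rows for illustration only (not real records).
sample = [
    {"state": "State A", "city": "City X", "killed": 2},
    {"state": "State A", "city": "City Y", "killed": 1},
    {"state": "State B", "city": "City Z", "killed": 4},
]

by_state = fatalities_by(sample, "state")
# Group cities together with their state so same-named cities stay separate.
by_city = fatalities_by(sample, "state", "city")

# most_common() sorts groups from highest to lowest total.
for (state,), total in by_state.most_common():
    print(state, total)
for (state, city), total in by_city.most_common():
    print(city, state, total)
```

Comparing a city's total with its largest single incident is a quick way to spot outliers like Newtown, where one event accounts for the whole count.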