Crime Incident Analysis for 2024
This notebook performs an exploratory data analysis of crime incidents reported in 2024. It aims to identify patterns and potential anomalies in the dataset, using visualizations of offense types, shifts, methods, and geographic distribution to understand crime trends.
# Import Libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# Load Data
df = pd.read_csv('crime_incidents_in_2024.csv')
Offense Type Distribution
# Visualize Offense Types
plt.figure(figsize=(12, 6))
sns.countplot(y='OFFENSE', data=df, order=df['OFFENSE'].value_counts().index)
plt.title('Distribution of Offense Types')
plt.xlabel('Number of Incidents')
plt.ylabel('Offense Type')
plt.show()
Crime Incidents by Shift
# Visualize Crime Incidents by Shift
plt.figure(figsize=(8, 5))
sns.countplot(x='SHIFT', data=df, order=['DAY', 'EVENING', 'MIDNIGHT'])
plt.title('Crime Incidents by Shift')
plt.xlabel('Shift')
plt.ylabel('Number of Incidents')
plt.show()
Crime Methods Analysis
# Visualize Crime Methods
plt.figure(figsize=(10, 5))
sns.countplot(x='METHOD', data=df, order=df['METHOD'].value_counts().index)
plt.title('Crime Methods Used')
plt.xlabel('Method')
plt.ylabel('Number of Incidents')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
Homicide Incidents Using Guns
# Analyze Homicide Incidents with Gun Method
homicide_gun_incidents = df[(df['OFFENSE'] == 'HOMICIDE') & (df['METHOD'] == 'GUN')]
homicide_gun_count = len(homicide_gun_incidents)
print(f'Number of Homicide Incidents Using Guns: {homicide_gun_count}')
Number of Homicide Incidents Using Guns: 151
Crime Distribution by Neighborhood Cluster
# Visualize Crime Distribution by Neighborhood Cluster
plt.figure(figsize=(12, 6))
cluster_order = df['NEIGHBORHOOD_CLUSTER'].value_counts().nlargest(10).index # Top 10 clusters
sns.countplot(y='NEIGHBORHOOD_CLUSTER', data=df, order=cluster_order)
plt.title('Crime Incidents by Neighborhood Cluster (Top 10)')
plt.xlabel('Number of Incidents')
plt.ylabel('Neighborhood Cluster')
plt.show()
Theft Offenses by Ward
# Analyze Theft Offenses by Ward
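# Match any offense whose name contains 'THEFT'; na=False treats missing offenses as non-matches
theft_offenses = df[df['OFFENSE'].str.contains('THEFT', case=False, na=False)]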
plt.figure(figsize=(10, 5))
sns.countplot(x='WARD', data=theft_offenses, order=sorted(df['WARD'].dropna().unique()))
plt.title('Theft Offenses Distribution Across Wards')
plt.xlabel('Ward')
plt.ylabel('Number of Theft Incidents')
plt.show()
Insights from the Analysis
- Theft/Other and Theft F/Auto are the most frequent offenses, indicating property crime is a significant issue.
- The 'DAY' and 'EVENING' shifts report more incidents than 'MIDNIGHT', suggesting that reported crime is concentrated in daytime and evening hours.
- The 'OTHERS' method category is overwhelmingly dominant, suggesting non-violent and non-weapon-specific methods are most common, or it may indicate insufficient specificity in reporting methods.
- Homicide is far less frequent than theft, but 151 homicide incidents involved the 'GUN' method, highlighting the presence of violent crime.
- Clusters 21, 25, and 8 appear to have a higher frequency of reported incidents compared to other clusters.
- Theft from auto and theft/other are prevalent across various wards, suggesting a widespread nature of these property crimes.
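Statements such as "most frequent" or "overwhelmingly dominant" are easier to verify when expressed as percentages. The following is a minimal sketch, assuming the column names used earlier in this notebook; `category_shares` is a hypothetical helper, and the tiny `sample` frame is made-up data for illustration only, not the real dataset.

```python
import pandas as pd


def category_shares(frame: pd.DataFrame, column: str) -> pd.Series:
    """Return each category's share of incidents in percent, largest first.

    Missing values are ignored, which is the default for value_counts.
    """
    return (frame[column].value_counts(normalize=True) * 100).round(1)


# Made-up sample for illustration only (not the real dataset).
sample = pd.DataFrame({
    'SHIFT': ['DAY', 'DAY', 'EVENING', 'MIDNIGHT'],
    'METHOD': ['OTHERS', 'OTHERS', 'OTHERS', 'GUN'],
})

shift_shares = category_shares(sample, 'SHIFT')
method_shares = category_shares(sample, 'METHOD')
print(shift_shares)
print(method_shares)
```

Calling `category_shares(df, 'METHOD')` on the loaded data would show how dominant the 'OTHERS' category is in percentage terms.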
Anomalies Identified
- Several incidents have 'START_DATE' and 'END_DATE' values in years prior to 2024, suggesting potential data entry errors or inconsistencies.
- Some entries have missing values in 'BID', indicating potential incompleteness in Business Improvement District data.
- The broad categorization of 'THEFT/OTHER' may obscure more specific types of theft, reducing analytical granularity.
- The prevalence of 'OTHERS' as the crime method could limit detailed method-specific insights due to lack of specificity.
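The date and 'BID' anomalies listed above can be counted directly. This is a rough sketch rather than the check actually used for this notebook: `summarize_anomalies` is a hypothetical helper, the `sample` frame is made-up data, and the real file's date strings may need an explicit `format=` argument to `pd.to_datetime`.

```python
import pandas as pd


def summarize_anomalies(frame: pd.DataFrame, year: int = 2024) -> dict:
    """Count rows with dates before `year` and rows with a missing BID."""
    # Unparseable or missing dates become NaT and are not counted as "before".
    start = pd.to_datetime(frame['START_DATE'], errors='coerce')
    end = pd.to_datetime(frame['END_DATE'], errors='coerce')
    return {
        'start_before_year': int((start.dt.year < year).sum()),
        'end_before_year': int((end.dt.year < year).sum()),
        'missing_bid': int(frame['BID'].isna().sum()),
    }


# Made-up sample for illustration only (not the real dataset).
sample = pd.DataFrame({
    'START_DATE': ['2023-12-31 23:00:00', '2024-01-05 10:00:00', '2019-06-01 08:00:00'],
    'END_DATE': ['2024-01-01 01:00:00', '2024-01-05 11:00:00', None],
    'BID': [None, 'DOWNTOWN', None],
})

anomalies = summarize_anomalies(sample)
print(anomalies)
```

Running `summarize_anomalies(df)` on the loaded data would replace "several" and "some" with exact counts.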
Predictions
Status: No predictions possible
Reason: Insufficient data for time-series prediction. The data covers a limited period in 2024 without historical context necessary for trend analysis or seasonality. Predictive models require more extensive temporal data to reliably forecast future crime incidents.
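One way to confirm how much of the calendar the data covers is to count incidents per month. The sketch below assumes 'START_DATE' is the relevant timestamp column; `monthly_incident_counts` is a hypothetical helper and the `sample` frame is made-up data for illustration only.

```python
import pandas as pd


def monthly_incident_counts(frame: pd.DataFrame, date_column: str = 'START_DATE') -> pd.Series:
    """Return the number of incidents per calendar month, in chronological order."""
    # Rows whose date is missing or cannot be parsed are dropped.
    dates = pd.to_datetime(frame[date_column], errors='coerce').dropna()
    return dates.dt.to_period('M').value_counts().sort_index()


# Made-up sample for illustration only (not the real dataset).
sample = pd.DataFrame({
    'START_DATE': ['2024-01-03', '2024-01-20', '2024-02-11', None],
})

counts = monthly_incident_counts(sample)
print(counts)
print(f'Months covered: {len(counts)}')
```

A short series of monthly counts, as opposed to several years of them, is what makes seasonality and trend estimates unreliable here.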
Summary
Analysis of the 2024 crime incident data reveals a predominance of theft-related offenses, particularly Theft/Other and Theft F/Auto. Incidents are more frequent during the 'DAY' and 'EVENING' shifts. Specific neighborhood clusters and wards show higher incident counts than others. Anomalies include pre-2024 dates in the 'START_DATE' and 'END_DATE' columns, missing 'BID' values, and the very broad 'THEFT/OTHER' offense and 'OTHERS' method categories, which may reduce the granularity of insights. Predictive analysis is not feasible because the 2024 data covers a limited timeframe and lacks historical context.