Preliminary Work: How We Dealt with the Data

Image Reference: 2019 CRIME in the UNITED STATES

Where and How to Find Data

In this project we focused on violent and property crime rates over the past twenty years and on economic factors that may be related to those rates. We used seven datasets in total:

table-1_edit.xls (Crime in the United States 2019, Table 1): (https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/topic-pages/tables/table-1)

We found the data on the FBI Uniform Crime Reporting (UCR) website. The data were put together by the FBI's UCR program (“Violent Crime”). The data were collected “for use in law enforcement” and to “[provide] information for students of criminal justice, researchers, the media, and the public” (“Services”).

In this data file, the relevant variables are year (from 2000 to 2019), violent crime rate, and property crime rate (per 100,000 people). The file also contains the population, the number of offenses in each specific category of violent and property crime, and the corresponding rate for each year.

unemployment_rate_edit.xlsx (Labor Force Statistics from the Current Population Survey): (https://data.bls.gov/timeseries/LNS14000000?years_option=all_years)

We found the data on the U.S. Bureau of Labor Statistics website. The data was collected by the government through the “Current Population Survey (CPS)”, which is conducted once a month (“How the Government Measures Unemployment”). The unemployment rates are collected in order to help people who do not have a job (“How the Government Measures Unemployment”).

In this file, we have the unemployment rate as a percentage for every month from 2000 to 2019. The unemployment rate = 100 * (number of people who are unemployed) / (number of people who are unemployed + number of people who are employed), counting only people aged 16 and over. To make the data easier to use, we calculate the average unemployment rate for every year.
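As a rough illustration of the two calculations described above, the sketch below computes an unemployment rate from counts and then averages twelve monthly rates into one yearly figure. The helper name unemployment_rate and all of the numbers are made up for illustration; they are not BLS figures.

```r
# Hypothetical helper: unemployment rate in percent.
# The denominator is the labor force: unemployed plus employed people.
unemployment_rate <- function(unemployed, employed) {
  100 * unemployed / (unemployed + employed)
}

# Made-up counts (in thousands), not real BLS data.
rate_example <- unemployment_rate(unemployed = 6000, employed = 144000)

# Made-up monthly rates for one year; the yearly figure is their simple mean.
monthly_rates <- c(4.0, 4.1, 4.0, 3.8, 4.0, 4.0, 4.0, 4.1, 3.9, 3.9, 3.9, 3.9)
yearly_average <- mean(monthly_rates)
```

Averaging the monthly rates treats every month equally, which is the same choice made in our joining code.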

current_dollars GDP_edit.xls (SAGDP2N Gross domestic product (GDP) by state 1/Gross domestic product (GDP) by state: All industry total (Millions of current dollars)): (https://apps.bea.gov/itable/iTable.cfm?ReqID=70&step=1)

We found the data on the Bureau of Economic Analysis (BEA) website. The data is estimated by BEA (“Gross Domestic Product”). The data can help to show how the economy is performing (“Gross Domestic Product”).

In this file, we have the nominal GDP of the US from 2000 to 2019. The variables are the years from 2000 to 2019, and under each year we have the nominal GDP of the US in millions of dollars. According to BEA, “[c]urrent-dollar” or “nominal” GDP estimates give “[t]he value of the goods and services produced in the United States” “based on market prices during the period being measured” (“Gross Domestic Product”). To make the data easier to use, we need to pivot the table longer with pivot_longer.

GDP_per_capita_by_states_edit.xls (SAINC1 Personal Income Summary: Personal Income, Population, Per Capita Personal Income): (https://apps.bea.gov/itable/iTable.cfm?ReqID=70&step=1)

We found the data on the Bureau of Economic Analysis website. The data was generated by the Bureau of Economic Analysis (“Income & Saving”). The income data can suggest how companies and workers are performing (“Income & Saving”). In addition, according to BEA, their income data may help people not only to forecast inflation but also to plan their consumption (“Income & Saving”). Moreover, “[s]tate and local personal income numbers also help the United States allocate hundreds of billions in federal funds to state and local governments each year” (“Income & Saving”).

In this file, we have per capita personal income in current dollars for each state (and DC) and for the U.S. as a whole from 2000 to 2019. The variables are “GeoName” and the years from “2000” to “2019”. Under each year variable, we have the per capita income for each GeoName. Therefore, to use the table more easily, we first need to pivot it longer with pivot_longer.

real_GDP_chained_2012_edit.xls (SAGDP9N Real GDP by state 1/Real GDP by state: All industry total (Millions of chained 2012 dollars)): (https://apps.bea.gov/itable/iTable.cfm?ReqID=70&step=1)

We also found this data on the BEA website. Like the nominal GDP data, it is estimated by BEA and can indicate how well the economy is performing (“Gross Domestic Product”).

In this data, we have the real GDP of the US from 2000 to 2019 in millions of chained 2012 dollars. The main variables are again the years, and under each year we have the real GDP of the US. According to BEA, “‘[r]eal’ or ‘chained’ GDP numbers have been adjusted to remove the effects of inflation over time, so different periods can be compared” (“Gross Domestic Product”). We also need to pivot the table longer here to put all the years into one column.

median_income_edit.xls (Real Median Household Income in the United States): (https://fred.stlouisfed.org/series/MEHOINUSA672N)

The data was generated by the Census Bureau. According to the Census Bureau, income-related data can be used to help, educate, and make plans (“Why We Ask Questions About…”). In this data, we have real median household income from 1984 to 2019. Because we only need data from 2000 to 2019, we read only those years when we load the table. We have two variables, year and real median income.

Gini_coefficient_edit.xlsx (World Development Indicators): (https://databank.worldbank.org/reports.aspx?source=2&series=SI.POV.GINI&country=USA)

We found the data on The World Bank website. According to The World Bank, “The Gini index provides a convenient summary measure of the degree of inequality” and “a Gini index of 0 represents perfect equality, while an index of 100 implies perfect inequality” (“Metadata Glossary”). In the dataset, the variables are mainly the years (from 2000 to 2018), and under each year there is the Gini coefficient of the US. In order to use the dataset, we also need to apply pivot_longer to put the years in one column.

Description and Process of Cleaning and Loading Data:

We loaded and cleaned our data in an R file called load_and_clean_data.R, which is located in the static folder and can be accessed by clicking load_and_clean_data.R. The data files we used had many footnotes and rows with details about the data, so we had to drop those rows and keep only the rows containing data. All of our data were in Excel files, and R has a package, readxl, that works with Excel files. We read in the files with the function read_excel from that package. The function has a parameter called range that limits the data read to a rectangle of cells. Using this parameter, we were able to extract just the data, including any headers, and leave out everything else. This worked because the data in each file occupies a rectangular block of cells.
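Here is a minimal sketch of reading only a rectangle of cells with read_excel. It uses the example workbook that ships with readxl (located with readxl_example) instead of our own files, so the range "C1:E4" is only illustrative and is not a range used in load_and_clean_data.R.

```r
library(readxl)

# Example workbook bundled with readxl, standing in for one of our Excel files.
path <- readxl_example("datasets.xlsx")

# Read only the rectangle C1:E4 of the first sheet:
# one header row plus three data rows, three columns wide.
# Cells outside this rectangle (for example footnotes below a table) are ignored.
subset_cells <- read_excel(path, range = "C1:E4")
subset_cells
```

For our files, the same idea applies with a range chosen to cover the header row and the data rows while excluding the title and footnote rows.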

Other than removing extra rows, most of the data didn’t need further cleaning, and the datasets that did were relatively easy to clean. In the dataset table_1_edit, footnote numbers had been loaded as part of some data values and column names, so we had to remove them. Because only a few values had this error, we fixed them by assigning the correct value directly to each affected cell. For the column names we used names(dataset)[c(the columns we wanted to fix)] <- c(the names we wanted to set) (“Rename Data Frame Columns in R”). We had to fix all of the column names, not only removing footnote numbers but also replacing spaces with underscores, because column names with spaces are harder to refer to in code. It was easier to do this by simply setting the column names to the designated names we wanted. We also fixed the column names of the dataset Gini_coefficient_edit with the function names, but there we set the names equal to str_sub(column name, beginning, end). str_sub extracts the part of each name that starts at the beginning index and ends at the end index. For the dataset real_household_median_income_edit, we also used the function names, but we set it equal to a vector of names instead of using str_sub. For this dataset we also used the function format to extract the year from a value in POSIXct form, passing the POSIXct data as the first argument and “%Y” as the second argument (MacQueen, “[R] Extract year from date”). As its name suggests, format formats an R object.
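The three cleaning steps above can be sketched on toy inputs as follows. The data frame toy, its messy column names, and the name pattern "2000 [YR2000]" are assumptions made for illustration; the actual names and indices in load_and_clean_data.R may differ.

```r
library(stringr)

# Toy data frame with messy column names (values are made up).
toy <- data.frame(a = 2000:2001, b = c(506.5, 504.5))
names(toy) <- c("Year", "Violent crime rate2")

# Overwrite selected column names by position.
names(toy)[c(1, 2)] <- c("year", "violent_crime_rate")

# Keep only characters 1 to 4 of each name, i.e. the year.
gini_names <- c("2000 [YR2000]", "2001 [YR2001]")
clean_gini_names <- str_sub(gini_names, 1, 4)

# Extract the year from POSIXct date-times; the result is a character vector.
dates <- as.POSIXct(c("2000-01-01", "2001-01-01"), tz = "UTC")
years <- format(dates, "%Y")
```

Note that both str_sub and format return character values, which is one reason a later conversion with as.double is needed before joining by year.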

Here we combine the datasets of crime rates in the United States with the economic factors that we believe affect those crime rates:

#Joining data
source(here::here("static/load_and_clean_data.R"))

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.5     ✓ dplyr   1.0.3
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.0

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

## New names:
## * `` -> ...1
## * `` -> ...2

# Average the twelve monthly rates to get one unemployment rate per year
unemployment_rates_average <-
  unemployment_rate_edit %>% 
  mutate(average_rates = (Jan + Feb + Mar + Apr +
                            May + Jun + Jul + Aug + 
                            Sep + Oct + Nov + Dec) / 12)
pivot_real_GDP <-
  GDP_real_term_edit %>%
  pivot_longer(c("2000", "2001", "2002", "2003", "2004", 
                 "2005", "2006", "2007", "2008", "2009", 
                 "2010", "2011", "2012", "2013", "2014", 
                 "2015", "2016", "2017", "2018", "2019"),
               names_to = "year", values_to = "real_GDP")

pivot_current_dollar_GDP <-
  GDP_current_dollars_edit %>%
  pivot_longer(c("2000", "2001", "2002", "2003", "2004", 
                 "2005", "2006", "2007", "2008", "2009", 
                 "2010", "2011", "2012", "2013", "2014", 
                 "2015", "2016", "2017", "2018", "2019"), 
               names_to = "year", values_to = "current_dollar_GDP")
pivot_Gini <-
  Gini_coefficient_edit %>%
  pivot_longer(c("2000", "2001", "2002", "2003", "2004", 
                 "2005", "2006", "2007", "2008", "2009", 
                 "2010", "2011", "2012", "2013", "2014", 
                 "2015", "2016", "2017", "2018"), 
               names_to = "year", values_to = "Gini_coefficient")
US_income_per_capita_pivot <- 
   US_income_per_capita_modified %>% 
   pivot_longer(as.character(2000:2019), 
                names_to = "year", values_to = "income_per_capita")

pivot_Gini$year <- 
  as.double(pivot_Gini$year)

pivot_real_GDP$year <- 
  as.double(pivot_real_GDP$year)

pivot_current_dollar_GDP$year <- 
  as.double(pivot_current_dollar_GDP$year)

real_household_median_income_edit$year <- 
  as.double(real_household_median_income_edit$year)

table_1_edit$year <- as.double(table_1_edit$year)

US_income_per_capita_pivot$year <-
   as.double(US_income_per_capita_pivot$year)

join_real <- 
  left_join(table_1_edit, pivot_real_GDP, by = "year")
(join_real)

## # A tibble: 20 x 25
##     year population violent_crime violent_crime_r… murder_and_nonn…
##    <dbl>      <dbl>         <dbl>            <dbl>            <dbl>
##  1  2000  281421906       1425486             506.            15586
##  2  2001  285317559       1439480             504.            16037
##  3  2002  287973924       1423677             494.            16229
##  4  2003  290788976       1383676             476.            16528
##  5  2004  293656842       1360088             463.            16148
##  6  2005  296507061       1390745             469             16740
##  7  2006  299398484       1435123             479.            17309
##  8  2007  301621157       1422970             472.            17128
##  9  2008  304059724       1394461             459.            16465
## 10  2009  307006550       1325896             432.            15399
## 11  2010  309330219       1251248             404.            14722
## 12  2011  311587816       1206005             387.            14661
## 13  2012  313873685       1217057             388.            14856
## 14  2013  316497531       1168298             369.            14319
## 15  2014  318907401       1153022             362.            14164
## 16  2015  320896618       1199310             374.            15883
## 17  2016  323405935       1250162             387.            17413
## 18  2017  325147121       1247917             384.            17294
## 19  2018  326687501       1209997             370.            16374
## 20  2019  328239523       1203808             367.            16425
## # … with 20 more variables: murder_and_nonnegligent_manslaughter_rate <dbl>,
## #   rape_revised_definition <dbl>, rape_revised_definition_rate <dbl>,
## #   rape_legacy_definition <dbl>, rape_legacy_definition_rate <dbl>,
## #   robbery <dbl>, robbery_rate <dbl>, aggravated_assault <dbl>,
## #   aggravated_assault_rate <dbl>, property_crime <dbl>,
## #   property_crime_rate <dbl>, burglary <dbl>, burglary_rate <dbl>,
## #   larceny_theft <dbl>, larceny_theft_rate <dbl>, motor_vehicle_theft <dbl>,
## #   motor_vehicle_theft_rate <dbl>, GeoFips <chr>, GeoName <chr>,
## #   real_GDP <dbl>

join_real_current_dollar <- 
  left_join(join_real, pivot_current_dollar_GDP, by = "year")
(join_real_current_dollar)

## # A tibble: 20 x 28
##     year population violent_crime violent_crime_r… murder_and_nonn…
##    <dbl>      <dbl>         <dbl>            <dbl>            <dbl>
##  1  2000  281421906       1425486             506.            15586
##  2  2001  285317559       1439480             504.            16037
##  3  2002  287973924       1423677             494.            16229
##  4  2003  290788976       1383676             476.            16528
##  5  2004  293656842       1360088             463.            16148
##  6  2005  296507061       1390745             469             16740
##  7  2006  299398484       1435123             479.            17309
##  8  2007  301621157       1422970             472.            17128
##  9  2008  304059724       1394461             459.            16465
## 10  2009  307006550       1325896             432.            15399
## 11  2010  309330219       1251248             404.            14722
## 12  2011  311587816       1206005             387.            14661
## 13  2012  313873685       1217057             388.            14856
## 14  2013  316497531       1168298             369.            14319
## 15  2014  318907401       1153022             362.            14164
## 16  2015  320896618       1199310             374.            15883
## 17  2016  323405935       1250162             387.            17413
## 18  2017  325147121       1247917             384.            17294
## 19  2018  326687501       1209997             370.            16374
## 20  2019  328239523       1203808             367.            16425
## # … with 23 more variables: murder_and_nonnegligent_manslaughter_rate <dbl>,
## #   rape_revised_definition <dbl>, rape_revised_definition_rate <dbl>,
## #   rape_legacy_definition <dbl>, rape_legacy_definition_rate <dbl>,
## #   robbery <dbl>, robbery_rate <dbl>, aggravated_assault <dbl>,
## #   aggravated_assault_rate <dbl>, property_crime <dbl>,
## #   property_crime_rate <dbl>, burglary <dbl>, burglary_rate <dbl>,
## #   larceny_theft <dbl>, larceny_theft_rate <dbl>, motor_vehicle_theft <dbl>,
## #   motor_vehicle_theft_rate <dbl>, GeoFips.x <chr>, GeoName.x <chr>,
## #   real_GDP <dbl>, GeoFips.y <chr>, GeoName.y <chr>, current_dollar_GDP <dbl>

join_real_current_dollar_unemployment <-
  left_join(join_real_current_dollar, unemployment_rates_average, by = "year")
(join_real_current_dollar_unemployment)

## # A tibble: 20 x 41
##     year population violent_crime violent_crime_r… murder_and_nonn…
##    <dbl>      <dbl>         <dbl>            <dbl>            <dbl>
##  1  2000  281421906       1425486             506.            15586
##  2  2001  285317559       1439480             504.            16037
##  3  2002  287973924       1423677             494.            16229
##  4  2003  290788976       1383676             476.            16528
##  5  2004  293656842       1360088             463.            16148
##  6  2005  296507061       1390745             469             16740
##  7  2006  299398484       1435123             479.            17309
##  8  2007  301621157       1422970             472.            17128
##  9  2008  304059724       1394461             459.            16465
## 10  2009  307006550       1325896             432.            15399
## 11  2010  309330219       1251248             404.            14722
## 12  2011  311587816       1206005             387.            14661
## 13  2012  313873685       1217057             388.            14856
## 14  2013  316497531       1168298             369.            14319
## 15  2014  318907401       1153022             362.            14164
## 16  2015  320896618       1199310             374.            15883
## 17  2016  323405935       1250162             387.            17413
## 18  2017  325147121       1247917             384.            17294
## 19  2018  326687501       1209997             370.            16374
## 20  2019  328239523       1203808             367.            16425
## # … with 36 more variables: murder_and_nonnegligent_manslaughter_rate <dbl>,
## #   rape_revised_definition <dbl>, rape_revised_definition_rate <dbl>,
## #   rape_legacy_definition <dbl>, rape_legacy_definition_rate <dbl>,
## #   robbery <dbl>, robbery_rate <dbl>, aggravated_assault <dbl>,
## #   aggravated_assault_rate <dbl>, property_crime <dbl>,
## #   property_crime_rate <dbl>, burglary <dbl>, burglary_rate <dbl>,
## #   larceny_theft <dbl>, larceny_theft_rate <dbl>, motor_vehicle_theft <dbl>,
## #   motor_vehicle_theft_rate <dbl>, GeoFips.x <chr>, GeoName.x <chr>,
## #   real_GDP <dbl>, GeoFips.y <chr>, GeoName.y <chr>, current_dollar_GDP <dbl>,
## #   Jan <dbl>, Feb <dbl>, Mar <dbl>, Apr <dbl>, May <dbl>, Jun <dbl>,
## #   Jul <dbl>, Aug <dbl>, Sep <dbl>, Oct <dbl>, Nov <dbl>, Dec <dbl>,
## #   average_rates <dbl>

join_real_current_dollar_unemployment_median_income <-
  left_join(join_real_current_dollar_unemployment, real_household_median_income_edit, by = "year")
(join_real_current_dollar_unemployment_median_income)

## # A tibble: 20 x 42
##     year population violent_crime violent_crime_r… murder_and_nonn…
##    <dbl>      <dbl>         <dbl>            <dbl>            <dbl>
##  1  2000  281421906       1425486             506.            15586
##  2  2001  285317559       1439480             504.            16037
##  3  2002  287973924       1423677             494.            16229
##  4  2003  290788976       1383676             476.            16528
##  5  2004  293656842       1360088             463.            16148
##  6  2005  296507061       1390745             469             16740
##  7  2006  299398484       1435123             479.            17309
##  8  2007  301621157       1422970             472.            17128
##  9  2008  304059724       1394461             459.            16465
## 10  2009  307006550       1325896             432.            15399
## 11  2010  309330219       1251248             404.            14722
## 12  2011  311587816       1206005             387.            14661
## 13  2012  313873685       1217057             388.            14856
## 14  2013  316497531       1168298             369.            14319
## 15  2014  318907401       1153022             362.            14164
## 16  2015  320896618       1199310             374.            15883
## 17  2016  323405935       1250162             387.            17413
## 18  2017  325147121       1247917             384.            17294
## 19  2018  326687501       1209997             370.            16374
## 20  2019  328239523       1203808             367.            16425
## # … with 37 more variables: murder_and_nonnegligent_manslaughter_rate <dbl>,
## #   rape_revised_definition <dbl>, rape_revised_definition_rate <dbl>,
## #   rape_legacy_definition <dbl>, rape_legacy_definition_rate <dbl>,
## #   robbery <dbl>, robbery_rate <dbl>, aggravated_assault <dbl>,
## #   aggravated_assault_rate <dbl>, property_crime <dbl>,
## #   property_crime_rate <dbl>, burglary <dbl>, burglary_rate <dbl>,
## #   larceny_theft <dbl>, larceny_theft_rate <dbl>, motor_vehicle_theft <dbl>,
## #   motor_vehicle_theft_rate <dbl>, GeoFips.x <chr>, GeoName.x <chr>,
## #   real_GDP <dbl>, GeoFips.y <chr>, GeoName.y <chr>, current_dollar_GDP <dbl>,
## #   Jan <dbl>, Feb <dbl>, Mar <dbl>, Apr <dbl>, May <dbl>, Jun <dbl>,
## #   Jul <dbl>, Aug <dbl>, Sep <dbl>, Oct <dbl>, Nov <dbl>, Dec <dbl>,
## #   average_rates <dbl>, median_income <dbl>

# pivot_Gini is the left table, so only its years (2000 to 2018) are kept
join_all_Gini <-
  left_join(pivot_Gini, join_real_current_dollar_unemployment_median_income, by = "year")
(join_all_Gini)

## # A tibble: 19 x 48
##    `Series Name` `Series Code` `Country Name` `Country Code` `2019`  year
##    <chr>         <chr>         <chr>          <chr>          <chr>  <dbl>
##  1 Gini index (… SI.POV.GINI   United States  USA            <NA>    2000
##  2 Gini index (… SI.POV.GINI   United States  USA            <NA>    2001
##  3 Gini index (… SI.POV.GINI   United States  USA            <NA>    2002
##  4 Gini index (… SI.POV.GINI   United States  USA            <NA>    2003
##  5 Gini index (… SI.POV.GINI   United States  USA            <NA>    2004
##  6 Gini index (… SI.POV.GINI   United States  USA            <NA>    2005
##  7 Gini index (… SI.POV.GINI   United States  USA            <NA>    2006
##  8 Gini index (… SI.POV.GINI   United States  USA            <NA>    2007
##  9 Gini index (… SI.POV.GINI   United States  USA            <NA>    2008
## 10 Gini index (… SI.POV.GINI   United States  USA            <NA>    2009
## 11 Gini index (… SI.POV.GINI   United States  USA            <NA>    2010
## 12 Gini index (… SI.POV.GINI   United States  USA            <NA>    2011
## 13 Gini index (… SI.POV.GINI   United States  USA            <NA>    2012
## 14 Gini index (… SI.POV.GINI   United States  USA            <NA>    2013
## 15 Gini index (… SI.POV.GINI   United States  USA            <NA>    2014
## 16 Gini index (… SI.POV.GINI   United States  USA            <NA>    2015
## 17 Gini index (… SI.POV.GINI   United States  USA            <NA>    2016
## 18 Gini index (… SI.POV.GINI   United States  USA            <NA>    2017
## 19 Gini index (… SI.POV.GINI   United States  USA            <NA>    2018
## # … with 42 more variables: Gini_coefficient <dbl>, population <dbl>,
## #   violent_crime <dbl>, violent_crime_rate <dbl>,
## #   murder_and_nonnegligent_manslaughter <dbl>,
## #   murder_and_nonnegligent_manslaughter_rate <dbl>,
## #   rape_revised_definition <dbl>, rape_revised_definition_rate <dbl>,
## #   rape_legacy_definition <dbl>, rape_legacy_definition_rate <dbl>,
## #   robbery <dbl>, robbery_rate <dbl>, aggravated_assault <dbl>,
## #   aggravated_assault_rate <dbl>, property_crime <dbl>,
## #   property_crime_rate <dbl>, burglary <dbl>, burglary_rate <dbl>,
## #   larceny_theft <dbl>, larceny_theft_rate <dbl>, motor_vehicle_theft <dbl>,
## #   motor_vehicle_theft_rate <dbl>, GeoFips.x <chr>, GeoName.x <chr>,
## #   real_GDP <dbl>, GeoFips.y <chr>, GeoName.y <chr>, current_dollar_GDP <dbl>,
## #   Jan <dbl>, Feb <dbl>, Mar <dbl>, Apr <dbl>, May <dbl>, Jun <dbl>,
## #   Jul <dbl>, Aug <dbl>, Sep <dbl>, Oct <dbl>, Nov <dbl>, Dec <dbl>,
## #   average_rates <dbl>, median_income <dbl>

Before we can join datasets, we first have to be able to access the datasets we loaded and cleaned in the load_and_clean_data.R file. Since that file is located in another folder, we have to use source(here::here("static/load_and_clean_data.R")). The function source runs another R file so that the functions and variables it defines can be accessed in the current file (“How to Reuse Functions That You Create In Scripts - Source a Function in R”). After calling source on the load_and_clean_data.R file, we can use all of the variables containing the datasets we loaded and cleaned.
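The behavior of source can be seen in a small self-contained sketch. It writes a stand-in script to a temporary file instead of using here::here and our real load_and_clean_data.R, and the object name cleaned_example is hypothetical.

```r
# Write a tiny stand-in script to a temporary file (hypothetical content).
script_path <- tempfile(fileext = ".R")
writeLines("cleaned_example <- data.frame(year = 2000:2002)", script_path)

# source() runs the script; the objects it creates are available afterwards.
source(script_path)
nrow(cleaned_example)
```

In our project, here::here plays the role of script_path by building the path to the script from the project root.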

The tool we used to join datasets is left_join, joining by the column “year”. We joined many datasets to end up with a final dataset that contains the violent and property crime rates for each year and all the economic factors we believed might have affected the crime rates. Two datasets were joined at a time, beginning with the base dataset table_1_edit, which contains the crime rates. Each remaining dataset was then joined to the result of the previous join. In the end we have two large datasets, US_no_Gini and US_with_Gini. We joined the Gini coefficient at the very end because it only contains data up to 2018, while the other datasets contain data up to 2019. US_no_Gini contains year, violent crime rate, property crime rate, and the other economic indicators we need except the Gini coefficient, while US_with_Gini contains the same variables plus the Gini coefficient. We keep two datasets because we need US_with_Gini for the initial plots in our Big Picture, but we only need US_no_Gini for further study once we determined that we do not need the Gini coefficient for further plots.
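The joining strategy can be sketched on toy tables. The table names crime, income, and gini and all values below are made up for illustration; only the pattern (join two tables at a time by "year", and add the shorter Gini table last) follows the description above.

```r
library(dplyr)

# Toy tables (made-up values): the Gini table stops one year early.
crime <- tibble(year = c(2017, 2018, 2019),
                violent_crime_rate = c(380, 370, 365))
income <- tibble(year = c(2017, 2018, 2019),
                 median_income = c(60000, 61000, 62000))
gini <- tibble(year = c(2017, 2018),
               Gini_coefficient = c(41.0, 41.5))

# Join two tables at a time, starting from the crime table.
no_gini <- left_join(crime, income, by = "year")

# left_join keeps every row of the left table, so putting the
# shorter Gini table on the left keeps only the years it covers.
with_gini <- left_join(gini, no_gini, by = "year")
```

Here no_gini has one row per year from 2017 to 2019, while with_gini has rows only for 2017 and 2018.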

We had to make some changes to the datasets before they could be joined together. We wanted to join by the year column, so we used the function pivot_longer to gather all of the years for pivot_real_GDP, pivot_current_dollar_GDP, US_income_per_capita_pivot and pivot_Gini into one column, because previously each year was its own column. Then we used dataset$year <- as.double(dataset$year) to change the type of the year column to double. The year makes more sense as a double than as a character, which was the type of the year column in some of the datasets. We also wanted all of the year columns to be of the same type, because we would not be able to join by year if the year columns were of different types. The year column in the unemployment_rates_average dataset was capitalized, so we renamed it to “year” at the very beginning of load_and_clean_data.R.
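A minimal sketch of this reshaping step on a toy table, assuming a wide layout with one column per year like the BEA files. The table wide and its values are made up for illustration.

```r
library(tidyr)

# Toy wide table with one column per year (values are made up).
wide <- data.frame(GeoName = "United States", `2000` = 10, `2001` = 11,
                   check.names = FALSE)

# Gather the year columns into a single "year" column.
long <- pivot_longer(wide, as.character(2000:2001),
                     names_to = "year", values_to = "real_GDP")

# The former column names arrive as character values,
# so convert them to double before joining by year.
long$year <- as.double(long$year)
```

After this step long has one row per year, with year stored as a double so that it matches the year columns of the other tables.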

New References:

“Gross Domestic Product.” Bureau of Economic Analysis, https://www.bea.gov/resources/learning-center/what-to-know-gdp. Accessed 24 April 2021.

“How the Government Measures Unemployment.” U.S. Bureau of Labor Statistics, 08 Oct. 2015, https://www.bls.gov/cps/cps_htgm.htm. Accessed 23 April 2021.

“Income & Saving.” Bureau of Economic Analysis, https://www.bea.gov/resources/learning-center/what-to-know-income-saving. Accessed 24 April 2021.

MacQueen, Don. “[R] Extract year from date.” 09 March 2015, https://stat.ethz.ch/pipermail/r-help/2015-March/426643.html. Accessed 08 April 2021.

“Metadata Glossary.” The World Bank, https://databank.worldbank.org/metadataglossary/gender-statistics/series/SI.POV.GINI#:~:text=The%20Gini%20index%20provides%20a,from%20nationally%20representative%20household%20surveys.&text=The%20distribution%20data%20have%20been,per%20capita%20income%20or%20consumption. Accessed 24 April 2021.

“Rename Data Frame Columns in R.” Datanovia, https://www.datanovia.com/en/lessons/rename-data-frame-columns-in-r/. Accessed 08 April 2021.

“Services.” FBI, https://www.fbi.gov/services/cjis/ucr/. Accessed 23 April 2021.

“Violent Crime.” FBI:UCR, https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/topic-pages/violent-crime. Accessed 23 April 2021.

“Why We Ask Questions About…” United States Census Bureau, https://www.census.gov/acs/www/about/why-we-ask-each-question/income/. Accessed 24 April 2021.

“How to Reuse Functions That You Create In Scripts - Source a Function in R.” Earth Data Science - Earth Lab, 22 Feb. 2017, www.earthdatascience.org/courses/earth-analytics/multispectral-remote-sensing-data/source-function-in-R/. Accessed 23 April 2021.

“2019 CRIME in the UNITED STATES.” FBI Uniform Crime Reporting, https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019. Accessed 23 April 2021. (header picture)
