Background information on the dataset:
library(Lahman)
library(tidyverse)
Additional links or sources:
Analyzing Baseball Data with R, Second Edition (Chapman & Hall/CRC The R Series) by Max Marchi, Jim Albert and Benjamin S. Baumer
library(tidyverse)
hof <- read_csv("../data/HOF-BATTING.csv")
hof <- hof %>%
  mutate(MidCareer = (From + To) / 2,
         Era = cut(MidCareer,
                   breaks = c(1800, 1900, 1919, 1941,
                              1960, 1976, 1993, 2050),
                   labels = c("19th Century", "Dead Ball",
                              "Lively Ball", "Integration",
                              "Expansion", "Free Agency",
                              "Long Ball")))
hof_eras <- summarize(group_by(hof, Era), N = n())
hof_eras
# A tibble: 7 × 2
  Era              N
  <fct>        <int>
1 19th Century    19
2 Dead Ball       19
3 Lively Ball     46
4 Integration     24
5 Expansion       21
6 Free Agency     19
7 Long Ball        8
This dataset groups Hall of Fame baseball players into seven eras of baseball, each known by its own nickname.
For example, the “19th Century” group contains players whose mid-career year (the average of their first and last seasons) is 1900 or earlier. The Lahman dataset ends with the “Long Ball” era, which covers players whose mid-career year falls after the 1993 MLB season.
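To make the era grouping concrete, here is a small sketch of how cut() assigns a mid-career year to an era. The breaks and labels are the ones used in the code above; the three players' first and last seasons are made-up values for illustration only and do not come from the dataset.

```r
# Era boundaries and labels, as used when creating the Era variable.
era_breaks <- c(1800, 1900, 1919, 1941, 1960, 1976, 1993, 2050)
era_labels <- c("19th Century", "Dead Ball", "Lively Ball", "Integration",
                "Expansion", "Free Agency", "Long Ball")

# Hypothetical first and last seasons for three made-up players.
from <- c(1888, 1925, 1990)
to   <- c(1902, 1939, 2008)
mid_career <- (from + to) / 2

# cut() uses right-closed intervals by default, for example (1800, 1900],
# so a mid-career year of exactly 1900 still counts as "19th Century".
era <- cut(mid_career, breaks = era_breaks, labels = era_labels)
print(data.frame(from, to, mid_career, era))
```

Because the era is based on the midpoint, a player who started in the 1880s and finished shortly after 1900 is still placed in the “19th Century” group.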
This bar graph shows the number of non-pitcher Hall of Famers in each era of baseball.
The bar graph shows that the “19th Century” and “Dead Ball” eras each have 19 players, while the “Lively Ball” era has the most with 46 players, and so forth.
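The plotting code is not shown here, so the following is only a sketch of one way such a bar graph could be drawn with ggplot2. To keep the example self-contained, it rebuilds the era counts by hand from the summary table instead of reading the CSV file; the axis labels and the object names era_names and bar_plot are assumptions.

```r
library(ggplot2)

# Era counts copied from the summary table; the factor levels keep the
# eras in chronological order instead of alphabetical order.
era_names <- c("19th Century", "Dead Ball", "Lively Ball", "Integration",
               "Expansion", "Free Agency", "Long Ball")
hof_eras <- data.frame(
  Era = factor(era_names, levels = era_names),
  N   = c(19, 19, 46, 24, 21, 19, 8)
)

# geom_col() draws one bar per era with height equal to N.
bar_plot <- ggplot(hof_eras, aes(x = Era, y = N)) +
  geom_col() +
  labs(x = "Baseball era", y = "Frequency")
print(bar_plot)
```

If the full hof data frame is available, ggplot(hof, aes(x = Era)) + geom_bar() should give the same picture, since geom_bar() counts the rows in each era itself.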
Like the bar graph, this dot plot shows the number of non-pitchers in each era of baseball in which they played.
A dot plot is helpful to some readers because, when there is a larger number of categories, it represents the eras more clearly.
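As with the bar graph, the code behind the dot plot is not shown, so this is a hedged sketch of one possible version using ggplot2. The counts are again typed in from the summary table, and the point size, axis labels, and the object names era_names and dot_plot are assumptions.

```r
library(ggplot2)

# Era counts copied from the summary table, kept in chronological order.
era_names <- c("19th Century", "Dead Ball", "Lively Ball", "Integration",
               "Expansion", "Free Agency", "Long Ball")
hof_eras <- data.frame(
  Era = factor(era_names, levels = era_names),
  N   = c(19, 19, 46, 24, 21, 19, 8)
)

# Putting the eras on the vertical axis gives each category its own row,
# which keeps the labels readable as the number of categories grows.
dot_plot <- ggplot(hof_eras, aes(x = N, y = Era)) +
  geom_point(size = 3) +
  labs(x = "Frequency", y = "Baseball era")
print(dot_plot)
```

Each era is marked by a single point at its count, so the reader compares positions along one axis instead of bar lengths.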