Background Info

Background information on the dataset:

library(Lahman)
library(tidyverse)

Additional links or sources:

Analyzing Baseball Data with R, Second Edition (Chapman & Hall/CRC The R Series) by Max Marchi, Jim Albert and Benjamin S. Baumer

https://www.amazon.com/Analyzing-Baseball-Data-Second-Chapman-dp-0815353510/dp/0815353510/ref=dp_ob_title_bk

Data Used from Lahman

library(tidyverse)
hof <- read_csv("../data/HOF-BATTING.csv")

hof <- hof %>%
  mutate(MidCareer = (From + To) / 2,
         Era = cut(MidCareer,
                   breaks = c(1800, 1900, 1919, 1941,
                              1960, 1976, 1993, 2050),
                   labels = c("19th Century", "Dead Ball",
                              "Lively Ball", "Integration",
                              "Expansion", "Free Agency",
                              "Long Ball")))

hof_eras <- summarize(group_by(hof, Era), N = n())
hof_eras
# A tibble: 7 × 2
  Era              N
  <fct>        <int>
1 19th Century    19
2 Dead Ball       19
3 Lively Ball     46
4 Integration     24
5 Expansion       21
6 Free Agency     19
7 Long Ball        8

This dataset groups Hall of Fame players into seven eras of baseball, each with its own nickname, according to the midpoint of each player's career (MidCareer).

For example, the “19th Century” group contains players whose mid-career year is 1900 or earlier, while the “Long Ball” era, the last group in the dataset, contains players whose mid-career year falls after the 1993 MLB season.
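
As a small illustration of how the era boundaries behave, the sketch below applies the same breaks and labels to a few made-up mid-career years. The values in mid_career are hypothetical and chosen only to sit on or near the break points; they are not taken from the dataset. With its default settings, cut() builds intervals that are closed on the right, so a mid-career year of exactly 1900 should land in “19th Century” and exactly 1993 in “Free Agency”.

```r
# Hypothetical mid-career years (not from HOF-BATTING.csv), picked to
# sit on or just past two of the break points
mid_career <- c(1899.5, 1900, 1900.5, 1993, 1993.5)

# Same breaks and labels as the mutate() call above; cut() defaults to
# right = TRUE, so each interval includes its upper bound
era <- cut(mid_career,
           breaks = c(1800, 1900, 1919, 1941,
                      1960, 1976, 1993, 2050),
           labels = c("19th Century", "Dead Ball",
                      "Lively Ball", "Integration",
                      "Expansion", "Free Agency",
                      "Long Ball"))

# Show each year next to the era it was assigned to
data.frame(mid_career, era)
```

Passing right = FALSE to cut() would instead move the boundary years into the later era, so the choice matters for players whose careers are centered exactly on a break.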

Bar Plot of the Era the Hall of Famers Played in


This bar graph shows the number of non-pitcher Hall of Famers who played in each era of baseball.

The bars show that the “19th Century” and “Dead Ball” eras each have 19 players, the “Lively Ball” era has the most with 46 players, and the “Long Ball” era has the fewest with 8.
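
The figure itself is not reproduced here, so the following is a minimal sketch of how a bar graph like this could be drawn with ggplot2. It rebuilds the era counts by hand from the hof_eras output printed earlier, because the CSV file is not included; the axis labels and the name bar_plot are assumptions, not taken from the original figure.

```r
library(ggplot2)

# Era names in chronological order, used as factor levels so the bars
# keep that order instead of being sorted alphabetically
eras <- c("19th Century", "Dead Ball", "Lively Ball", "Integration",
          "Expansion", "Free Agency", "Long Ball")

# Counts copied from the printed hof_eras tibble
hof_eras <- data.frame(
  Era = factor(eras, levels = eras),
  N   = c(19, 19, 46, 24, 21, 19, 8)
)

# geom_col() uses the precomputed counts in N as the bar heights
bar_plot <- ggplot(hof_eras, aes(x = Era, y = N)) +
  geom_col() +
  xlab("Baseball Era") +   # assumed label
  ylab("Frequency")        # assumed label

bar_plot
```

With the full hof data frame available, ggplot(hof, aes(x = Era)) + geom_bar() would give the same bar heights, since geom_bar() counts the rows in each era itself.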

Dot Plot of the Hall of Famers' Eras


Like the bar graph, this dot plot shows the number of non-pitchers in each era of baseball.

A dot plot can be easier to read than a bar graph when there are many categories, because each count is marked by a single point and the era labels stay legible.
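
As with the bar graph, the dot plot image is missing, so this is only a sketch of one way to produce it with ggplot2. The counts are again typed in from the printed hof_eras output, and the use of coord_flip(), the axis labels, and the name dot_plot are assumptions rather than a record of how the original figure was made.

```r
library(ggplot2)

# Era names in chronological order, kept as factor levels
eras <- c("19th Century", "Dead Ball", "Lively Ball", "Integration",
          "Expansion", "Free Agency", "Long Ball")

# Counts copied from the printed hof_eras tibble
hof_eras <- data.frame(
  Era = factor(eras, levels = eras),
  N   = c(19, 19, 46, 24, 21, 19, 8)
)

# One point per era; coord_flip() puts the era names on the vertical
# axis, where longer labels have room to be read
dot_plot <- ggplot(hof_eras, aes(x = Era, y = N)) +
  geom_point() +
  coord_flip() +
  xlab("Baseball Era") +   # assumed label
  ylab("Frequency")        # assumed label

dot_plot
```

Flipping the coordinates is a design choice: with seven or more category names, horizontal labels along the side are usually less crowded than labels squeezed under the horizontal axis.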