Beautiful Tables with gt

A few words on the gt package

With gt you can construct a wide variety of useful tables with a cohesive set of table parts. Here is the main workflow to transform a data frame into a beautiful table:

https://gt.rstudio.com/reference/figures/gt_workflow_diagram.svg

A gt table is made up of several parts. These include the table header, the stub, the column labels and spanner column labels, the table body, and the table footer.

https://gt.rstudio.com/reference/figures/gt_parts_of_a_table.svg
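To make the parts listed above concrete, here is a minimal sketch that touches each of them once. The `toy` data frame and its values are made up for illustration and are not part of the reservoir dataset; the functions are regular gt functions.

```r
library(gt)

# Hypothetical toy data, used only to illustrate the table parts
toy <- data.frame(
  lake = c("A", "B"),
  air_temp = c(24.9, 27.4),
  water_temp = c(23.9, 23.3)
)

toy_table <- toy |>
  gt(rowname_col = "lake") |>                  # stub: row labels
  tab_header(                                  # table header
    title = "Toy table",
    subtitle = "Illustrating the parts of a gt table"
  ) |>
  tab_stubhead(label = "Lake") |>              # label above the stub
  tab_spanner(                                 # spanner column label
    label = "Temperature (C)",
    columns = c(air_temp, water_temp)
  ) |>
  cols_label(                                  # column labels
    air_temp = "Air",
    water_temp = "Water"
  ) |>
  tab_source_note(source_note = "Made-up values.")  # table footer

# Show the gt Table
toy_table
```

The rest of this tutorial builds the same kind of table step by step on real data.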

Check out the gt website for more information: https://gt.rstudio.com/articles/gt.html

Our example

We are going to reproduce some of the tables from the paper "Seasonal Changes in Plankton Food Web Structure and Carbon Dioxide Flux from Southern California Reservoirs" (https://doi.org/10.1371/journal.pone.0140464) and create new ones using the data archived on Dryad (https://doi.org/10.5061/dryad.6tn4h).

Load the libraries

# install.packages("gt")
library(gt)
library(tidyverse)

Load the data

data <- read.csv("data/Adamczyk_PLOSdata.csv")

Have a look at the data

data |> glimpse()
Rows: 150
Columns: 24
$ year                                <int> 2013, 2013, 2013, 2013, 2013, 2013…
$ month                               <int> 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7…
$ day                                 <int> 26, 26, 26, 3, 3, 3, 10, 10, 10, 1…
$ julian_date                         <int> 13177, 13177, 13177, 13184, 13184,…
$ sampling_date                       <int> 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4…
$ lake                                <chr> "Murray", "Miramar", "Poway", "Mur…
$ h2o_temp_c                          <dbl> 26.11111, 24.44444, 25.55556, 25.8…
$ air_temp_c                          <dbl> 27.22222, 30.50000, 32.38889, 24.1…
$ wind_speed_kph                      <dbl> 6.437376, 9.656064, 6.437376, 10.4…
$ ph                                  <dbl> 8.85, 8.35, 8.99, 8.78, 8.45, 9.10…
$ total_alkalinity_ppm                <int> 196, 203, 172, 261, 271, 240, 91, …
$ chlorophyll_a_ugL                   <dbl> 4.30, 2.54, 1.03, 5.10, 2.29, 2.01…
$ salinity_ppt                        <dbl> 0.3796530, 0.3787916, 0.3497159, 0…
$ pco2_water_ppm                      <dbl> 173.438, 582.045, 106.694, 273.342…
$ bacteria_number_per_L               <int> NA, 14204, 18078, 10330, 18078, 14…
$ total_nitrogen_mgL                  <dbl> 3.170, 2.530, 1.630, 3.620, 3.120,…
$ total_phosphorus._mgL               <dbl> 0.01720, 0.00837, 0.00868, 0.02150…
$ dissolved_organic_carbon_mgL        <dbl> 4.450445, 2.726669, 3.482022, 5.21…
$ particulate_organic_nitrogen_mgL    <dbl> 0.06960000, NA, 0.02266667, 0.0696…
$ particulate_organic_carbon_mgL      <dbl> 0.30894545, NA, 0.07393939, 0.3427…
$ co2_flux_mmol_m_day                 <dbl> -2.039, 3.738, -2.646, -2.987, 6.5…
$ pco2_atmosphere_ppm                 <dbl> 399.06, 399.06, 399.06, NA, NA, NA…
$ zooplankton_community_avg_legnth_mm <dbl> 0.2245817, 0.7466187, 0.8807347, 0…
$ zooplankton_community_biomass_mgL   <dbl> 0.9987292, 4.9275362, 9.3057214, 3…

Let’s look at the first rows of our dataset again, this time using the gt package:

data |> 
  head() |>
  gt()
year month day julian_date sampling_date lake h2o_temp_c air_temp_c wind_speed_kph ph total_alkalinity_ppm chlorophyll_a_ugL salinity_ppt pco2_water_ppm bacteria_number_per_L total_nitrogen_mgL total_phosphorus._mgL dissolved_organic_carbon_mgL particulate_organic_nitrogen_mgL particulate_organic_carbon_mgL co2_flux_mmol_m_day pco2_atmosphere_ppm zooplankton_community_avg_legnth_mm zooplankton_community_biomass_mgL
2013 6 26 13177 1 Murray 26.11111 27.22222 6.437376 8.85 196 4.30 0.3796530 173.438 NA 3.17 0.01720 4.450445 0.06960000 0.30894545 -2.039 399.06 0.2245817 0.9987292
2013 6 26 13177 1 Miramar 24.44444 30.50000 9.656064 8.35 203 2.54 0.3787916 582.045 14204 2.53 0.00837 2.726669 NA NA 3.738 399.06 0.7466187 4.9275362
2013 6 26 13177 1 Poway 25.55556 32.38889 6.437376 8.99 172 1.03 0.3497159 106.694 18078 1.63 0.00868 3.482022 0.02266667 0.07393939 -2.646 399.06 0.8807347 9.3057214
2013 7 3 13184 2 Murray 25.83333 24.16667 10.460736 8.78 261 5.10 0.2849283 273.342 10330 3.62 0.02150 5.212524 0.06964706 0.34271658 -2.987 NA 0.3035989 3.4809318
2013 7 3 13184 2 Miramar 26.94444 25.22222 11.265408 8.45 271 2.29 0.1741368 635.311 18078 3.12 0.02190 3.404073 0.03226667 0.25940606 6.549 NA 0.5070355 4.7456874
2013 7 3 13184 2 Poway 22.11111 27.16667 0.804672 9.10 240 2.01 0.3805866 108.509 14204 2.53 0.02830 5.087733 0.01880000 0.16833939 -0.042 NA 0.5773348 3.2551055

A simple table

Now let’s create a summary table of the average air temperature for each lake and year:

data_air <- data |>
  select(year, lake, air_temp_c) |>
  group_by(year, lake) |>
  summarise(average_air_temp = mean(air_temp_c))

data_air
# A tibble: 6 × 3
# Groups:   year [2]
   year lake    average_air_temp
  <int> <chr>              <dbl>
1  2013 Miramar             24.9
2  2013 Murray              24.4
3  2013 Poway               27.4
4  2014 Miramar             20.6
5  2014 Murray              21.1
6  2014 Poway               20.9

Now let’s turn this into a gt table:

gt_air <-
  data_air |>
  gt() 

# Show the gt Table
gt_air
lake average_air_temp
2013
Miramar 24.92844
Murray 24.35022
Poway 27.36000
2014
Miramar 20.55600
Murray 21.10280
Poway 20.87200
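In the table above, 2013 and 2014 appear as row group labels rather than as a column. That is because `data_air` is still grouped by `year` after `summarise()`, and gt turns the grouping of a grouped tibble into row groups. The sketch below shows three ways to control this on a small made-up tibble (the `toy` object and its values are illustrative only).

```r
library(gt)
library(dplyr)

# Hypothetical toy data mirroring the shape of data_air
toy <- tibble(
  year = c(2013, 2013, 2014, 2014),
  lake = c("A", "B", "A", "B"),
  average_air_temp = c(24.9, 27.4, 20.6, 20.9)
)

# Grouped input: gt() uses the grouping column for row group labels
grouped_table <- toy |>
  group_by(year) |>
  gt()

# Ungrouped input: request the same layout explicitly
explicit_table <- toy |>
  gt(groupname_col = "year")

# Ungrouped input without row groups: year stays an ordinary column
flat_table <- toy |>
  gt()
```

If you prefer the flat layout for `data_air`, calling `ungroup()` before `gt()` should give the same result as `flat_table` here.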

Add a Title

gt_air |> 
  tab_header(
    title = "Average measurements per lake and year",
    subtitle = "major reservoirs in San Diego County"
  ) 
Average measurements per lake and year
major reservoirs in San Diego County
lake average_air_temp
2013
Miramar 24.92844
Murray 24.35022
Poway 27.36000
2014
Miramar 20.55600
Murray 21.10280
Poway 20.87200

Working with a more complex table

data_measurements <- data |>
  select(year, lake, air_temp_c, pco2_atmosphere_ppm, h2o_temp_c, ph, pco2_water_ppm) |>
  group_by(year, lake) |>
    summarise(avg_air_temp = mean(air_temp_c),
              avg_pco2_atmosphere_ppm = mean(pco2_atmosphere_ppm, na.rm=TRUE),
              avg_h2o_temp_c = mean(h2o_temp_c, na.rm=TRUE),
              avg_ph = mean(ph, na.rm=TRUE),
              avg_pco2_water_ppm = mean(pco2_water_ppm, na.rm=TRUE)
    )

data_measurements |>  gt()
lake     avg_air_temp  avg_pco2_atmosphere_ppm  avg_h2o_temp_c  avg_ph  avg_pco2_water_ppm
2013
Miramar  24.92844      395.0941                 23.94800        8.4264  353.8672
Murray   24.35022      395.0905                 23.47222        8.6096  218.0030
Poway    27.36000      395.1236                 23.34178        8.7964  126.2890
2014
Miramar  20.55600      399.6909                 19.46750        8.2524  335.5639
Murray   21.10280      399.6973                 20.25739        8.4504  245.7339
Poway    20.87200      399.7452                 19.07800        8.5016  172.2162
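The `glimpse()` output earlier shows `NA` in several columns (for example `pco2_atmosphere_ppm`), which is why most of the summaries above pass `na.rm = TRUE`. A small base R illustration with made-up values:

```r
# Made-up values with one missing observation
x <- c(399.06, NA, 395.12)

# Without na.rm the missing value propagates to the result
mean_with_na <- mean(x)

# With na.rm = TRUE the mean is taken over the observed values only
mean_without_na <- mean(x, na.rm = TRUE)

mean_with_na      # NA
mean_without_na   # 397.09
```

Without `na.rm = TRUE`, a single missing measurement would turn the whole lake and year average into `NA`.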

Add better column names

gt_measurements <- data_measurements |>
  gt() |>
  cols_label(
    avg_air_temp = "Air Temp (C)",
    avg_pco2_atmosphere_ppm = "Air pCO2 (ppm)",
    avg_h2o_temp_c = "Water Temp (C)",
    avg_ph = "Water pH",
    avg_pco2_water_ppm = "Water pCO2 (ppm)"
  )

# Show the gt Table
gt_measurements
lake     Air Temp (C)  Air pCO2 (ppm)  Water Temp (C)  Water pH  Water pCO2 (ppm)
2013
Miramar  24.92844      395.0941        23.94800        8.4264    353.8672
Murray   24.35022      395.0905        23.47222        8.6096    218.0030
Poway    27.36000      395.1236        23.34178        8.7964    126.2890
2014
Miramar  20.55600      399.6909        19.46750        8.2524    335.5639
Murray   21.10280      399.6973        20.25739        8.4504    245.7339
Poway    20.87200      399.7452        19.07800        8.5016    172.2162

Adding headers for multiple columns

gt_measurements <- gt_measurements |>
  tab_spanner(
    label = md("**Air**"),
    columns = c(avg_air_temp, avg_pco2_atmosphere_ppm)
  ) |>
  tab_spanner(
    label = md("**Water**"),
    columns = c(avg_h2o_temp_c, avg_ph, avg_pco2_water_ppm)
  ) |>
  # The spanners now carry "Air" and "Water", so shorten the column labels
  cols_label(
    avg_air_temp = "Temp (C)",
    avg_pco2_atmosphere_ppm = "pCO2 (ppm)",
    avg_h2o_temp_c = "Temp (C)",
    avg_ph = "pH",
    avg_pco2_water_ppm = "pCO2 (ppm)"
  )

# Show the gt Table
gt_measurements
         Air                   Water
lake     Temp (C)  pCO2 (ppm)  Temp (C)  pH      pCO2 (ppm)
2013
Miramar  24.92844  395.0941    23.94800  8.4264  353.8672
Murray   24.35022  395.0905    23.47222  8.6096  218.0030
Poway    27.36000  395.1236    23.34178  8.7964  126.2890
2014
Miramar  20.55600  399.6909    19.46750  8.2524  335.5639
Murray   21.10280  399.6973    20.25739  8.4504  245.7339
Poway    20.87200  399.7452    19.07800  8.5016  172.2162

Limiting values to 2 decimals

gt_measurements <- gt_measurements |>
  fmt_number(
    columns = everything(),
    decimals = 2,
    use_seps = FALSE
  )

# Show the gt Table
gt_measurements
         Air                   Water
lake     Temp (C)  pCO2 (ppm)  Temp (C)  pH    pCO2 (ppm)
2013
Miramar  24.93     395.09      23.95     8.43  353.87
Murray   24.35     395.09      23.47     8.61  218.00
Poway    27.36     395.12      23.34     8.80  126.29
2014
Miramar  20.56     399.69      19.47     8.25  335.56
Murray   21.10     399.70      20.26     8.45  245.73
Poway    20.87     399.75      19.08     8.50  172.22
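`fmt_number()` does not have to be applied to every column at once. As a sketch, the example below formats two columns differently on a small made-up data frame (`toy` and its values are illustrative only, not taken from the dataset).

```r
library(gt)

# Hypothetical toy data with one value above 1000
toy <- data.frame(
  lake = c("A", "B"),
  avg_ph = c(8.4264, 8.6096),
  avg_pco2_water_ppm = c(353.8672, 1218.003)
)

toy_formatted <- toy |>
  gt() |>
  # pH: keep two decimals
  fmt_number(columns = avg_ph, decimals = 2) |>
  # pCO2: round to whole numbers and keep the thousands separator
  fmt_number(columns = avg_pco2_water_ppm, decimals = 0, use_seps = TRUE)

# Show the gt Table
toy_formatted
```

Chaining several `fmt_number()` calls like this lets each measurement use a precision that suits its unit.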

Put it all together

Add a title and a source note

gt_measurements <- gt_measurements |>
  tab_header(
    title = "Average measurements per lake and year",
    subtitle = "major reservoirs in San Diego County"
  ) |>
  tab_source_note(
    source_note = md("Source: _Adamczyk EM, Shurin JB (2015) Seasonal Changes in Plankton Food Web Structure and Carbon Dioxide Flux from Southern California Reservoirs. PLoS ONE 10(10): e0140464. <https://doi.org/10.1371/journal.pone.0140464>_")
  )

# Show the gt Table
gt_measurements
Average measurements per lake and year
major reservoirs in San Diego County
         Air                   Water
lake     Temp (C)  pCO2 (ppm)  Temp (C)  pH    pCO2 (ppm)
2013
Miramar  24.93     395.09      23.95     8.43  353.87
Murray   24.35     395.09      23.47     8.61  218.00
Poway    27.36     395.12      23.34     8.80  126.29
2014
Miramar  20.56     399.69      19.47     8.25  335.56
Murray   21.10     399.70      20.26     8.45  245.73
Poway    20.87     399.75      19.08     8.50  172.22
Source: Adamczyk EM, Shurin JB (2015) Seasonal Changes in Plankton Food Web Structure and Carbon Dioxide Flux from Southern California Reservoirs. PLoS ONE 10(10): e0140464. https://doi.org/10.1371/journal.pone.0140464

Styling

gt_measurements |> opt_stylize(style = 6, color = 'gray')
Average measurements per lake and year
major reservoirs in San Diego County
         Air                   Water
lake     Temp (C)  pCO2 (ppm)  Temp (C)  pH    pCO2 (ppm)
2013
Miramar  24.93     395.09      23.95     8.43  353.87
Murray   24.35     395.09      23.47     8.61  218.00
Poway    27.36     395.12      23.34     8.80  126.29
2014
Miramar  20.56     399.69      19.47     8.25  335.56
Murray   21.10     399.70      20.26     8.45  245.73
Poway    20.87     399.75      19.08     8.50  172.22
Source: Adamczyk EM, Shurin JB (2015) Seasonal Changes in Plankton Food Web Structure and Carbon Dioxide Flux from Southern California Reservoirs. PLoS ONE 10(10): e0140464. https://doi.org/10.1371/journal.pone.0140464

There are 36 combinations of style and color to choose from: https://gt.rstudio.com/reference/opt_stylize.html?q=opt_stylize#examples

You can also apply conditional cell formatting with tab_style(): https://gt.rstudio.com/reference/tab_style.html
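As a minimal sketch of conditional formatting, the example below highlights the cells of one column whose value exceeds a threshold. The `toy` data frame, the threshold of 300 ppm, and the chosen colors are arbitrary assumptions for illustration, not values from the paper.

```r
library(gt)

# Hypothetical toy data resembling the water pCO2 averages
toy <- data.frame(
  lake = c("A", "B", "C"),
  avg_pco2_water_ppm = c(353.87, 218.00, 126.29)
)

toy_highlighted <- toy |>
  gt() |>
  tab_style(
    # What the matching cells should look like
    style = list(
      cell_fill(color = "lightyellow"),
      cell_text(weight = "bold")
    ),
    # Which cells to style: one column, only rows above 300 ppm
    # (300 is an arbitrary cutoff chosen for this example)
    locations = cells_body(
      columns = avg_pco2_water_ppm,
      rows = avg_pco2_water_ppm > 300
    )
  )

# Show the gt Table
toy_highlighted
```

The same pattern should carry over to `gt_measurements` by swapping in its column names and a condition that is meaningful for your data.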