An introduction to data visualization

by Erin Dillon & Broc Kokesh
Adapted from materials by William Gearty

2025 Paleontological Society Short Course

Learning objectives

  • Why data visualization is important
  • Strategies for effectively visualizing data
  • How to create basic graph types in R
  • Example code to modify

Schedule


1:30 - 2:00 PM: Theory of data visualization

2:00 - 3:15 PM: Data visualization practical

Theory of data visualization

  • What is data visualization?
  • Picking a graph type
  • Basic design principles
  • Visualization in R

What is data visualization?

The representation of data through the use of graphs and figures.

Often the data are complex, but the representation should be approachable and easy to understand.

Let’s look at some examples

For example, here are some data:
(even this is a data visualization)

And here is a data visualization of those data:

How would you improve this figure?

What makes a bad scientific figure?

  • too much data shown (consider splitting into multiple panels)
  • improper choice of graph type (e.g., unnecessary 3D plots)
  • no clear message
  • information not laid out in a clear way
  • axes with no (or inappropriate) labels or units
  • misleading portrayal of scale and uncertainty (e.g., truncated axes)
  • no legend
  • inconsistent colors and formatting, cluttered appearance
  • not visually accessible, too many colors, lacking contrast
  • colors/symbols not used to highlight relationships or patterns
  • no readable text hierarchy
  • poor resolution

Picking a graph type

Graph type depends on data type

Plotly

https://plotly.com/r/

Basic design principles

  • Visual encoding of data points
  • Color
  • Type (Font)
  • Visual hierarchy
  • Composition and layout (R helps with this)

Visual encoding

Marks and channels to visually represent groups in the data:
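
As a minimal sketch of this idea in R, the example below uses points as the mark and three channels: position, color, and shape. It assumes the {ggplot2} package is installed and uses the built-in iris data set, which is our choice for illustration and not part of the original slides.

```r
library(ggplot2)

# Mark: points. Channels: x/y position for the measurements,
# color and shape for the group (species).
# Encoding the group twice keeps it readable without color.
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                      color = Species, shape = Species)) +
  geom_point(size = 2) +
  labs(x = "Sepal length (cm)", y = "Petal length (cm)")

p
```

Mapping the same variable to both color and shape is one common way to keep groups distinguishable in grayscale or for color-blind readers.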

Color

Lots of colors exist!

Picking colors

  • Value (lightness/darkness)
  • Temperature (cool vs warm)
  • Saturation (intensity of color)
  • Palettes (collections of colors that work well together)

Declaring colors in R
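
The sketch below shows three common ways to declare the same color in base R: by name, by HEX code, and with rgb(). The variable names are ours, chosen for illustration.

```r
# Three equivalent ways to declare the same red
by_name <- "red"
by_hex  <- "#FF0000"
by_rgb  <- rgb(255, 0, 0, maxColorValue = 255)

# col2rgb() shows that all three resolve to the same RGB values
col2rgb(c(by_name, by_hex, by_rgb))

# Colors are passed to plotting functions through arguments such as `col`
plot(1:3, col = c(by_name, by_hex, by_rgb), pch = 19, cex = 3)
```

Running colors() lists every color name that base R recognizes.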

Getting HEX codes
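
One way to get a HEX code without leaving R is to convert a named color with col2rgb() and rgb(). The helper name name_to_hex() below is hypothetical, written for this example only.

```r
# Hypothetical helper: convert one or more R color names to HEX codes
name_to_hex <- function(color_name) {
  # col2rgb() returns a 3-row matrix (red, green, blue);
  # transpose so each color is a row, as rgb() expects
  rgb(t(col2rgb(color_name)), maxColorValue = 255)
}

name_to_hex("steelblue")          # "#4682B4"
name_to_hex(c("black", "white"))  # "#000000" "#FFFFFF"
```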

Or use premade color palette packages!
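
Before installing a palette package, it may help to know that recent versions of base R already ship some premade palettes. The sketch below assumes R 4.0 or later; the choice of the "viridis" and "Okabe-Ito" palettes is ours, not taken from the original slides.

```r
# Sequential palette, suited to continuous data
seq_cols <- hcl.colors(5, palette = "viridis")

# Qualitative palette, suited to categories
# (Okabe-Ito was designed with color blindness in mind)
cat_cols <- palette.colors(5, palette = "Okabe-Ito")

# Quick preview of both palettes
barplot(rep(1, 5), col = seq_cols, axes = FALSE, main = "viridis")
barplot(rep(1, 5), col = cat_cols, axes = FALSE, main = "Okabe-Ito")
```

hcl.pals() and palette.pals() list the other palette names available in base R.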

Color accessibility is important!

The more color contrast, the better!

Don’t forget about color blindness!

Check your graphs with a color blindness simulator (e.g., Coblis color blindness simulator or Color Oracle).

Your graphs should even be legible in grayscale
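
A rough way to check this in R is to convert your colors to gray using a luminance-weighted sum of the red, green, and blue values. The helper to_gray() is hypothetical, and the weights (0.299, 0.587, 0.114) are one common convention, so treat the result as an approximation of how a printer or display would render the colors.

```r
# Hypothetical helper: approximate grayscale version of a set of colors
to_gray <- function(cols) {
  rgb_vals <- col2rgb(cols)
  # Luminance-weighted sum, rounded to the 0-255 scale
  lum <- round(0.299 * rgb_vals["red", ] +
               0.587 * rgb_vals["green", ] +
               0.114 * rgb_vals["blue", ])
  gray(lum / 255)
}

my_cols <- c("red", "green", "blue")

# Compare the original colors with their grayscale approximations
barplot(rep(1, 3), col = my_cols, axes = FALSE, main = "Original")
barplot(rep(1, 3), col = to_gray(my_cols), axes = FALSE, main = "Grayscale")
```

If two categories end up as nearly the same gray, consider changing their value (lightness) or adding a second channel such as shape or line type.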


Effective use of color

Font

There are lots of fonts/types, too!

We want easily readable fonts:

Tips

  • One font is usually enough (max two)
  • Make sure font size is big enough (who is the audience?)
  • Use bold for emphasis, but avoid italics and underlines
  • Left-aligned text is most readable
  • Use size as a tool to hierarchize content
  • Use 1.5 line spacing for better readability
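
Several of these tips can be applied through the theme system in {ggplot2}. The sketch below assumes {ggplot2} is installed and uses the built-in mtcars data set; the specific sizes are illustrative choices, not recommendations from the original slides.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Heavier cars travel fewer miles per gallon",
       x = "Weight (1000 lbs)", y = "Miles per gallon") +
  theme_minimal(base_size = 14) +  # one base size scales all text together
  theme(
    # Title: bold, largest, left-aligned (hjust = 0)
    plot.title = element_text(face = "bold", size = 18, hjust = 0),
    # Axis titles and tick labels step down in size to form a hierarchy
    axis.title = element_text(size = 14),
    axis.text  = element_text(size = 12)
  )

p
```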

Visual hierarchy

What makes a figure effective?

  • tells a story with the data
  • patterns are represented in a clear, honest, and intuitive way
  • uses an appropriate graph type
  • graph design is accessible and emphasizes the main message (minimal clutter)
  • aesthetics are consistent
  • visual hierarchy that draws the eye to the most important parts of the graph
  • includes labels for context

Visualization in R

R has lots of built-in visualization functionality:

  • plot()
  • barplot()
  • hist()
  • boxplot()
  • axis()
  • legend()
  • lines(), segments(), rect(), text(), etc.
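
As a short illustration of how these base functions fit together, the sketch below uses the built-in mtcars data set (our choice for the example). High-level functions such as plot() start a new graph, and functions such as legend() add to it.

```r
# Scatter plot, colored by transmission type (am: 0 = automatic, 1 = manual)
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     pch = 19, col = ifelse(mtcars$am == 1, "darkorange", "steelblue"))

# legend() draws on top of the existing plot
legend("topright", legend = c("Manual", "Automatic"),
       col = c("darkorange", "steelblue"), pch = 19)

# hist() and boxplot() each start a new plot
h <- hist(mtcars$mpg, xlab = "Miles per gallon", main = "")
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Number of cylinders", ylab = "Miles per gallon")
```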

Many packages build on these ‘base’ graphics:

  • {plotrix}
  • {rgl}: for 3D interactive graphics
  • {gplots}
  • {scatterplot3d}: 3D scatterplots
  • {palaeoverse}!

Then ‘grid’ graphics came along:

  • More complex layouts
  • Scaling is maintained on resizing
  • Nested graphs and more interactivity
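
To give a feel for layouts and nesting, here is a minimal sketch using the {grid} package, which ships with R. The panel labels and colors are placeholders chosen for this example.

```r
library(grid)

grid.newpage()

# Split the page into a layout with one row and two columns
pushViewport(viewport(layout = grid.layout(nrow = 1, ncol = 2)))

# Left cell: a simple outlined panel
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 1))
grid.rect(gp = gpar(col = "grey50"))
grid.text("Panel A")
popViewport()

# Right cell: a panel with a second viewport nested inside it
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 2))
grid.rect(gp = gpar(col = "grey50"))
pushViewport(viewport(width = 0.5, height = 0.5))
grid.rect(gp = gpar(fill = "lightblue"))
grid.text("Nested")
popViewport(2)  # leave the nested viewport and the right cell

popViewport()   # leave the layout and return to the top level
```

Viewport sizes here are relative to their parent, which is why the drawing keeps its proportions when the window is resized.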

Many packages build on these ‘grid’ graphics:

  • {lattice}: trellis graphics
  • {vcd}: for categorical data
  • {ggplot2}: “grammar of graphics”
  • {hexbin}: hexagonal bins
  • {patchwork}: combine (ggplot2) plots
  • {deeptime}!
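
As a brief sketch of the "grammar of graphics" idea, the example below builds a {ggplot2} plot by adding layers one at a time. It assumes {ggplot2} is installed and uses the mpg data set that ships with the package; the variables plotted are our choice.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +       # data and aesthetic mapping
  geom_point(alpha = 0.5) +                        # geometry layer: raw data
  geom_smooth(method = "lm", formula = y ~ x) +    # statistical layer: linear fit
  facet_wrap(~ drv) +                              # one panel per drive train
  labs(x = "Engine displacement (L)",
       y = "Highway miles per gallon") +
  theme_bw()                                       # overall look

p
```

Each line after the first can be removed or swapped without rewriting the rest, which is what makes this approach convenient for example code you intend to modify.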

Packages for spatial visualization:

  • {sf}: basic objects and methods for vector data
  • {terra}: basic objects and methods for raster data
  • {ggplot2}: works for plotting spatial data, too
  • {raster}: plotting raster data (older package, largely superseded by {terra})
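
A minimal sketch of plotting vector data with {sf} and {ggplot2} is shown below. It assumes both packages are installed. The site names and coordinates are made up for illustration and do not correspond to real localities.

```r
library(sf)
library(ggplot2)

# Hypothetical sampling localities (made-up longitude/latitude values)
localities <- data.frame(
  site = c("A", "B", "C"),
  lon  = c(-110.5, -105.2, -98.7),
  lat  = c(44.6, 39.7, 35.1)
)

# Convert the table to an sf object; EPSG:4326 is WGS 84 longitude/latitude
pts <- st_as_sf(localities, coords = c("lon", "lat"), crs = 4326)

# geom_sf() reads the geometry column and handles the map projection
p <- ggplot(pts) +
  geom_sf(aes(color = site), size = 3) +
  labs(color = "Site")

p
```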

File formats:

You can export plots from R in many file formats (we’ll mostly use ggsave()):

File format   Image type   Notes
.jpg          Raster       Does not support transparency
.png          Raster       Supports transparency
.svg          Vector       Can be edited in Inkscape/Illustrator
.pdf          Vector       Not strictly an image file type, but can be used for figures
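
As a sketch of exporting with ggsave(), the example below writes the same plot to a raster file and a vector file. It assumes {ggplot2} is installed; the file names, dimensions, and the use of a temporary folder are illustrative choices.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Write to a temporary folder so the example leaves no files behind
out_dir  <- tempdir()
png_file <- file.path(out_dir, "figure.png")
pdf_file <- file.path(out_dir, "figure.pdf")

# ggsave() picks the format from the file extension
# Raster output: set dpi to control resolution
ggsave(png_file, plot = p, width = 6, height = 4, units = "in", dpi = 300)

# Vector output: resolution-independent, so dpi is not needed
ggsave(pdf_file, plot = p, width = 6, height = 4, units = "in")
```

Saving to .svg works the same way, but ggsave() relies on the {svglite} package for that format, so it needs to be installed first.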

Data visualization books

The Visual Display of Quantitative Information
Edward R. Tufte

Better Data Visualizations
Jonathan Schwabish

Fundamentals of Data Visualization
Claus O. Wilke

Building Science Graphics
Jen Christiansen

R Graphics Cookbook
Winston Chang

Data visualization practical