CPEG 2025 Workshop

Building Open Data Science Skills in Paleobiology and Ecology

July 27th, 2025
09:00-17:00

University of Zürich (Zürich, Switzerland)

Welcome

R is one of the most popular languages in the world of data science and has been widely adopted by the paleobiological and ecological communities for data analysis. General familiarity with R allows users to automate routine tasks, create reproducible analytical workflows, and expand the potential of their research. This workshop aims to introduce participants to the versatility of R for cleaning, analyzing, and visualizing paleobiological and ecological data. It will cover topics including: data acquisition from biodiversity databases such the Paleobiology Database, Neotoma Paleoecology Database, and Global Biodiversity Information Facility; building workflows in R to clean and analyze data; data visualization and synthesis; and guidelines (e.g., FAIR and CARE principles) and tools (e.g., GitHub) for creating more reproducible code and accessible documentation. In doing so, this workshop will introduce attendees to palaeoverse, an R package that supports data preparation and exploration for paleobiological analysis. Participants will also become familiar with additional packages developed by Palaeoverse www.palaeoverse.org, such as rphylopic. More broadly, this event aims to connect participants working in different fields who share a common interest in data science and provide a platform for participants to gain experience working collaboratively in R to generate reproducible interdisciplinary research.

You can find additional information about the workshop here.

Arrival

The event starts at 09:00 on Sunday, July 27th, 2025 and will take place in room KO2-F-152 at the Natural History Museum, University of Zürich Central Campus.

If you need additional assistance getting set-up, we’ll be in the room starting at 08:00.

Schedule

Time Event
08:00 Installation assistance (optional)
09:00 Welcome and introduction to the Palaeoverse
09:15 Setting up a reproducible workflow
10:30 Coffee break
10:45 Data acquisition
11:15 Data Processing I: Data exploration and cleaning
12:30 Lunch break 🥪
14:00 Data Processing II: Data harmonization and synthesis
15:30 Coffee break
15:45 Data archiving and publishing
16:45 Questions and wrap-up

Instructors

This edition of the Workshop event is organised and led by the following members of the Palaeoverse team.

Bethany Allen
GFZ Helmholtz Centre for Geosciences, Germany

Chris Dean
University College London, UK

Erin Dillon
Smithsonian Tropical Research Institute, Panama

Harriet Drage
University of Lausanne, Switzerland

Will Gearty
Syracuse University, USA

Pedro Godoy
University of São Paulo, Brazil

Niklas Hohmann
Utrecht University, Netherlands

Lewis Jones
University College London, UK

Installation

Please ensure that you have the latest version of R for the workshop, which can be downloaded here for Windows or here for MacOS. We also recommend installing the latest version of RStudio, which can be downloaded here. To minimize any installation issues during the workshop, please also install the following R packages:

install.packages("dplyr")
install.packages("readr")
install.packages("palaeoverse")
install.packages("ggplot2")
install.packages("rnaturalearth")
install.packages("rnaturalearthdata")
install.packages("deeptime")
install.packages("rgplates")
install.packages("fossilbrush")
install.packages("rgbif")
install.packages("raster")
install.packages("sf")
install.packages("geodata")
install.packages("ncdf4")

GitHub

This workshop will include a brief introduction to GitHub. In advance of the workshop, we recommend that you make a GitHub account, if you don’t already have one, so you can follow along during the workshop. Note that you can get additional benefits if you apply to GitHub Education as a researcher or student once you’ve set up a personal account.

Additionally, you should have GitHub Desktop installed on your computer. You can sign into GitHub Desktop with your GitHub account.

Acknowledgements

This event is run by the Palaeoverse and is supported by the Crossing the Paleontological-Ecological Gap and Conservation Paleobiology Symposium and the University of Zürich. As a community-driven initiative, we’d also like to thank all of the participants for their support and engagement with the Palaeoverse.