GSA 2025 Workshop
Open Science, Collaboration, and Reproducibility in Paleontology
Saturday, October 18, 2025
8:00 AM - 5:30 PM
Henry B. González Convention Center
Welcome
Since the development of large paleontological datasets from the 1970s onwards, paleontologists have increasingly adopted computational approaches to address questions about the history of life on Earth. This initiated a “Golden Age” of paleontology, where extensive datasets of various formats are used to test macroevolutionary and macroecological hypotheses. In parallel, the broader scientific community has been pushing for science to become more transparent, equitable, collaborative, and reproducible under the umbrella of “Open Science”. This culminated in 2023 being designated as the “Year of Open Science” by the White House Office of Science and Technology Policy.
This short course will bridge these two movements to introduce the tenets of Open Science and how they can be incorporated into existing and future paleontological research workflows. First, we will provide an introduction to collaboration, version control, and data storage via services including Git, GitHub, Zenodo, and FigShare.
We will then build upon this foundation by exploring the R programming language. R is one of the most popular languages in the world of data science and has been widely adopted by the paleontological community to clean, analyze, and plot data. General familiarity with R allows users to expand the potential of their research and automate routine tasks. We will introduce a suite of R packages that have been designed to standardize and streamline various parts of paleontological workflows (e.g., data cleaning). As part of this, we will briefly introduce existing paleontological databases (e.g., Paleobiology Database) and how they can be accessed from within the R framework. Finally, we will discuss the use of visualizations and how they can be efficiently and effectively developed to increase the transparency and equitability of paleontological research.
This short course will provide a great opportunity for attendees to work with different researchers and gain experience working collaboratively in R to generate reproducible research. Further, we hope that this short course will bring the community together to share resources, reach agreed standards, and improve reproducibility in paleontological research. We anticipate this short course will be of value to paleontologists of all career stages.
Arrival
The event starts at 8:00 AM on Saturday, October 18, 2025, and will take place at the Henry B. González Convention Center.
Schedule
| Time | Event |
|---|---|
| 8:00 AM | Welcome and introduction |
| 8:30 AM | Setting up a reproducible workflow (GitHub, GitHub Desktop, RStudio) |
| 10:00 AM | Coffee break ☕ |
| 10:15 AM | Data acquisition |
| 10:45 AM | Data Processing I: Data exploration and cleaning |
| 12:00 PM | Lunch break 🥪 |
| 1:30 PM | Data Processing II: Data visualization and synthesis |
| 3:15 PM | Coffee break ☕ |
| 3:30 PM | Open science presentation/breakout (reporting, archiving, and publishing) |
| 4:45 PM | Closing remarks |
Instructors
This edition of the workshop is organised and led by the following members and friends of the Palaeoverse team.
Syracuse University, USA
Smithsonian Tropical Research Institute
Stanford University
University of California, Berkeley
Virginia Tech
Installation
Please ensure that you have the latest version of R installed for the workshop; it can be downloaded from the Comprehensive R Archive Network (CRAN). We also recommend installing the latest version of RStudio, which can be downloaded from the Posit website. To minimize installation issues during the workshop, please also install the following R packages:
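As a minimal sketch of how a set of packages might be installed ahead of time, the snippet below checks which packages are missing and installs only those. The package names shown (`palaeoverse`, `deeptime`, `rphylopic`) are assumptions: they are CRAN packages used in paleontological workflows, not a confirmed workshop list. The helper names `missing_packages()` and `install_if_missing()` are hypothetical, and the default mirror `https://cloud.r-project.org` is likewise an assumption.

```r
# Assumed, illustrative package list; replace with the list provided
# by the workshop organisers.
workshop_packages <- c("palaeoverse", "deeptime", "rphylopic")

# Hypothetical helper: return the packages in `pkgs` that are not yet
# installed in any library on the current R library path.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Hypothetical helper: install only the missing packages from the given
# CRAN mirror and invisibly return the names it attempted to install.
install_if_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, repos = repos)
  }
  invisible(to_install)
}
```

Running `install_if_missing(workshop_packages)` in the R console would then download anything that is missing, and `missing_packages(workshop_packages)` can be run afterwards to confirm that nothing is left to install.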
Acknowledgements
This event is run by the Palaeoverse and supported by the Paleontological Society.