Novel statistical model pieces together biodiversity puzzle
June 7, 2020
Thousands of species are in decline worldwide and thousands more are of unknown status, possibly going extinct unnoticed, according to the International Union for Conservation of Nature.
But as species dwindle, data about them are on the rise, from citizen science projects like iNaturalist to technologically advanced data collection sites like the National Science Foundation’s National Ecological Observatory Network (NEON), and could help scientists keep pace with record changes in biodiversity.
Elise Zipkin, associate professor in MSU’s Department of Integrative Biology and the Ecology, Evolution, and Behavior Program, will use a 3-year, $782,676 NSF grant to unite diverse data sources like these into a novel and flexible statistical modeling framework, what she calls an Integrated Community Model, with the aim of assessing the status, trends and dynamics of biodiversity.
“We cannot ignore the growing amount of opportunistic citizen science data, but it comes with huge challenges—no design, no randomization—all the things the scientific community knows are important for making an inference beyond the area of study,” explained Zipkin, whose grant was awarded by NSF’s Division of Biological Infrastructure. “My lab is asking: what can we do with this wealth of data? How can we make these data valuable for basic and applied research?”
The Zipkin Lab is well positioned to develop the groundbreaking methodology. For the last six years, its members have specialized in estimating how the abundance and distribution of both single species and whole communities of species are affected by climate and environmental change.
“Our lab has done quite a bit of work developing approaches that integrate data for single species,” said Zipkin, who used multiple data sources from Mexico, the U.S. and southern Canada to evaluate the factors influencing monarch butterfly declines. “Simultaneously, we’ve also been working on community modeling approaches where we analyze multi-species data that comes from a single source, such as transect surveys that record all bird species encountered.”
Integrating the sheer amount of biodiversity data across the many available sources is finally possible thanks to high-performance computers. The challenge is designing a framework flexible enough to analyze high-volume data with diverse structures, and general enough to work across many different kinds of organisms.
Imagine trying to piece together a jigsaw puzzle with millions of individual pieces coming from thousands of different boxes, and fast, before some of the rarer pieces disappear.
“Data integration is the future,” Zipkin said. “There are many species with unknown conservation status, so approaches that allow researchers to simultaneously analyze all available data can go a long way towards rapid, accurate assessments.”