Introducing the Global Lake area, Climate, and Population (GLCP) dataset

The Global Lake area, Climate, and Population (GLCP) dataset is a collection of annual lake surface areas for over 1.42 million lakes with co-located basin-level climate and population data from 1995 through 2015.
Published in Research Data

We are pleased to share with the larger research community the Global Lake area, Climate, and Population (GLCP) dataset: a collection of annual surface areas for over 1.42 million lakes with co-located basin-level climate and population data from 1995 through 2015. The dataset and its workflow are publicly available, and we encourage future users to both use and expand upon the core dataset. We hope that by publishing our data and workflow we can fast-track analyses that might otherwise proceed slowly because of the time and computing power needed to create a global dataset of this nature.

New data collection technologies and a movement towards data sharing have enabled large syntheses across many scientific disciplines. In the case of aquatic research, remote sensing has created unprecedented opportunities to track global and decadal changes in freshwater abundance. Combined with freely available datasets on global climate or human population, such data can provide meaningful context for observed environmental changes. Yet despite the availability and accessibility of these datasets, research is often hindered by the technical and computational hurdles involved in integrating data sources into a single, analysis-ready format.

We encountered this first-hand when, in pursuit of a global, decadal-scale analysis of lake areas, we realized that the effort to create a dataset for such an analysis was itself a multi-year endeavor. We were fortunate that we had both the funded research time and access to high-performance computing resources to create the dataset we needed. However, we recognize that not all researchers have this luxury. Therefore, we are proud to publicly share our new dataset, the Global Lake area, Climate, and Population (GLCP) dataset. The GLCP harmonizes global-scale aquatic, climate, and population data so that researchers with widely varying computing resources and data wrangling capacity can leverage the rich data sources available to assess lake changes over the past 20 years.

This data product stands on the shoulders of giants: the GLCP harmonizes existing global-scale datasets of lake (HydroLAKES; Joint Research Centre (JRC) Global Surface Water Dataset), lake basin (HydroBASINS), climate (Modern-Era Retrospective analysis for Research and Applications, MERRA-2), and population (Gridded Population of the World) variables. The resulting dataset is a compilation of surface areas for over 1.42 million lakes and reservoirs from 1995 to 2015 with co-located basin-level temperature, precipitation, and population data. The GLCP is designed to follow FAIR (findable, accessible, interoperable, reusable) principles and retains the original identifiers from HydroLAKES and HydroBASINS to streamline reintegration with parent and other related datasets.

We believe that the dataset, along with the project’s workflow, script repositories, and documentation, is broadly applicable to many environmental researchers, including natural resource managers, local and regional agencies, non-profits, and even citizen scientists. With respect to the dataset itself, we foresee a wide range of possible applications for the GLCP, from local to global analyses. At the local scale, researchers can pair data collected in situ from their specific lake with the GLCP to see how permanent, seasonal, or total water surface area may relate to other environmental or ecological variables at that site. At larger scales, researchers can use the GLCP to investigate regional and global changes in water quantity, which has the potential to inform policies for human consumption, agriculture, and other aquatic-related issues.
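
As a rough illustration of the local-scale use described above, the following is a minimal Python sketch of pairing in situ observations with GLCP records by lake identifier and year. The column names (`Hylak_id`, `year`, `total_km2`, `chlorophyll_ug_l`) are assumptions made for this example and should be checked against the GLCP documentation, and every value shown is an invented placeholder rather than real data. The GLCP scripts themselves are written in R and Google Earth Engine, so this is not the authors' code.

```python
# Placeholder GLCP-style rows. Column names are assumed for illustration,
# and the values are invented, not taken from the dataset.
glcp_rows = [
    {"Hylak_id": 101, "year": 2001, "total_km2": 12.0},
    {"Hylak_id": 101, "year": 2002, "total_km2": 11.5},
    {"Hylak_id": 202, "year": 2001, "total_km2": 3.2},
]

# Placeholder field observations for one lake (hypothetical variable name).
in_situ_rows = [
    {"Hylak_id": 101, "year": 2001, "chlorophyll_ug_l": 4.1},
    {"Hylak_id": 101, "year": 2002, "chlorophyll_ug_l": 5.3},
    {"Hylak_id": 101, "year": 2003, "chlorophyll_ug_l": 4.8},
]


def pair_by_lake_year(glcp, field):
    """Inner-join field observations to GLCP rows on (Hylak_id, year)."""
    # Index the GLCP rows by lake identifier and year for direct lookup.
    index = {(row["Hylak_id"], row["year"]): row for row in glcp}
    paired = []
    for obs in field:
        match = index.get((obs["Hylak_id"], obs["year"]))
        if match is not None:
            # Merge the lake-area record with the field observation.
            paired.append({**match, **obs})
    return paired


paired = pair_by_lake_year(glcp_rows, in_situ_rows)
for row in paired:
    print(row)
```

Field observations with no matching lake and year in the GLCP rows (here, the 2003 observation) are dropped, which keeps only years where both sources have data.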

Along with lake, climate, and population variables, the full GLCP data package is available through the Environmental Data Initiative and contains all Google Earth Engine and R scripts used to create the GLCP. By providing scripts and example file architectures, we hope that future users will be empowered to utilize our workflow and expand the dataset’s contents with various remote sensing, climate, and socio-ecological variables. 

If you have any questions about the dataset, please feel free to email or tweet at the authors:

Michael F. Meyer (@mishafredmeyer)

Stephanie G. Labou (@stephlabou)

Alli N. Cramer (@AlliNCramer)

Matthew R. Brousil (@mrbrousil)

Bradley T. Luff

This project was funded in part by an NSF GRF to MFM (DGE-1347973). The poster photo was taken at Lake Baikal (Siberia) by Michael F. Meyer.
