Clean fresh water is one of the most valuable resources for humans. Although lake water accounts for less than 1% of the world’s water supply, it plays an essential role in human activities, including drinking water, transportation, recreation, and food production. Maintaining water quality in lakes is thus critical for sustaining human life on Earth. However, the water quality of many lakes has degraded because of human development, agricultural production, and climate change. A database of water quality for lakes worldwide can improve our understanding of how multiple environmental stressors affect lake water quality, provide data for environmental monitoring and management programs, and serve as a baseline against which further environmental degradation can be measured.
Chlorophyll is commonly used to evaluate water quality and primary production in aquatic systems. The concentration of chlorophyll in water can be used as a proxy for other measures of lake health. With chlorophyll as the focus, we assembled a team of highly dedicated undergraduate students, graduate students, and post-doctoral researchers at York University. We systematically reviewed 3322 published studies and scoured the web for online repositories of lake monitoring data. Each researcher carefully read 500 papers over an academic year, extracting data from the text, figures, and tables, or by contacting the original authors. Together, we extracted 228,168 unique chlorophyll values for 11,959 lakes across 72 countries. For many of these observations, we were also able to extract water chemistry variables, such as phosphorus or nitrogen, and lake morphometric variables, including lake surface area and mean depth.

A map of all lakes included in the water quality database, showing the median chlorophyll concentration recorded for each lake.
Most of the lakes in the database are found in the United States and Europe, but the database covers all continents, including Antarctica. Chlorophyll records go as far back as 1933, although most observations are more recent (after 2000). Almost half of all observations include data on phosphorus, a significant driver of chlorophyll concentrations and an indicator of nutrient inputs within the watershed. All observations also include geospatial coordinates that allow for integration with remotely sensed or reanalysis data, including climate and land cover. This database has been invaluable in helping us discern the roles of nutrient inputs, land use changes, and climate in primary production in lakes.
Our open-access database of water quality is available through The Knowledge Network for Biocomplexity. We also provide all code used for data manipulation and analysis in generating the database on GitHub. We encourage researchers, policy makers, and lake managers to use this database to advance our collective understanding of water quality in freshwater lakes locally, regionally, and worldwide.
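For readers who want a starting point, the per-lake median chlorophyll shown on the map can be computed in a few lines. The sketch below is only illustrative: the column names (`lake_id`, `year`, `chl_ug_per_l`) and the sample values are hypothetical and are not taken from the actual database files, so they would need to be adapted to the real schema.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Hypothetical sample data. The column names and values are assumptions
# made for illustration and do not reflect the real database layout.
SAMPLE_CSV = """lake_id,year,chl_ug_per_l
A,1998,2.0
A,2005,4.0
A,2010,9.0
B,2003,12.5
B,2011,
B,2015,7.5
"""


def median_chlorophyll_by_lake(rows, lake_col="lake_id", chl_col="chl_ug_per_l"):
    """Return {lake identifier: median chlorophyll}, skipping missing values."""
    values = defaultdict(list)
    for row in rows:
        raw = (row.get(chl_col) or "").strip()
        if not raw:
            # Skip observations with no chlorophyll value recorded.
            continue
        values[row[lake_col]].append(float(raw))
    return {lake: median(vals) for lake, vals in values.items()}


# Parse the in-memory sample; a real file would be opened with open(path, newline="").
medians = median_chlorophyll_by_lake(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(medians)
```

The median is used here because it is what the map reports and because it is less sensitive than the mean to occasional extreme values, such as those recorded during algal blooms.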
