ECMWF’s Year of Polar Prediction Dataset – a Reference for Polar Weather and Climate Research

Climate research and operational weather forecasting need reference datasets that allow detailed process analysis and Earth-system model evaluation, in particular in regions that are very sensitive to climate change and where combining models with observations is particularly challenging.

Polar areas are receiving an enormous amount of attention right now because the symptoms of climate change are stronger there than at lower latitudes, as exemplified by the increasing loss of sea ice every summer. Under the leadership of Prof. Thomas Jung, the Polar Prediction Project (PPP) of the World Meteorological Organization (WMO) and its Year of Polar Prediction (YOPP) created the nucleus for a vast range of internationally coordinated research activities that aim to understand the role of the poles in global climate and how the future evolution of that climate can be predicted more reliably with numerical models.

Several years ago, the PPP Steering Group already considered creating a unique analysis and forecast dataset that monitors the state of the coupled system over several seasons and that augments major field campaigns, such as the “once in a lifetime” MOSAiC expedition (2019–2020), and other numerical experiments that contributed to YOPP. In fact, the Steering Group noted nine years and one day ago: “Ideally, any new special datasets should be of sufficient length to allow systematic investigation of forecast quality […] and be openly accessible and sustainable […].”

The dataset was produced with the world-leading operational global weather prediction system of the European Centre for Medium-Range Weather Forecasts (ECMWF), but with a twist: in addition to the entire model output on atmosphere, ocean and sea ice, we decided to add specific information on the individual physical processes so that scientists can investigate how the system’s state is affected by critical processes such as dynamic transport, radiation or cloud microphysics. In previous research experiments, this twist had already provided insights into the complex interplay between processes and revealed key sources of model error, which need to be overcome for more reliable predictions of future evolution. Because the dataset contains global simulations, it can even be used to answer research questions for other regions of the world.

The technical creation of the dataset was clearly a challenge: by the time our paper was published, the data volume had reached over half a petabyte and comprised over 330 million fields for variables like temperature or sea-ice cover. Data of this size could not possibly be hosted on generic data sites and needed a dedicated platform at ECMWF. Licensing was a second challenge: the YOPP dataset became one of the first to be published under the CC-BY-4.0 license and is part of ECMWF’s new strategy towards fully open data.

Having overcome both the technical and the licensing challenges, we expect the YOPP dataset to spawn novel research ideas, generate more funding opportunities and inspire master's and PhD work. Its set-up will serve as a template for future datasets and, given the urgency of dealing responsibly with climate change, there is no doubt that datasets of this kind will indeed be produced, helping to disentangle one of the greatest puzzles of our time.

The YOPP Nature Scientific Data paper is an open access data descriptor.