Developing the China Meteorological Forcing Dataset

A brief history of the birth of the China Meteorological Forcing Dataset.
The development of the China Meteorological Forcing Dataset (CMFD) can be traced back to the end of 2008. At that time, no dataset had been developed specifically for land process studies over China, and many land surface modelers interpolated station data separately for each basin or region of concern. Meanwhile, we needed such a dataset to drive our land data assimilation system for the whole of China. Although a few global datasets were available for land process studies, the scientific community was often dissatisfied with their low resolution and systematic biases over China. We therefore decided to develop a high-resolution dataset over China with higher precision. A graduate student in our group, Jie He, took on the responsibility, although his initial ambition was to contribute to atmospheric dynamics.

Like many similar efforts, developing CMFD was not easy. Skills that worked for processing small data files did not work well when we tried to process large amounts of data. Without an engineering background, we tried to write a data-generating system from scratch. Several times, however, we found that the system had fatal deficiencies, and we had to rebuild it entirely. After about three years, the data-generating system was finally established, and it has been in use ever since. The algorithm for generating CMFD is not complex, because we believe that the amount and quality of the input information are the key factors determining the quality of CMFD (i.e., you can't generate useful information out of nothing). We spent a great deal of time on input data quality control, because the input data from various sources (station data, background data) often contain unexpected outliers. This has proven to be the most important step in stabilizing the data quality. We thank all users, some of whom reported problems in the data and helped us greatly to improve its quality.

We are proud of the CMFD product. It has been used in about 300 articles, although we did not promote it much among research communities. We are also proud of the work that went into developing it, although this is our only publication on its development in the past eleven years.

The development of CMFD will never stop. A new version of CMFD, with a new algorithm, new input datasets, and a new data-generating system, is now being conceived. We expect that a dataset with reliable time-series trends, broader spatial coverage, and higher precision will arrive within the next few years.
