Behind the hourly 5-km surface total and diffuse solar radiation dataset in China (2007-2018)

Written by Hou Jiang and Ning Lu

As the main source of the Earth's energy, solar radiation drives the cycling of materials such as carbon, nitrogen, water and oxygen through the different spheres of the Earth. Every living thing on Earth depends on sunlight for survival. Solar radiation warms the planet, feeds plants, and in general keeps everything active. In everyday life, for example, you may travel to the Qinghai-Tibet Plateau or the Mongolian Plateau; if you know the spatial and temporal characteristics of solar radiation in these areas, you can arrange your schedule sensibly to avoid overexposure to the sun, or prepare protective equipment in advance. As an environmentalist, you might plan to install solar panels on your rooftop and want to know how much power the rooftop can generate and whether the electricity output is stable. With solar radiation data of high spatial and temporal resolution at hand, you can quickly answer these questions.

Our team is committed to the assessment and utilization of solar energy. To do this, we first need high-accuracy, high-resolution and sometimes long-term surface solar radiation data. However, the accuracy of currently available products is insufficient for high-resolution geospatial assessment of solar photovoltaic potential. We evaluated some typical products, such as ECMWF ERA5, ERA-Interim and MERRA-2, and found that in China large deviations usually appear in the south, where frequent cloudy and rainy weather results in more diffuse solar radiation. Our analysis suggested that the neglect of the adjacency effect in traditional one-dimensional radiative transfer models (RTMs) accounts for the underestimation in southern China. The spatial adjacency effect is caused by photons that are reflected by the surface outside the field of view and then scattered by the atmosphere into the field of view. Ideally, three-dimensional RTMs are effective in simulating nonlocal cloud shadows, reflections from cloud sides, and enhancement of diffuse radiation, and can thus resolve spatial adjacency effects.

However, three-dimensional simulation is too time-consuming for the fast retrieval of surface solar radiation with global coverage from satellite data. We therefore set out to develop a new algorithm that retrieves surface solar radiation from satellite data efficiently and can handle spatial adjacency effects. Fortunately, the convolutional neural network (CNN) offered a promising solution.

It then took us about two years to develop an effective and efficient CNN-based deep network for solar radiation estimation. We did not realize how difficult the work would be until we had tried nearly all the classical deep networks proposed for natural image processing and still had not found an ideal model. We began to explore other potential solutions, including adjusting the network structure, embedding residual blocks or multi-scale convolution modules, increasing or decreasing the number of convolution kernels, deepening or widening the network, and trying various input-output combinations. This process was a big test of our patience and perseverance, because after any small adjustment of the network we had to wait at least half a day for the result. Sometimes we designed a model or adjusted a parameter, trained the new network for a day or several days, and found that the output was far from our expectations; at such times we really wanted to give up, and even suspected that the idea itself was wrong and could not succeed at all.

After much trial and error, we built a simple but effective hybrid model, ResnetTL, which relies on a residual deep network to extract spatial patterns from satellite image blocks and on a multi-layer perceptron to link time/location information and the spatial patterns to the target outputs. This network was purely an accidental success after constant attempts. However, in-depth investigation revealed that a simple model can be surprisingly powerful: it accelerates the computation, improves data accuracy, integrates spatial information and models nonlinear relationships at the same time, and provides a bridge to estimating other solar radiation components through transfer learning.
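To make the two-branch idea concrete, here is a minimal pure-Python sketch of a forward pass: one residual block summarizes a satellite image patch, and a small perceptron combines that summary with time/location features. This is not the released ResnetTL implementation; the single-channel patch, the single residual block, the global-mean pooling, the choice of auxiliary features, the layer sizes and the untrained random weights are all assumptions made for illustration.

```python
import random


def conv3x3(patch, kernel):
    """Zero-padded 3x3 convolution on a 2-D list; output has the input's shape."""
    h, w = len(patch), len(patch[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += patch[y][x] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def relu2d(m):
    return [[max(0.0, v) for v in row] for row in m]


def residual_block(patch, k1, k2):
    """relu(x + conv(relu(conv(x)))): the skip connection adds the input back."""
    inner = conv3x3(relu2d(conv3x3(patch, k1)), k2)
    summed = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(patch, inner)]
    return relu2d(summed)


def global_mean(m):
    """Collapse the feature map to one scalar (assumed pooling choice)."""
    return sum(sum(row) for row in m) / (len(m) * len(m[0]))


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def forward(patch, aux, params):
    """Spatial branch (residual block) + perceptron over [spatial, time/location]."""
    spatial = global_mean(residual_block(patch, params["k1"], params["k2"]))
    features = [spatial] + list(aux)
    hidden = [max(0.0, v) for v in dense(features, params["w1"], params["b1"])]
    return dense(hidden, params["w2"], params["b2"])[0]


def random_params(n_aux, n_hidden, seed=0):
    """Untrained placeholder weights; a real model would learn these."""
    rng = random.Random(seed)

    def kern():
        return [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]

    return {
        "k1": kern(),
        "k2": kern(),
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(1 + n_aux)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]],
        "b2": [0.0],
    }


# Hypothetical inputs: a 5x5 patch of normalized satellite values and three
# auxiliary features (for example a time-of-day term, latitude, longitude).
patch = [[0.1 * (i + j) for j in range(5)] for i in range(5)]
aux = [0.5, 0.3, 0.8]
estimate = forward(patch, aux, random_params(n_aux=3, n_hidden=4))
```

Because the convolution looks at the pixels around the target location, the spatial branch can in principle pick up influences from neighbouring pixels, which is the property that motivated using a CNN in the first place.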

Based on the developed model, we have generated and released the hourly 5-km surface total and diffuse solar radiation dataset in China (2007-2018). It provides gridded surface global and diffuse solar radiation in China at a spatial resolution of 0.05°. Both the directly estimated hourly values and the integrated daily and monthly totals are available. The figures below show the diurnal variability of hourly global solar radiation on June 21, 2018. Validation at China Meteorological Administration solar radiation stations showed that the root-mean-square error in 2007 is about 0.32 MJ/m2 (90 W/m2), 2.14 MJ/m2 and 1.30 MJ/m2 at hourly, daily and monthly scales, respectively.

Diurnal variability of hourly global solar radiation from UTC 0:00 to 11:00, June 21, 2018.
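For readers who want to relate the validation figures quoted above to their own station comparisons, the following short sketch shows how a root-mean-square error is computed and how an hourly energy total in MJ/m2 converts to a mean power in W/m2. The function names are hypothetical, and the sketch is only an illustration of the standard definitions, not the validation code used for the dataset.

```python
import math

SECONDS_PER_HOUR = 3600


def mj_per_m2_hour_to_w_per_m2(energy_mj):
    """Mean power (W/m2) equivalent to an energy total (MJ/m2) received in one hour."""
    return energy_mj * 1e6 / SECONDS_PER_HOUR


def rmse(estimated, observed):
    """Root-mean-square error between paired estimates and observations."""
    if len(estimated) != len(observed) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - o) ** 2 for e, o in zip(estimated, observed)]
    return math.sqrt(sum(squared) / len(squared))


# The hourly RMSE quoted in the text, expressed as mean power over the hour.
hourly_rmse_w = mj_per_m2_hour_to_w_per_m2(0.32)
```

Under this conversion, 0.32 MJ/m2 over one hour is roughly 89 W/m2, consistent with the rounded 90 W/m2 given in the text.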

Our datasets can be used on their own to analyze the regional characteristics and temporal trends of solar radiation, and richer studies and applications become possible when they are linked to other data resources. We suggest pairing them with the open-source Global Solar Energy Estimator (GSEE) model to estimate solar energy in China accurately and to support policy-making in the energy sector. They can also be used to drive plant models (e.g., JULES, YIB, SWAP) for crop yield estimation.

In the near future, we are going to carry out solar energy applications based on the released product. We also hope that our products will serve as a key data source for scientific research and industrial applications in this field and bring real convenience and value to users. Meanwhile, our work continues: new ideas such as the convolutional long short-term memory network will be introduced into the developed network to account for lag and cumulative effects in the time series when estimating instantaneous solar radiation.