GraphCast: Learning skillful medium-range global weather forecasting
1. Introduction
Global medium-range weather forecasting is critical to decision-making across many social and economic domains. Traditional numerical weather prediction (NWP) improves forecast accuracy by using greater computational resources, but cannot directly use historical weather data to improve the underlying model. We introduce a machine-learning-based method called "GraphCast", which can be trained directly from reanalysis data. It predicts hundreds of weather variables, over 10 days at 0.25° resolution globally, in under one minute. We show that GraphCast significantly outperforms the most accurate operational deterministic system on 90% of 1380 verification targets, and that its forecasts support better prediction of severe events, including tropical cyclones, atmospheric rivers, and extreme temperatures. GraphCast is a key advance in accurate and efficient weather forecasting and helps realize the promise of machine learning for modeling complex dynamical systems.
In medium-range weather forecasting, i.e., predicting atmospheric variables up to 10 days ahead, NWP-based systems such as the IFS remain the most accurate. The world's top deterministic operational system is ECMWF's High RESolution forecast (HRES), a configuration of the IFS, which produces global 10-day forecasts at 0.1° latitude/longitude resolution in around an hour. In recent years, however, MLWP methods for medium-range forecasting have advanced steadily, aided by benchmarks such as WeatherBench. Deep learning architectures based on convolutional neural networks and Transformers have shown promising results at coarse latitude/longitude resolutions, and recent work using graph neural networks (GNNs), Fourier neural operators, and Transformers has reported performance beginning to approach that of the IFS for a small number of variables and lead times up to 7 days, at 1.0° and 0.25° resolution.
2. GraphCast
Here we introduce a new MLWP approach for global medium-range weather forecasting called “GraphCast”, which produces an accurate 10-day forecast in under a minute on a single Google Cloud TPU v4 device, and supports applications including predicting tropical cyclone tracks, atmospheric rivers, and extreme temperatures.
GraphCast takes as input the two most recent states of Earth’s weather—the current time and six hours earlier—and predicts the next state of the weather six hours ahead.
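This two-states-in, one-state-out interface makes longer forecasts autoregressive: each prediction is fed back as the newest input. A minimal sketch of that rollout loop, with toy shapes and a placeholder persistence model standing in for the learned network (all names and sizes here are illustrative, not from the paper):

```python
import numpy as np

# Toy dimensions: the real model uses a 0.25° grid (721 x 1440 points)
# with hundreds of variables; tiny numbers keep this sketch runnable.
GRID, VARS = 8, 3

def model_step(state_prev, state_curr):
    """Stand-in for GraphCast's learned one-step predictor, which maps the
    two most recent 6-hour states to the state six hours ahead. Here it is
    simple persistence of the current state, not the real network."""
    return state_curr.copy()

def rollout(state_prev, state_curr, num_steps):
    """Autoregressive forecast: each prediction becomes the newest input
    state. A 10-day forecast at 6-hour steps is num_steps = 40."""
    states = []
    for _ in range(num_steps):
        nxt = model_step(state_prev, state_curr)
        states.append(nxt)
        state_prev, state_curr = state_curr, nxt
    return states

forecast = rollout(np.zeros((GRID, VARS)), np.ones((GRID, VARS)), num_steps=40)
```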
GraphCast is implemented as a neural network architecture, based on GNNs in an “encode-process-decode” configuration, with a total of 36.7 million parameters. Previous GNN-based learned simulators have been very effective at learning the complex dynamics of fluid and other systems modeled by partial differential equations, which supports their suitability for modeling weather dynamics.
The encoder (Figure 1d) uses a single GNN layer to map variables (normalized to zero-mean unit-variance) represented as node attributes on the input grid to learned node attributes on an internal “multi-mesh” representation.
The multi-mesh (Figure 1g) is a graph which is spatially homogeneous, with high spatial resolution over the globe. It is defined by refining a regular icosahedron (12 nodes, 20 faces, 30 edges) iteratively six times, where each refinement divides each triangle into four smaller ones (leading to four times more faces and edges), and reprojecting the nodes onto the sphere. The multi-mesh contains the 40,962 nodes from the highest resolution mesh, and the union of all the edges created in the intermediate graphs, forming a flat hierarchy of edges with varying lengths.
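The mesh sizes follow directly from the subdivision rules: splitting every triangle into four gives V' = V + E (one new vertex per edge midpoint), E' = 2E + 3F, and F' = 4F. A short sketch computing these counts; the per-level edge-union total is derived under the assumption (consistent with the description above) that refinement levels share nodes but contribute disjoint edge sets:

```python
def mesh_counts(r):
    """Vertex, edge, and face counts for an icosahedron refined r times,
    where each refinement splits every triangle into four smaller ones."""
    v, e, f = 12, 30, 20  # regular icosahedron
    for _ in range(r):
        v, e, f = v + e, 2 * e + 3 * f, 4 * f
    return v, e, f

v6, e6, f6 = mesh_counts(6)
# v6 == 40962, matching the node count of the finest mesh quoted above.

# Multi-mesh edge count: union of the edge sets from every level M0..M6.
multi_mesh_edges = sum(mesh_counts(r)[1] for r in range(7))
```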
The processor (Figure 1e) uses 16 unshared GNN layers to perform learned message-passing on the multi-mesh, enabling efficient local and long-range information propagation with few message-passing steps.
The decoder (Figure 1f) maps the final processor layer’s learned features from the multi-mesh representation back to the latitude-longitude grid. It uses a single GNN layer, and predicts the output as a residual update to the most recent input state (with output normalization to achieve unit-variance on the target residual). See Supplements Section 3 for further architectural details.
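The GNN layers in all three stages perform learned message passing. A simplified numpy sketch of one such layer (not GraphCast's exact layer: the MLP shapes, edge features, and normalization details are omitted, and all weights here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w):
    """One-hidden-layer MLP with ReLU; stands in for the learned MLPs."""
    return np.maximum(x @ w[0], 0.0) @ w[1]

def gnn_layer(nodes, edges, w_msg, w_upd):
    """Single message-passing step: each edge computes a message from its
    endpoint features, messages are summed at the receiving node, and node
    features are updated residually."""
    senders, receivers = edges
    msg_in = np.concatenate([nodes[senders], nodes[receivers]], axis=-1)
    messages = mlp(msg_in, w_msg)
    agg = np.zeros_like(nodes)
    np.add.at(agg, receivers, messages)  # sum messages per receiver
    return nodes + mlp(np.concatenate([nodes, agg], axis=-1), w_upd)

# Toy graph: 5 nodes with 4 features each, and a chain of directed edges.
F = 4
nodes = rng.normal(size=(5, F))
edges = (np.array([0, 1, 2, 3]), np.array([1, 2, 3, 4]))
w_msg = [rng.normal(size=(2 * F, 8)), rng.normal(size=(8, F))]
w_upd = [rng.normal(size=(2 * F, 8)), rng.normal(size=(8, F))]
out = gnn_layer(nodes, edges, w_msg, w_upd)
```

The processor stacks 16 such layers (with unshared weights) on the multi-mesh; the encoder and decoder each use a single layer between the grid and the mesh.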
During model development, we used 39 years (1979–2017) of historical data from ECMWF’s ERA5 [10] reanalysis archive. As a training objective, we averaged the mean squared error (MSE) weighted by vertical level. Error was computed between GraphCast’s predicted state and the corresponding ERA5 state over 𝑁 autoregressive steps. The value of 𝑁 was increased incrementally from 1 to 12 (i.e., six hours to three days) over the course of training. GraphCast was trained to minimize the training objective using gradient descent and backpropagation. Training GraphCast took roughly four weeks on 32 Cloud TPU v4 devices using batch parallelism. See Supplements Section 4 for further training details. Consistent with real deployment scenarios, where future information is not available for model development, we evaluated GraphCast on the held out data from the years 2018 onward (see Supplements Section 5.1).
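The objective described above can be sketched as a level-weighted MSE averaged over an N-step rollout against the reference trajectory. This is a simplified sketch (the paper's full objective includes additional weightings, e.g. per-variable and spatial terms, omitted here; all shapes are illustrative):

```python
import numpy as np

def level_weighted_mse(pred, target, level_weights):
    """MSE averaged over grid points and variables, weighted by vertical
    level. Shapes: (grid, levels, vars); weights are per-level."""
    sq = (pred - target) ** 2
    return float(np.mean(sq * level_weights[None, :, None]))

def rollout_loss(model_step, states, n_steps, level_weights):
    """Training objective over N autoregressive steps: roll the model
    forward from the first two reference states and average the per-step
    error against the reference trajectory. During training, N was raised
    incrementally from 1 to 12 (six hours to three days)."""
    prev, curr = states[0], states[1]
    losses = []
    for t in range(n_steps):
        pred = model_step(prev, curr)
        losses.append(level_weighted_mse(pred, states[2 + t], level_weights))
        prev, curr = curr, pred  # feed the prediction back in
    return float(np.mean(losses))
```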
3. Results
This section is skipped for now; it compares the results of GraphCast and HRES in detail.
4. Effect of training data recency
GraphCast can be re-trained periodically with recent data, which in principle allows it to capture weather patterns that change over time, such as the ENSO cycle and other oscillations, as well as effects of climate change. We trained four variants of GraphCast with data that always began in 1979, but ended in 2017, 2018, 2019, and 2020, respectively (we label the variant ending in 2017 as “GraphCast:<2018”, etc). We compared their performances to HRES on 2021 test data.
5. Conclusion
A key limitation of our approach is how it handles uncertainty. We focused on deterministic forecasts and compared against HRES, but the other pillar of ECMWF's IFS, the ensemble forecasting system ENS, is especially important for forecasts beyond 10 days. The nonlinearity of weather dynamics means uncertainty grows with lead time, which a single deterministic forecast cannot capture well. ENS addresses this by generating multiple stochastic forecasts that model an empirical distribution over future weather, but generating many forecasts is expensive.
By contrast, GraphCast’s MSE training objective encourages it to express its uncertainty by spatially blurring its predictions, which may limit its value for some applications. Building systems that model uncertainty more explicitly is a crucial next step.
It is important to emphasize that data-driven MLWP depends critically on the large volumes of high-quality data assimilated through NWP, and that rich data sources such as ECMWF's MARS archive are invaluable. Our approach should therefore not be regarded as a replacement for traditional weather forecasting methods, which have been developed over decades, rigorously tested in many real-world settings, and offer many capabilities we have not yet explored. Rather, our work should be interpreted as evidence that MLWP is able to meet the challenges of real-world forecasting problems and has the potential to complement and improve upon current best methods.