Ensemble-size-dependence of deep-learning post-processing methods that minimize an (un)fair score: motivating examples and a proof-of-concept solution
Daily Information Dashboard · 2026-02-17
2026-02-17T18:59:55Z
Published
AI Summary
This paper shows that post-processing and deep-learning methods that minimize the aCRPS can become sensitive to ensemble size and over-dispersive when they introduce dependence between members, and it proposes trajectory transformers that robustly improve or maintain forecast reliability across different training and real-time ensemble sizes.
- The aCRPS is fair and unbiased with respect to ensemble size when members are exchangeable and conditionally independent, but structured dependence between members breaks this premise.
- Using two methods, a linear member-by-member calibration and a deep-learning model with self-attention across the ensemble dimension, the authors show that apparent aCRPS improvements can coincide with systematic unreliability (over-dispersion).
- Trajectory transformers are proposed: the PoET approach is adapted to apply self-attention along the lead-time dimension instead, avoiding dependence in the ensemble dimension and preserving the conditional independence that the aCRPS requires.
- On weekly-mean 2 m temperature forecasts from the ECMWF subseasonal system, the method reduces systematic biases and maintains or improves reliability across training ensemble sizes (3 vs. 9 members) and real-time ensemble sizes (9 vs. 100 members).
#arXiv #paper #research/paper #aCRPS #Transformer #ECMWF
Content Excerpt
Fair scores reward ensemble forecast members that behave like samples from the same distribution as the verifying observations. They are therefore an attractive choice as loss functions to train data-driven ensemble forecasts or post-processing methods when large training ensembles are either unavailable or computationally prohibitive. The adjusted continuous ranked probability score (aCRPS) is fair and unbiased with respect to ensemble size, provided forecast members are exchangeable and interpretable as conditionally independent draws from an underlying predictive distribution. However, distribution-aware post-processing methods that introduce structural dependency between members can violate this assumption, rendering aCRPS unfair. We demonstrate this effect using two approaches designed to minimize the expected aCRPS of a finite ensemble: (1) a linear member-by-member calibration, which couples members through a common dependency on the sample ensemble mean, and (2) a deep-learning method, which couples members via transformer self-attention across the ensemble dimension. In both cases, the results are sensitive to ensemble size and apparent gains in aCRPS can correspond to systematic unreliability characterized by over-dispersion. We introduce trajectory transformers as a proof-of-concept that ensemble-size independence can be achieved. This approach is an adaptation of the Post-processing Ensembles with Transformers (PoET) framework and applies self-attention over lead time while preserving the conditional independence required by aCRPS. When applied to weekly mean $T_{2m}$ forecasts from the ECMWF subseasonal forecasting system, this approach successfully reduces systematic model biases whilst also improving or maintaining forecast reliability regardless of the ensemble size used in training (3 vs 9 members) or real-time forecasts (9 vs 100 members).
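To make the fairness property concrete, here is a minimal sketch (not the paper's code; function names are ours) of the two standard kernel-form CRPS estimators for a finite ensemble. The only difference is the normalization of the spread term: the naive estimator divides by 2m², while the adjusted (fair) estimator divides by 2m(m-1), which removes the ensemble-size bias when members are exchangeable, conditionally independent draws.

```python
import numpy as np

def crps_naive(ens, obs):
    # Naive empirical CRPS for an m-member ensemble:
    #   mean_i |x_i - y|  -  (1 / (2 m^2)) * sum_{i,j} |x_i - x_j|
    # Its expectation depends on m, so it is biased in ensemble size.
    m = len(ens)
    skill = np.mean(np.abs(ens - obs))
    spread = np.abs(ens[:, None] - ens[None, :]).sum() / (2 * m * m)
    return skill - spread

def crps_fair(ens, obs):
    # Adjusted (fair) CRPS: the spread term is normalized by 2 m (m - 1)
    # instead, making the estimator unbiased with respect to ensemble
    # size, provided members are exchangeable and conditionally
    # independent draws from the predictive distribution.
    m = len(ens)
    skill = np.mean(np.abs(ens - obs))
    spread = np.abs(ens[:, None] - ens[None, :]).sum() / (2 * m * (m - 1))
    return skill - spread
```

For a two-member ensemble `[0.0, 1.0]` verifying at `0.5`, the naive estimator gives 0.25 while the fair estimator gives 0.0, which is the expected score of the true distribution that the two members were drawn from; this is the ensemble-size correction the abstract relies on.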
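The member-by-member coupling that breaks this assumption can also be sketched. The form below is an illustrative linear member-by-member adjustment (coefficients `a`, `b`, `c` are hypothetical, and this is not claimed to be the paper's exact formulation): each output member is rebuilt from the sample ensemble mean plus a rescaled anomaly, so every calibrated member depends on all input members through that shared mean.

```python
import numpy as np

def mbm_calibrate(ens, a, b, c):
    # Illustrative linear member-by-member calibration: shift/scale the
    # sample ensemble mean and stretch each member's anomaly about it.
    # Because the sample mean enters every output member, the calibrated
    # members are coupled and no longer conditionally independent.
    mean = ens.mean()
    return a + b * mean + c * (ens - mean)
```

It is exactly this shared dependence on the sample mean that invalidates the conditional-independence premise of the aCRPS, so minimizing the aCRPS of the calibrated finite ensemble can reward over-dispersion rather than genuine reliability.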