Deriving the four winter weather regimes
Why look at regimes?
Weather regimes are quasi-stationary, persistent large-scale circulation patterns that influence surface weather (temperature, wind speed, cloud cover) and, through it, renewable energy generation and electricity demand. They add structure at the sub-seasonal range and support consistent discussions of risk and planning. Because their identification is sensitive to analysis choices, methodological consistency is important. For more information on weather regimes, refer to the ECMWF Newsletter 165 [1].
Scope and data
- Dataset: ERA5 reanalysis
- Region: 90°W–30°E, 20–80°N (Euro-Atlantic region)
- Time period: Winters from 1979 to 2018
- Temporal and spatial resolution: Daily data at 1° resolution
- Field: 500 hPa geopotential height. ERA5 provides geopotential, so it may need to be divided by the standard acceleration of gravity (9.80665 m/s²) to obtain height. Other studies also use the 700 hPa level or mean sea level pressure (MSLP)
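The geopotential-to-height conversion and the winter selection listed above can be sketched with NumPy alone. The helper names `geopotential_to_height` and `winter_mask`, the December to February definition of winter, and the synthetic input are assumptions made for illustration; the text only states that winters from 1979 to 2018 are used.

```python
import numpy as np

G0 = 9.80665  # standard acceleration of gravity in m s^-2


def geopotential_to_height(phi):
    """Convert geopotential (m^2 s^-2) to geopotential height (m)."""
    return np.asarray(phi, dtype=float) / G0


def winter_mask(times, months=(12, 1, 2)):
    """Boolean mask selecting days whose calendar month is in `months`.

    `times` is an array of numpy datetime64 values. The December-February
    default is an assumption, not something the text specifies.
    """
    times = np.asarray(times, dtype="datetime64[D]")
    # Months since 1970-01, folded into the calendar month number 1..12
    month = times.astype("datetime64[M]").astype(int) % 12 + 1
    return np.isin(month, months)


# Synthetic example: one year of daily geopotential at a single grid point
times = np.arange("2000-01-01", "2001-01-01", dtype="datetime64[D]")
phi = np.full(times.size, 54000.0)  # constant placeholder value
z500 = geopotential_to_height(phi)[winter_mask(times)]
```

With real ERA5 data the same division applies to the whole field, and the mask would be applied along the time dimension.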
Method
- Pre-processing: Compute daily anomalies (remove winter mean) and apply latitude-cosine weighting
- PCA: Reduce dimensionality; keep 14 components to preserve close to 90% of variance and suppress small-scale noise
- Clustering: k-means with k=4, initialized from the first four PCs; each day is assigned to its nearest cluster
- Back-projection: Invert PCA and remove the latitude weighting to obtain regime centers in regular map space
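One way to check how many components are needed for a given variance target is to look at the cumulative explained variance. The sketch below uses a plain SVD on synthetic data; the helper `n_components_for_variance`, its 0.90 default, and the random field are illustrative assumptions. The value of 14 quoted above comes from the actual ERA5 analysis, not from this example.

```python
import numpy as np


def n_components_for_variance(X, target=0.90):
    """Smallest number of principal components whose cumulative
    explained-variance fraction reaches `target`.

    X has shape (n_days, n_gridpoints) and is assumed to already hold
    latitude-weighted values.
    """
    X = X - X.mean(axis=0)                   # centre each grid point in time
    s = np.linalg.svd(X, compute_uv=False)   # singular values, descending
    explained = s**2 / np.sum(s**2)          # variance fraction per component
    cumulative = np.cumsum(explained)
    # First position where the cumulative fraction meets the target
    n_keep = min(int(np.searchsorted(cumulative, target)) + 1, cumulative.size)
    return n_keep, cumulative


# Synthetic field: 3 strong patterns plus weak noise on a 10 x 20 grid
rng = np.random.default_rng(0)
lat = np.linspace(20.0, 80.0, 10)
weights = np.repeat(np.cos(np.radians(lat)), 20)  # one weight per grid point
patterns = rng.standard_normal((3, 200))
amplitudes = rng.standard_normal((500, 3)) * np.array([10.0, 6.0, 4.0])
field = amplitudes @ patterns + 0.1 * rng.standard_normal((500, 200))
n_keep, cumulative = n_components_for_variance(field * weights)
```

Because the synthetic field is built from three patterns, at most three components are needed here; real geopotential height fields spread their variance over many more.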
Outputs
- Regime maps: Four cluster-center anomaly fields describing typical large-scale flow states
- Daily diagnostics: Distances from each day's field to each regime center, useful as a “strength of match” metric. Often, only the regime closest to the daily-mean field of a given day is retained
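The closest-cluster assignment mentioned above follows directly from the distance matrix. The sketch below assumes a NumPy array `dist` of shape (days, regimes), such as the one returned by scikit-learn's `KMeans.fit_transform`; the helper name `assign_regimes` and the toy distances are hypothetical.

```python
import numpy as np


def assign_regimes(dist):
    """Return the nearest-regime index per day and each regime's frequency.

    dist: array of shape (n_days, n_regimes) with distances to regime centers.
    """
    dist = np.asarray(dist, dtype=float)
    labels = dist.argmin(axis=1)                          # closest regime per day
    counts = np.bincount(labels, minlength=dist.shape[1])
    return labels, counts / labels.size                   # fraction of days per regime


# Toy example: 5 days, 4 regimes
dist = np.array([
    [1.0, 3.0, 2.5, 4.0],
    [2.2, 0.7, 3.1, 2.9],
    [3.0, 2.8, 0.9, 3.3],
    [0.8, 2.0, 2.6, 3.7],
    [2.9, 3.4, 3.0, 1.1],
])
labels, freq = assign_regimes(dist)
print(labels)  # [0 1 2 0 3]
print(freq)    # [0.4 0.2 0.2 0.2]
```

Keeping the minimum distance alongside the label preserves the strength-of-match information that the label alone discards.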
Some factors to consider
- Results depend on domain, season, variable choice, and number of regimes; these should be kept fixed if comparisons are being made
- PCA reduces computational cost and filters smaller, non-repeating patterns, but also smooths local features; PCA may not be suitable for cases where local features are of interest
- Clusters summarize variability, but no single day will look exactly like the cluster mean
Minimal code skeleton (for reproducibility)
import xarray as xr
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
# Assume z500 is an xarray.DataArray of the 500 hPa geopotential height with dimensions time, latitude, longitude
# Apply the latitude-cosine weighting, then remove the time mean at each grid point
coslat = np.cos(np.radians(z500['latitude']))
z500 = z500 * coslat
z500 = z500 - z500.mean(dim='time')
# Flatten the two spatial dimensions so that each day becomes one row
z500_2d = z500.stack(space=('latitude', 'longitude')).transpose('time', 'space')
pca = PCA(n_components=14)
Z = pca.fit_transform(z500_2d.values)
# Start each of the four clusters on one of the first four PC axes;
# n_init=1 because the initial centers are given explicitly
kmeans = KMeans(n_clusters=4, init=np.eye(4, 14), n_init=1, verbose=1)
dist = kmeans.fit_transform(Z)
# Back-project the cluster centers from PC space to (still weighted) grid space
centers_geo = pca.inverse_transform(kmeans.cluster_centers_)
da_distance_to_centers = xr.DataArray(
    dist, dims=('time', 'cluster'),
    coords={'time': z500_2d['time'].values, 'cluster': np.arange(4)})
# Rows of centers_geo follow the (latitude, longitude) stacking order used above
da_cluster_centers = xr.DataArray(
    centers_geo.reshape(4, z500.sizes['latitude'], z500.sizes['longitude']),
    dims=('cluster', 'latitude', 'longitude'),
    coords={'cluster': np.arange(4),
            'latitude': z500['latitude'].values,
            'longitude': z500['longitude'].values})
# Remove the latitude weighting to obtain the centers in regular map space
da_cluster_centers = da_cluster_centers / coslat
# The output da_distance_to_centers is an xarray.DataArray that contains the distance (in PC space) from the z500 of each day to each of the cluster centers
# The output da_cluster_centers is an xarray.DataArray that contains the cluster centers for each of the four clusters
Sample view of the internal python workflow
References
- ECMWF Newsletter 165 (Autumn 2020): How to make use of weather regimes for extended-range predictions in Europe. https://www.ecmwf.int/en/newsletter/165/meteorology/how-make-use-weather-regimes-extended-range-predictions-europe
- van der Wiel, K., Bloomfield, H. C., Lee, R. W., Stoop, L. P., Blackport, R., Screen, J. A., & Selten, F. M. (2019). The influence of weather regimes on European renewable energy production and demand. Environmental Research Letters, 14(9), 094010. DOI: https://doi.org/10.1088/1748-9326/ab38d3