Monday, 20 October 2025

Why Integrating More Solar Farms in Ireland Matters


As countries work to replace fossil fuels with renewable energy, they are hitting a major problem: sometimes the sun does not shine and the wind does not blow at the same time. These periods of low renewable generation, sometimes called "renewable energy droughts" or "Dunkelflaute", can leave the power grid vulnerable at critical times.

These droughts are not rare weather events. They happen regularly, and power system operators need to plan for them carefully. The more Ireland depends on wind and solar, the more crucial it becomes to predict when these droughts will occur, how long they will last, and how severe they might get.

What We Did

To understand renewable energy droughts better, we used 45 years of historical weather data to model how much wind and solar power would have been generated across Ireland under a given installed mix of wind and solar farms. Energy scientists typically answer these kinds of questions with general models built from data across all of Europe.

But we wondered: how different would Ireland's results be if we used a model specifically tailored to Irish weather and based on actual farm locations? We compared both approaches and then analysed two scenarios: Ireland's current situation (where wind power makes up 91% of renewable capacity and solar power just 9%, as shown in the image below), and a more balanced future scenario (with roughly equal wind and solar capacity).

Map of wind (green) and solar (orange) farms in the Republic of Ireland and Northern Ireland. Most of the wind farms are located in the western part of the country, whereas solar farms appear mostly on the eastern side. The size of each dot represents the maximum amount of electricity that the farm can generate.

What we found

Our tailored Irish model revealed significantly more renewable energy droughts than the generic European model. This gap highlights a critical point: if power planners use the wrong model, they could seriously underestimate how vulnerable the grid is during drought periods.

When we modelled a transition to a more balanced energy mix with much more solar capacity, the results were striking. The number of renewable energy drought events dropped by roughly 50%. A five-day drought that occurs every six months under today's wind-heavy system would happen only once every four years in a more diversified system.

This improvement happens because wind and solar are naturally complementary. In Ireland and much of Northern Europe, wind tends to be stronger during winter, while solar generation peaks during summer. By mixing both sources, the energy system produces more consistent power throughout the year.


The chart above shows the monthly percentage of hours when renewable generation falls to dangerous levels (red: tailored model, purple: generic model), comparing Ireland's current wind-dominated system (left) with a more balanced scenario (right). Notice how the peaks smooth out and the overall percentage of problematic hours drops significantly.

If you are interested in understanding more about this, feel free to read the article available online.

Tuesday, 14 October 2025

SOM for winter weather regimes

Deriving four winter weather regimes with Self-Organising Maps (SOM)

Summary. Self-Organising Maps (SOM) are a simple neural-network method, used here to extract winter weather regimes by grouping daily circulation patterns over the Euro-Atlantic region. The approach is straightforward to apply, but results are sensitive to a few key choices (domain, variables, normalisation, map size, and training parameters). Below we explain how SOMs work and apply them to the identification of the four winter weather regimes.

Why weather regimes matter

Weather regimes are recurrent, persistent large-scale flow patterns. They evolve more slowly than day-to-day weather, which gives them predictive value at week-ahead timescales and ties them directly to surface impacts, such as renewable energy generation or electricity demand (refer to [1] for more details). A general understanding of these regimes can, for instance, help with the planning and management of the electricity system.

What are Self-Organising Maps?

Self-Organising Maps (SOM, [2]) consist of a 2D grid of “neurons”. Each neuron is initialised with a vector of the same dimension as the input (in our case, the 500 hPa geopotential height anomaly field). Then, for each input pattern x, the SOM is trained in two steps:

  1. Find the Best Matching Unit (BMU): the BMU is the neuron whose vector is closest to x (usually measured with the Euclidean distance).
  2. Update the BMU and its neighbours: neurons are pulled towards the new sample. How much each neuron moves depends on two factors: the learning rate, and its distance on the grid to the BMU. The update is strongest for the BMU and weakens with distance from it.
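
To make these two steps concrete, here is a minimal numpy sketch of a single training iteration on a 2×2 map. This is purely illustrative: the names (weights, x, lr, radius, n_features) are placeholders, and in practice the MiniSom-based code further below handles this loop internally.

import numpy as np

# Placeholder setup: weights is a (2, 2, n_features) array of neuron vectors, x is one daily input pattern
rng = np.random.default_rng(42)
n_features = 100                       # placeholder dimensionality
weights = rng.standard_normal((2, 2, n_features))
x = rng.standard_normal(n_features)
lr, radius = 0.5, 1.0                  # learning rate and neighbourhood radius (both decay during training)

# Step 1: find the Best Matching Unit (smallest Euclidean distance to x)
dists = np.linalg.norm(weights - x, axis=-1)
bmu = np.unravel_index(np.argmin(dists), dists.shape)

# Step 2: move every neuron towards x, weighted by a Gaussian of its grid distance to the BMU
ii, jj = np.meshgrid(np.arange(2), np.arange(2), indexing='ij')
grid_dist2 = (ii - bmu[0])**2 + (jj - bmu[1])**2
h = np.exp(-grid_dist2 / (2 * radius**2))          # neighbourhood function
weights += lr * h[..., None] * (x - weights)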

This process is repeated over the entire dataset; as training proceeds, the neurons become more representative of the data and nearby neurons come to represent similar patterns, preserving the topology of the input space. After training, the standard weather regime treatment can be applied:

  • Each day is assigned to a weather regime through the time series of BMUs
  • The pattern of each weather regime is obtained as the mean of all daily fields assigned to its BMU

Important note: The result obtained from a SOM is inherently tied to its configuration: grid size (in our case, 2×2 to target four regimes), initialisation (random vs. PCA), neighbourhood radius and how it decreases with iterations, learning rate and how it decreases with iterations, and the total number of iterations. Unlike many other algorithms, increasing the number of iterations does not necessarily improve the SOM results, and can actually lead to overfitting. The specific setup of all of these parameters is key to the final result. For further information on the application of SOM, refer to [3].

Data and domain

  • Dataset: ERA5 reanalysis
  • Variable: 500 hPa geopotential height (z500)
  • Region: 90°W–30°E, 20°–80°N
  • Period: Winters 1991–2020 (daily means)
  • Grid: 1° spacing (downloaded directly at this resolution)
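
As a hedged sketch of how this data could be loaded and subset with xarray (the file name era5_z500_daily.nc and the variable name z500 are assumptions; adjust both to match your own download):

import xarray as xr

# Assumed local NetCDF file with daily-mean 500 hPa geopotential height on a 1 degree grid
ds = xr.open_dataset('era5_z500_daily.nc')
da = ds['z500']

# Euro-Atlantic domain: 90W-30E, 20-80N
# (assumes longitudes in the -180 to 180 convention and latitudes stored north-to-south, as in ERA5)
da = da.sel(longitude=slice(-90, 30), latitude=slice(80, 20))

# Keep winter days in 1991-2020 (DJF here, one possible definition of the winter season)
da = da.sel(time=slice('1991-01-01', '2020-12-31'))
da_z500 = da.sel(time=da['time'].dt.month.isin([12, 1, 2]))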

Pre-processing

  1. Area weighting: multiply each grid point by the cosine of its latitude to account for the decreasing grid-cell area toward the poles.
  2. Anomalies: subtract the winter-mean at each grid point.
  3. Reshape: stack the 2-D field to a 1-D vector per day (as per algorithm requirements).
  4. Standardise: z-score each grid point over time (to avoid biases in the SOM).

SOM setup

When setting up the SOM, a few parameters need to be chosen. The size of the SOM is determined by how many neurons (equivalent to clusters in this case) we aim to obtain (four for the winter weather regimes). However, several other parameters determine the specifics of the SOM. Here is an overview, and our selection for each:

  • sigma - initial neighbourhood radius; it decays to 0 with iterations
  • learning rate - rate at which the neurons are updated, decays to 0 with iterations
  • neighbourhood function - how the influence of a sample decays with the distance from its BMU
  • number of iterations - total number of iterations, in our case 5000 iterations account for roughly three passes of the whole dataset

Note that a random seed was used to ensure that results can be reproduced. 

Code: from ERA5 to a trained 2×2 SOM

The code in this example is structured for use with the minimal SOM package MiniSom [4].

import xarray as xr
import numpy as np
import pandas as pd
from minisom import MiniSom
from sklearn.preprocessing import StandardScaler

# Assume da_z500 contains the winter z500 data (time, latitude, longitude) for the entire domain; compute anomalies by removing the time mean
da_anom = da_z500 - da_z500.mean(dim='time')
# Calculate area weighting
da_weighted = da_anom * np.cos(np.radians(da_anom['latitude']))

# Reshape the data to [samples, features]
X = da_weighted.stack(points=("latitude","longitude")).transpose("time","points").values

# Standardise features across time
scaler = StandardScaler(with_mean=True, with_std=True)
X_std = scaler.fit_transform(X)

# Train 2x2 SOM (four regimes)
som = MiniSom(x=2, y=2, input_len=X_std.shape[1], sigma=1.0, learning_rate=0.5,
              neighborhood_function='gaussian', random_seed=42)
som.random_weights_init(X_std)
som.train_random(X_std, num_iteration=5000)

# Assign each day to its BMU
bmus = np.array([som.winner(x) for x in X_std])  # array of (i,j) pairs
labels = np.ravel_multi_index((bmus[:,0], bmus[:,1]), dims=(2,2))  # 0..3

# The average field of each regime can be recovered as the mean of all daily fields sharing a label (see the sketch below)
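
Expanding on the final comment above, one possible way to recover the composite pattern of each regime is sketched below (reusing the da_anom and labels variables defined in the code block; this is an illustration, not necessarily the exact composite used for the figures):

# Mean anomaly field over all days assigned to each neuron (regime composite)
regime_composites = []
for r in range(4):
    comp = da_anom.isel(time=(labels == r)).mean(dim='time')
    regime_composites.append(comp)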
 

Alternative workflow: large SOM for mapping + k-means to 4 regimes

A more typical use of SOM is as a dimensionality-reduction tool that preserves topology. One approach taken in the literature is therefore to train a large SOM (such as 10×10 or 20×20, see [5] for an example) to discretise the state space. The neurons are then used as input to k-means to obtain four clusters, and each day inherits its regime via its BMU’s cluster.

Some additions to the code are required for this application: 

from sklearn.cluster import KMeans

# Train 10x10 SOM
som_big = MiniSom(x=10, y=10, input_len=X_std.shape[1], sigma=1.0, learning_rate=0.5,
                  neighborhood_function='gaussian', random_seed=42)
som_big.random_weights_init(X_std)
som_big.train_random(X_std, num_iteration=10000)

# Collect neuron codebook vectors and cluster to 4 regimes
codebook = som_big.get_weights().reshape(100, -1)
km = KMeans(n_clusters=4, random_state=42, n_init="auto").fit(codebook)
node_regime = km.labels_.reshape(10,10)

# Day-level labels: regime of each day’s BMU
bmus_big = np.array([som_big.winner(x) for x in X_std])
labels4 = np.array([node_regime[i,j] for i,j in bmus_big])

# Composites by regime (as before, averaging the unweighted anomalies, so no weighting needs to be undone)
regime_composites_km = []
for r in range(4):
    comp = da_anom.isel(time=(labels4 == r)).mean(dim='time')
    regime_composites_km.append(comp)
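
As a small follow-up, the labels can also be used to summarise how often each regime occurs and how long it tends to persist. The sketch below reuses labels4 from above and ignores the gaps between winters, so spell lengths at winter boundaries are only approximate:

# Fraction of days assigned to each regime
freq = np.bincount(labels4, minlength=4) / len(labels4)

# Mean spell length per regime (consecutive samples; winter boundaries are ignored in this simple sketch)
breaks = np.flatnonzero(np.diff(labels4) != 0) + 1
spells = np.split(labels4, breaks)
mean_spell = {r: np.mean([len(s) for s in spells if s[0] == r]) for r in range(4)}
print(freq, mean_spell)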
 

Results at a glance

Both approaches are able to recover the four Euro-Atlantic winter weather regimes, at least to some extent. The exact spatial detail and regime frequencies depend on the training setup and analysis window. By construction, the approach combining SOM and k-means gives results closer to the standard method that uses PCA in combination with clustering.

 

Figure 1: 500 hPa geopotential height anomalies representative of the four winter weather regimes as obtained directly from the SOM. The names assigned to the regimes reflect their similarity to the four traditional winter weather regimes.
 

 

Figure 2: 500 hPa geopotential height anomalies representative of the four winter weather regimes as obtained from the application of a 10 by 10 SOM, followed by k-means clustering.

Using SOM in conjunction with k-means clustering leads to very similar results to the standard winter weather regimes (https://conorsweeneyucd.blogspot.com/2025/10/deriving-four-winter-weather-regimes.html). Using the SOM alone, however, produces well-defined NAO Negative and Scandinavian Blocking regimes, but the NAO Positive and Atlantic Ridge regimes are confused with each other, and neither shows a clear signal in the regimes derived from the SOM alone.

Sensitivity and limitations

  • No guaranteed optimum: increasing the number of iterations does not guarantee “the” answer; too many iterations can overfit to single days.
  • Hyperparameters matter: grid size, neighbourhood radius and its decay schedule, learning rate and its decay schedule, or initialisation, among others, can change the results of the SOM.
  • Domain choices matter: spatial domain, season definition (e.g., DJF vs. NDJFM), temporal resolution (daily vs. 6-hourly or 3-hourly), variable (z500 vs. MSLP), and detrending choices can all shift patterns and frequencies.

Practical pointers

  • Run short sensitivity sweeps (e.g., seeds, iterations ±20%, radius schedule, or even larger SOM networks) to compare regime stability (a minimal seed sweep is sketched after this list).
  • Validate on sub-periods (e.g., 1991–2005 vs. 2006–2020) to confirm persistence.
  • Keep daily labels to study links to relevant surface-level features
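
As one minimal illustration of such a sweep, the sketch below (reusing X_std from the training code above) trains the same 2×2 SOM with two different random seeds and compares the resulting day labels with the adjusted Rand index; values close to 1 indicate stable regime assignments. The seed values are arbitrary examples.

import numpy as np
from minisom import MiniSom
from sklearn.metrics import adjusted_rand_score

def train_labels(seed):
    # Same configuration as the main experiment; only the seed changes
    som = MiniSom(x=2, y=2, input_len=X_std.shape[1], sigma=1.0, learning_rate=0.5,
                  neighborhood_function='gaussian', random_seed=seed)
    som.random_weights_init(X_std)
    som.train_random(X_std, num_iteration=5000)
    bmus = np.array([som.winner(x) for x in X_std])
    return np.ravel_multi_index((bmus[:, 0], bmus[:, 1]), dims=(2, 2))

labels_a, labels_b = train_labels(42), train_labels(7)
print("Label agreement between seeds:", adjusted_rand_score(labels_a, labels_b))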

Reproducibility checklist

  • Random seeds are fixed for initialisation and k-means.
  • Training parameters are tracked (iterations, learning rate and its decay, neighbourhood function and its decay).
  • Domain, variable, season definition, and anomaly/climatology window are registered.
  • Exact ERA5 file versions and preprocessing steps are recorded.

References

  1. How to make use of weather regimes in extended-range predictions for Europe (ECMWF Newsletter 165 - Autumn 2020).
  2. T. Kohonen (2001). Self-Organizing Maps, Springer. https://doi.org/10.1007/978-3-642-56927-2
  3. Self-Organizing Maps (DataCamp tutorial).
  4. MiniSom (simple Python SOM implementation).
  5. Ohba, M., Kanno, Y., & Nohara, D. (2022). Climatology of dark doldrums in Japan. Renewable and Sustainable Energy Reviews, 155, 111927. https://doi.org/10.1016/j.rser.2021.111927

Thursday, 9 October 2025

Deriving the four winter weather regimes



Summary. The large-scale atmospheric circulation in the Euro-Atlantic region is characterised by repeating patterns known as weather regimes. This post explains how they were obtained from reanalysis data using a simple process: remove seasonal means, apply PCA to daily fields, then cluster with k-means into the four standard winter weather regimes. The result is a set of four weather regimes and the distance of every day to each of them.

Why look at regimes?

Regimes are quasi-stationary, persistent large-scale patterns that affect surface weather (temperature, wind speed, presence or absence of clouds), which in turn drives renewable energy generation and electricity demand. They add structure at the sub-seasonal range and support consistent discussions of risk and planning. Their identification is sensitive to analysis choices, so methodological consistency is important. For more information on weather regimes, refer to the ECMWF Newsletter 165 [1].

Scope and data

  • Dataset: ERA5 reanalysis
  • Region: 90°W–30°E, 20–80°N (Euro-Atlantic region)
  • Time period: Winters from 1979 to 2018
  • Temporal and spatial resolution: Daily data at 1° resolution
  • Field: 500 hPa geopotential height, which may need to be computed from geopotential (the variable available in ERA5) by dividing it by the standard acceleration due to gravity (other studies also use 700 hPa or MSLP)
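
As a small sketch of that conversion (assuming the raw ERA5 geopotential has already been loaded into an xarray.DataArray called z, a placeholder name):

# ERA5 geopotential is in m^2 s^-2; divide by standard gravity to obtain geopotential height in metres
G0 = 9.80665  # standard acceleration due to gravity (m s^-2)
z500 = z / G0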

Method

  1. Pre-processing: Compute daily anomalies (remove winter mean) and apply latitude-cosine weighting
  2. PCA: Reduce dimensionality; keep 14 components to preserve close to 90% of variance and suppress small-scale noise
  3. Clustering: k-means with k=4, initialised from the first four PCs, assigns each day to its nearest cluster
  4. Back-projection: Invert PCA and remove the latitude weighting to obtain regime centers in regular map space
 

 
Figure 1. Cluster-center 500 hPa geopotential height anomalies for the four winter weather regimes (1979–2018) based on the representation in Van der Wiel et al. (2019) [2].

Outputs

  • Regime maps: Four cluster-center anomaly fields describing typical large-scale flow states
  • Daily diagnostics: Distances from each day to each regime center; useful as a “strength of match” metric. Often, the only factor considered is which cluster center is closest to a given day’s field

Some factors to consider

  • Results depend on domain, season, variable choice, and number of regimes; these should be kept fixed if comparisons are being made
  • PCA reduces computational cost and filters smaller, non-repeating patterns, but also smooths local features; PCA may not be suitable for cases where local features are of interest
  • Clusters summarize variability, but no single day will look like the cluster mean

Minimal code skeleton (for reproducibility) 

import xarray as xr
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Assume z500 is an xarray.DataArray of the 500hPa geopotential height with dimensions time, latitude, longitude
z500 = z500 * np.cos(np.radians(z500['latitude']))
z500 = z500 - z500.mean(dim='time')
z500_2d = z500.stack(space=('longitude','latitude'))

pca = PCA(n_components=14)
Z = pca.fit_transform(z500_2d)

kmeans = KMeans(n_clusters=4, init=[
  [1,0,0,0,0,0,0,0,0,0,0,0,0,0],
  [0,1,0,0,0,0,0,0,0,0,0,0,0,0],
  [0,0,1,0,0,0,0,0,0,0,0,0,0,0],
  [0,0,0,1,0,0,0,0,0,0,0,0,0,0]], verbose=1)

dist = kmeans.fit_transform(Z)
centers_geo = pca.inverse_transform(kmeans.cluster_centers_)

da_distance_to_centers = xr.DataArray(dist,
    coords={'time': z500_2d['time'], 'cluster': np.arange(4)},
    dims=('time', 'cluster'))
da_cluster_centers = xr.DataArray(centers_geo,
    coords={'cluster': np.arange(4), 'space': z500_2d['space']},
    dims=('cluster', 'space')).unstack()

# Undo the latitude weighting applied during pre-processing
da_cluster_centers = da_cluster_centers / np.cos(np.radians(da_cluster_centers['latitude']))

# The output da_distance_to_centers is an xarray.DataArray that contains the distance from the z500 field of each day to each of the cluster centers
# The output da_cluster_centers is an xarray.DataArray that contains the cluster centers for each of the four clusters
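
As a brief follow-up sketch using the outputs above, the closest regime for each day and the resulting regime frequencies could be obtained as follows:

# Nearest regime for each day, and the fraction of days assigned to each regime
daily_regime = da_distance_to_centers.argmin(dim='cluster')
regime_freq = np.bincount(daily_regime.values, minlength=4) / daily_regime.size
print(regime_freq)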

Sample view of the internal python workflow

References

  1. ECMWF Newsletter 165 (Autumn 2020): How to make use of weather regimes for extended-range predictions in Europe. https://www.ecmwf.int/en/newsletter/165/meteorology/how-make-use-weather-regimes-extended-range-predictions-europe ↩︎
  2. van der Wiel, K., Bloomfield, H. C., Lee, R. W., Stoop, L. P., Blackport, R., Screen, J. A., & Selten, F. M. (2019). The influence of weather regimes on European renewable energy production and demand. Environmental Research Letters, 14(9), 094010. DOI: https://doi.org/10.1088/1748-9326/ab38d3 ↩︎

Wednesday, 28 May 2025

NexSys Away Day in Wicklow

 


On May 27th, NexSys held an away day at the beautiful Glenview Hotel, near the Glen of the Downs in Wicklow. The day began with an opening presentation from NexSys director, Terrence O'Donnell, who gave us an update on the group's progress and outlined the schedule for the day. The event was divided into two parts. The morning session focused on data and model integration and included group discussions led by James O'Donnell, followed by small-group talks to share insights with the wider team.
 
Following an enjoyable lunch at the hotel’s restaurant, where I had the opportunity to engage in discussions with senior researchers from NexSys, the afternoon session was dedicated to developing greater research collaboration. Led by Michelle Carey from the School of Mathematics and Statistics at UCD, the session allowed us to learn about each other’s work, share challenges, and identify ways to support one another’s research. For instance, during my discussion, I learned that a fellow PhD student was developing a power system model that incorporated electricity prices under different scenarios. He mentioned his need for data on solar and wind generation over extended periods, something that I could provide based on my own work.
 
To close out the day, we embarked on a guided hike through the stunning Glen of the Downs, offering a perfect opportunity to unwind and connect with colleagues. The day concluded with a lovely BBQ, leaving us all with new insights and strengthened connections among the researchers in the NexSys group.