Artificial Intelligence

Machine learning can be defined as the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyse and draw inferences from patterns in data. Deep learning (DL; LeCun et al, 2015) is a type of machine learning based on artificial neural networks in which multiple layers of processing are used to extract progressively higher-level features from data. ARCAFF utilises end-to-end DL models to significantly improve upon traditional flare forecasting capabilities.

The rapid expansion and success of AI, and of DL in particular, across many research fields in recent years has been fuelled by three key advances:

  • Data – The creation and storage of large datasets necessary to train deep networks with millions of parameters.
  • Hardware – Development of graphics processing units (GPUs) with the compute power necessary to train deep networks in reasonable amounts of time.
  • Software – Algorithmic development of the techniques and models required to train deep networks.

DL has already been successfully developed and deployed in the weather forecasting, financial services, and health care domains, but has not been fully exploited in the solar physics domain. The large volume of available space-based solar observations is an ideal candidate for this type of analysis, given DL's effectiveness in modelling complex relationships.

Using AI in ARCAFF

The basis of ARCAFF is the development, training and deployment of a series of deep neural networks (DNNs) to deliver the project objectives. The figure below outlines the typical stages in the life cycle of a DL project, indicating the iterative nature of the major cycle. In our methodology, each ARCAFF objective is an independent but related DL project that follows each stage.

1. Data Acquisition

The first and one of the most important stages of a DL project. The scope, scale, and types of data necessary to solve the challenge need to be defined and this data then needs to be obtained and catalogued.

In the case of ARCAFF, the scope and types of data are solar observations (images), active region (AR) properties (text), and flare events (text). The scale is all available data from 1996 until the present. As of 2022 this means ~35,000 AR classifications from NOAA Solar Region Summaries and Met Office Sunspot Region Summaries, ~25,000 flares (≥ C class) from GOES X-ray observations, and millions of associated solar images (see the table below).

| Instrument | Observation type | Data availability | Archive size |
|---|---|---|---|
| SOHO EIT | EUV observations of the solar atmosphere | ~500,000 1k x 1k images in 4 wavelength bands, 1996-2022 | ~4 TB |
| SOHO MDI | Visible and magnetic field observations of the solar surface | ~100,000 1k x 1k images, 1996-2011 | ~1 TB |
| SDO AIA | EUV observations of the solar atmosphere | >250,000,000 4k x 4k images in 10 wavelength bands every 10 seconds from 2011 to present, to supplement the main data set from SOHO EIT | ~4 PB |
| SDO HMI | Visible and magnetic field observations of the solar surface | >22,000,000 4k x 4k images every 45 seconds from 2011 to present, to supplement the main data set from SOHO MDI | ~350 TB |

2. Data Preparation

The next and often most time-consuming stage in deep learning. Data cleaning, or wrangling as it is sometimes known, involves a number of tasks aimed at creating a dataset suitable for the training phase. Common steps are the removal of bad or otherwise malformed data, pre-processing, and labelling. This is also often the stage where the data is split into train, test and evaluation sets.
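The three-way split described above can be sketched as follows. This is only an illustration, not the ARCAFF pipeline: it assumes a chronological split (a common choice for time-ordered data, so that neighbouring observations do not end up on both sides of a boundary), and the record layout, the boundary dates and the set names `train`, `val` and `test` are all hypothetical.

```python
from datetime import date


def chronological_split(records, val_start, test_start):
    """Split time-stamped records into three sets by date.

    records: iterable of (date, payload) tuples.
    Records before val_start go to the first set, records from
    test_start onwards go to the last set, the rest to the middle set.
    """
    train, val, test = [], [], []
    for when, payload in records:
        if when < val_start:
            train.append((when, payload))
        elif when < test_start:
            val.append((when, payload))
        else:
            test.append((when, payload))
    return train, val, test


# Hypothetical records: one placeholder observation per year, 1996-2022.
records = [(date(year, 6, 1), f"obs-{year}") for year in range(1996, 2023)]

# Hypothetical boundaries, chosen only for illustration.
train, val, test = chronological_split(
    records, val_start=date(2015, 1, 1), test_start=date(2019, 1, 1)
)
print(len(train), len(val), len(test))
```

Splitting by time rather than at random keeps each set contiguous, which makes it easier to reason about whether information from the held-out period has leaked into training.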

A key challenge for ARCAFF is the merging of GOES solar flare list information with source active region information. In addition, significant pre-processing is necessary to bring the image data to a level where it can be analysed. These scientific preparation or calibration processes usually consist of tasks such as flat fielding, hot pixel flagging, calibration, and point-spread-function and optical distortion corrections. For ARCAFF, data alignment and scaling are also of particular importance. ARCAFF will make extensive use of open-source scientific software such as SunPy (The SunPy Community et al 2020) and AIApy (Barnes et al 2020) to develop the data preparation pipelines for each objective.
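As a rough illustration of the merging problem, the sketch below attaches flare events to active region records by matching on date and region number. Everything in it is an assumption made for the example: the field names (`date`, `noaa`, `mcintosh`, `goes_class`), the matching key, and the records themselves, which are invented and do not correspond to real events. Real flare lists often lack a source region for some events, which is why unmatched flares are returned separately.

```python
from collections import defaultdict
from datetime import date

# Invented records for illustration only; real lists carry many more fields.
regions = [
    {"date": date(2020, 1, 1), "noaa": 90001, "mcintosh": "Fkc"},
    {"date": date(2020, 1, 1), "noaa": 90002, "mcintosh": "Cso"},
]
flares = [
    {"date": date(2020, 1, 1), "noaa": 90001, "goes_class": "X1.0"},
    {"date": date(2020, 1, 1), "noaa": 90001, "goes_class": "M2.0"},
    {"date": date(2020, 1, 1), "noaa": None, "goes_class": "C3.0"},
]


def attach_flares(regions, flares):
    """Attach flare classes to region records keyed on (date, noaa).

    Returns (merged, unmatched): merged is a copy of each region with a
    "flares" list; unmatched holds flares with no matching region record.
    """
    region_keys = {(r["date"], r["noaa"]) for r in regions}
    by_key = defaultdict(list)
    unmatched = []
    for flare in flares:
        key = (flare["date"], flare["noaa"])
        if key in region_keys:
            by_key[key].append(flare["goes_class"])
        else:
            unmatched.append(flare)
    merged = [
        {**r, "flares": by_key.get((r["date"], r["noaa"]), [])}
        for r in regions
    ]
    return merged, unmatched


merged, unmatched = attach_flares(regions, flares)
for row in merged:
    print(row["noaa"], row["mcintosh"], row["flares"])
print("unmatched:", len(unmatched))
```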

The Data Acquisition and Data Preparation stages are led by DIAS with support from UNIGE and UoW under WP2 “Solar Sources”.

3. Model Development and Training

The fundamental stage of any DL project, where models are developed or adapted to address the given problem. Model development covers every aspect of the model, for example the network architecture, activation functions, gradient descent algorithms, loss function, and metrics. This step is also where the training pipeline is created to train the model on the data. Data augmentation is often performed during training to increase the diversity of the training data and so reduce overfitting and other negative effects.
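A minimal sketch of data augmentation, assuming images are held as plain 2D lists and that random flips and quarter-turn rotations are acceptable transformations. Whether such transformations are physically appropriate for solar images is not addressed here, and the function names are hypothetical.

```python
import random


def hflip(img):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in img]


def rot90(img):
    """Rotate a 2D image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img, rng):
    """Apply a random flip followed by 0 to 3 random quarter turns."""
    if rng.random() < 0.5:
        img = hflip(img)
    for _ in range(rng.randrange(4)):
        img = rot90(img)
    return img


rng = random.Random(0)  # seeded so that runs are repeatable
image = [[1, 2], [3, 4]]
augmented = augment(image, rng)
print(augmented)
```

Each call produces a different view of the same underlying image, so the network sees more varied inputs without any new data being collected.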

ARCAFF will make extensive use of open-source machine learning libraries such as TensorFlow and PyTorch, and higher-level libraries like Keras and PyTorch Lightning, to reduce the development effort. ARCAFF will also build upon the extensive existing DL research into similar tasks, for example object classification, object localisation and classification, audio classification, and speech-to-text applications.

| Objectives | Relevant DL Models | Comment |
|---|---|---|
| AR Classification | AlexNet, VGG, GoogLeNet/Inception, ResNet | Replace input with solar image cutouts and outputs with AR classifications |
| AR Localisation and Classification | R-CNN, Fast R-CNN and Faster R-CNN, YOLOv1-v3 | Replace input with solar images and outputs with AR classifications and bounding boxes |
| Point-in-time flare forecast using full-disk magnetograms | R-CNN, Fast R-CNN and Faster R-CNN, YOLOv1-v3 | Modify output layers and activations to model C, M and X flare probabilities |
| Point-in-time flare forecast using full-disk multimodal observations | R-CNN, Fast R-CNN and Faster R-CNN, YOLOv1-v3 | Modify output layers and activations to model C, M and X flare probabilities |
| Time series flare forecasts using full-disk multimodal observations | DeepSpeech2, LAS | Add additional feature encoder to create featuregrams and modify output layers and activations to output a sequence of flare probabilities |
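To make "modify output layers and activations to model C, M and X flare probabilities" concrete, the sketch below shows one possible output head: a linear layer with three outputs, each passed through its own sigmoid. The choice of independent sigmoids is an assumption made for illustration and is not taken from the project description; the feature vector and weights are made-up numbers standing in for the output of a backbone network and for trained parameters.

```python
import math

FLARE_CLASSES = ("C", "M", "X")


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def flare_head(features, weights, biases):
    """Map a feature vector to one probability per flare class.

    weights has one row per class. Each output has its own sigmoid, so
    the three probabilities are independent and need not sum to 1.
    """
    probs = {}
    for name, row, b in zip(FLARE_CLASSES, weights, biases):
        logit = sum(w * x for w, x in zip(row, features)) + b
        probs[name] = sigmoid(logit)
    return probs


# Made-up numbers for illustration only.
features = [0.5, -1.0, 2.0]
weights = [[1.0, 0.0, 0.5], [0.2, 0.1, 0.0], [-0.5, 0.5, -1.0]]
biases = [0.0, 0.0, 0.0]

probs = flare_head(features, weights, biases)
print(probs)
```

A softmax over the three outputs would instead force them to sum to 1, which suits mutually exclusive labels; which formulation fits a given objective depends on how the flare classes are defined for that task.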

4. Evaluation and Tuning

After a model is trained, its performance needs to be evaluated to judge whether it meets the required performance metrics. Depending on the task, different metrics or combinations of metrics may be evaluated; those used will align with current international flare forecasting benchmarks based on meteorological standards (Leka et al, 2019).
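As an example of what such an evaluation can look like, the sketch below computes two metrics that are widely used for forecast verification: the True Skill Statistic, TSS = TP/(TP+FN) - FP/(FP+TN), for yes/no forecasts, and the Brier Skill Score, BSS = 1 - BS/BS_ref, for probabilistic forecasts, with the sample event rate (climatology) as the reference. The specific metrics ARCAFF will adopt are not stated in the text, so this selection is an assumption, and the labels and probabilities are made up.

```python
def true_skill_statistic(y_true, y_pred):
    """TSS = TP/(TP+FN) - FP/(FP+TN) for binary labels (1 = flare)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn) - fp / (fp + tn)


def brier_score(y_true, probs):
    """Mean squared difference between forecast probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def brier_skill_score(y_true, probs):
    """1 - BS/BS_ref, using the sample event rate as reference forecast."""
    rate = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [rate] * len(y_true))
    return 1.0 - brier_score(y_true, probs) / reference


# Made-up outcomes (1 = flare occurred) and forecast probabilities.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
probs = [0.9, 0.2, 0.1, 0.4, 0.3, 0.6, 0.1, 0.8]
y_pred = [1 if p >= 0.5 else 0 for p in probs]  # hypothetical 0.5 threshold

tss = true_skill_statistic(y_true, y_pred)
bs = brier_score(y_true, probs)
bss = brier_skill_score(y_true, probs)
print(round(tss, 3), round(bs, 3), round(bss, 3))
```

TSS does not depend on the ratio of events to non-events, which matters for rare events such as large flares, while BSS rewards well-calibrated probabilities relative to simply forecasting the average rate.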

The other part of this stage is hyperparameter tuning. Hyperparameters are parameters that are not updated during training via gradient descent, for example the learning rate or batch size. As ARCAFF will be adapting existing networks, the hyperparameter space will be drastically reduced, since many hyperparameters will have suggested ranges or be fixed by design, but some will still need to be tuned. ARCAFF will make use of open-source software to monitor training (TensorBoardX) and to perform hyperparameter searches efficiently (Ray Tune) to find optimal values, both powered by the scalable compute platform.
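The idea behind a hyperparameter search can be shown with a plain random search written in standard-library Python. This is a stand-in for illustration only and does not use or reproduce the Ray Tune API; the search space, the `validation_loss` function (which replaces a full training run with a made-up formula) and all names are hypothetical.

```python
import math
import random

BATCH_SIZES = [16, 32, 64]


def sample_config(rng):
    """Draw one configuration from a hypothetical search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -2),  # log-uniform sample
        "batch_size": rng.choice(BATCH_SIZES),
    }


def validation_loss(config):
    """Stand-in for a training run; returns a made-up loss value."""
    lr_term = (math.log10(config["learning_rate"]) + 3.5) ** 2
    return lr_term + 1.0 / config["batch_size"]


def random_search(n_trials, seed=0):
    """Sample n_trials configurations and return the one with lowest loss."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(n_trials)]
    return min(trials, key=validation_loss)


best = random_search(n_trials=50)
print(best, validation_loss(best))
```

In practice each trial is an expensive training run, which is why dedicated tools schedule trials in parallel and stop unpromising ones early.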

The Model Development and Training, and Evaluation and Tuning stages will be led by UNIGE with support from DIAS and UoW under WP3 “Deep Learning”.

5. Deployment

The stage where the trained model is used to perform inference on new, unseen data. Depending on the project, this may be as simple as publishing the code and trained network, as involved as creating a completely reproducible ML environment, or something in between.

In the case of ARCAFF, deployment means the integration of the trained models into backend processing and the publishing of model outputs from new data. The latest SolarMonitor 2.0 backend (recently developed by DIAS) is written in Python and uses distributed task queues similar to those of the ARCAFF compute platform, which will simplify this integration. New RESTful API endpoints will be created for each objective to access historical and near-real-time outputs from the models.
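The shape of such endpoints might look like the sketch below, which resolves a request path to a status code and a JSON body. The paths (`/<objective>/latest`, `/<objective>/history`), the objective names and the stored records are all hypothetical and are not taken from the ARCAFF or SolarMonitor APIs; a real service would sit behind a web framework and a database rather than an in-memory dictionary.

```python
import json

# Hypothetical in-memory store standing in for archived model outputs.
OUTPUTS = {
    "ar-classification": [
        {"time": "2022-01-01T00:00:00Z", "noaa": 90001, "mcintosh": "Cso"},
    ],
    "flare-forecast": [
        {"time": "2022-01-01T00:00:00Z", "C": 0.6, "M": 0.2, "X": 0.05},
    ],
}


def handle_get(path):
    """Resolve a GET path to (status code, JSON body).

    /<objective>/latest  -> most recent stored output
    /<objective>/history -> all stored outputs
    Anything else        -> 404
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2 or parts[0] not in OUTPUTS:
        return 404, json.dumps({"error": "not found"})
    objective, view = parts
    if view == "latest":
        return 200, json.dumps(OUTPUTS[objective][-1])
    if view == "history":
        return 200, json.dumps(OUTPUTS[objective])
    return 404, json.dumps({"error": "not found"})


status, body = handle_get("/flare-forecast/latest")
print(status, body)
```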

6. Operations

The last and longest stage of the DL life cycle. The operational stage is focused on monitoring and ensuring the performance and availability of the DL solution. A key aspect of this is ensuring that the expected performance of the system does not degrade on new data. For example, for ARCAFF, new AR classifications would be compared with those produced by humans to make sure the accuracy of the DL classifications does not decrease and, if it does, to trigger action to analyse and correct the degradation. Other important aspects of the operational stage are ensuring the availability of the services, monitoring input data for consistency, monitoring the correct operation of the DL model, and alerting operators to detected issues.
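One simple way to watch for this kind of degradation is a rolling agreement rate between model output and reference labels, with an alert when it falls below a threshold. The sketch below assumes that scheme; the class name, window size, threshold and example labels are all hypothetical.

```python
from collections import deque


class AgreementMonitor:
    """Track rolling agreement between model and reference labels."""

    def __init__(self, window, threshold):
        self.results = deque(maxlen=window)  # keeps only the latest results
        self.threshold = threshold

    def agreement(self):
        """Fraction of comparisons in the window where labels matched."""
        return sum(self.results) / len(self.results)

    def update(self, predicted, reference):
        """Record one comparison; return True if an alert should fire.

        No alert is raised until the window is full, to avoid reacting
        to a handful of early samples.
        """
        self.results.append(predicted == reference)
        full = len(self.results) == self.results.maxlen
        return full and self.agreement() < self.threshold


monitor = AgreementMonitor(window=4, threshold=0.75)

# Made-up (model label, reference label) pairs.
pairs = [
    ("Cso", "Cso"), ("Dai", "Dai"), ("Fkc", "Fkc"),
    ("Cso", "Cso"), ("Dai", "Cso"), ("Eki", "Dai"),
]
alerts = [monitor.update(pred, ref) for pred, ref in pairs]
print(alerts)
```

Here the alert fires only on the last pair, once two of the four most recent classifications disagree with the reference.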

The deployment and operational stages will be led by UoW with support from SZTAKI, DIAS and UNIGE under WP4 “Computing Platform”.


The team has not identified any ethical issues surrounding the use of AI in this project, including ethical issues related to human rights and values. AI-based algorithms are used for a very restricted and clearly defined purpose: the system uses images of the Sun and solar flare event properties for both training and inference tasks. Therefore there cannot be any potential discrimination against humans as a result of the system, and the system will not have any potential to lead to negative social or environmental impacts.