Reconstructing astronomical images with machine learning – Research Outreach

Much of what we know about how our universe works has been learnt by analysing the astronomical signals captured from the sky. However, these signals will inevitably have some noise associated with them – so how can astronomers be sure that their observations of strange, unexpected signals reflect reality? Edward Higson at the University of Cambridge and his colleagues have made progress in dealing with this issue, by constructing machine learning algorithms which can process noisy astronomical images to reconstruct clear ones. Using Bayesian statistics, their software provides astronomers with useful tools for processing their observations.

To gain knowledge about the universe, astronomers use a wide variety of telescopes to gather electromagnetic radiation from across the sky. As telescopes improve, and researchers observe the sky in ever-increasing detail, intriguing observations which have yet to be explained by theoretical physicists continue to roll in. However, obstacles ranging from dust clouds to imperfections in telescopes mean that no observation can be perfect; inevitably astronomical signals will have some degree of noise associated with them. Because of the complex nature of these signals, astronomers struggle to work out what their more unexpected signals should look like without any noise. In their research, Dr Higson and his colleagues have devised a cutting-edge solution to the issue involving image processing based on machine learning.

A little help from Bayes
When researchers need to make predictions about the likelihood of certain events occurring, they have a wide variety of statistical techniques to choose from. Many popular methods use Bayesian Statistics – an approach named after the English statistician Thomas Bayes in the 18th century, who considered how our confidence that a proposition is true changes as more information is gathered about it through experiments. Bayes modelled this statistical behaviour with an equation now known as ‘Bayes’ theorem’. Centuries later, Dr Higson and his colleagues are using his mathematics as a basis for neural networks which can successfully reconstruct noisy astronomical images.

“My research is focused on methods for using data to make inferences about the universe, including in cases where we do not have a good theoretical model for the signal we are observing,” Dr Higson explains. “I use Bayesian statistics, which defines probability as a ‘degree of belief’ in a proposition and can analyse non-repeatable events where our uncertainty is due to limited information.”

Using these methods to analyse inputs as complex as mysterious astronomical images is, understandably, a tall order. However, statistical techniques have advanced significantly since Bayes’ time, and are now allowing Dr Higson’s team to reliably reconstruct these images, based on the computing technique of ‘nested sampling’.

Applying nested sampling
Nested sampling is an increasingly popular computational technique for performing Bayesian statistics calculations. The approach was invented by Cambridge astrophysicist John Skilling in 2004 and is capable of handling challenging data sets and high-dimensional probability distributions.

Dr Higson is particularly interested in using nested sampling to reconstruct noisy astronomical images. The nested sampling software his team has developed can be used to analyse these images using Bayes’ mathematics. This allows a neural network to predict what a clear image should look like by trying to find a simple (‘sparse’) mathematical description of the image.

“We implement ‘Bayesian Sparse Reconstruction’ using nested sampling – a popular method for Bayesian computation which is widely used in astrophysics,” Dr Higson continues. “Much of our research has focused on improving this technique to make it better able to handle Bayesian Sparse Reconstruction problems, and on understanding the uncertainties in these calculations.” To get to this point, Dr Higson and his colleagues have carried out extensive research into how nested sampling can be used for Bayesian statistics calculations while taking full advantage of the current capabilities of modern computers.

I use Bayesian statistics, which … can analyse non-repeatable events where our uncertainty is due to limited information.

Improving nested sampling efficiency
Despite its usefulness, the rapid, repetitive calculations demanded by nested sampling-based machine learning can make the technique computationally intensive. In earlier research, Dr Higson’s team combated the issue by constructing algorithms which greatly reduced the computing power required to run nested sampling software. “‘Dynamic nested sampling’ is a generalisation of the nested sampling algorithm which can produce large improvements in computational efficiency compared to standard nested sampling,” explains Dr Higson.

Using the updated algorithms, the researchers wrote software which could efficiently carry out Bayesian statistical analysis on large sets of data. As Dr Higson explains, dynamic nested sampling promises to become a powerful tool for processing the stream of data and images which modern telescopes produce. “Dynamic nested sampling has now been applied to a variety of problems in astrophysics,” he continues. “These increases in computational efficiency combined with advances in nested sampling software have the potential to allow the Bayesian sparse reconstruction framework to be applied to larger and more complex data sets in the future.”

However, the efficiency of their algorithms wasn’t the only difficulty in the implementation of nested sampling. To ensure that their reconstructed images were reliable, the researchers first needed to find ways to understand errors in the calculations their software performed and estimate how big the possible inaccuracies were.

Checking for errors
Dynamic nested sampling grew from Dr Higson and his colleagues’ efforts to understand the sources of errors in their software then teach it to automatically identify and improve the parts of the calculation which were most likely to contain errors. “We analysed the sampling errors in nested sampling parameter estimation and created a method for estimating them numerically for a single nested sampling calculation,” says Dr Higson. The team then used this estimation technique as a basis for the dynamic nested sampling algorithm, which automatically identifies what their software should do to best improve calculation accuracy.

These efforts also lead to another software package which provides error analysis for nested sampling calculations. “We developed diagnostic tests for detecting when software has not performed the nested sampling algorithm accurately,” Dr Higson continues. “The uncertainty estimates, and diagnostics, are implemented in the ‘nestcheck’ software package.”

By integrating nestcheck into their dynamic nested sampling algorithms, the researchers brought together all of their efforts to improve the efficiency and accuracy of their software. In their latest research, Dr Higson and his colleagues have used their advanced software to demonstrate their Bayesian Sparse Reconstruction techniques.

We implement ‘Bayesian Sparse Reconstruction’ using nested sampling – a popular method for Bayesian computation.

A basis for signal reconstruction
In the 2019 study, Dr Higson’s team first developed methods to reconstruct signals by building up models of them through machine learning which seek to express the signal as simply as possible. The elemental building blocks of these models are called ‘basis functions’ – sets of mathematical functions which can be used to approximate the fundamental properties of astronomical signals. As Dr Higson explains, “we presented a principled Bayesian framework for signal reconstruction, in which the signal is modelled by basis functions whose number and form is determined by the data themselves”.

Using this framework, the team established a principled method for identifying clear signals in noisy data. This first part of the study, therefore, provided them with a basis on which they could test the performance of their algorithms in real situations.

Reconstructing real signals
In the next part of the study, Dr Higson and colleagues performed their technique on noisy images gathered by real telescopes. The software was also used to process images using neural networks – computational systems inspired by the processes which play out in our brains – which in this application can be used in a similar manner to basis functions. In this case, the team’s algorithms used neural networks to perform Bayesian statistics on noisy signals, producing clear images.

“We demonstrate our method for reconstructing noisy one- and two-dimensional signals, including examples of processing astronomical images,” explains Dr Higson. “We also applied the Bayesian sparse reconstruction framework to neural networks, where it allows us to work out the best type of neural network to use for fitting the data set.” Unlike previous algorithms, Dr Higson’s software allows the models to adapt their level of complexity to the complexity of the data.

A new toolkit for astronomers
Thanks to the work of Dr Higson and his colleagues, astronomers have access to a principled approach to using machine learning techniques for processing their images. Their software can help to reconstruct clear images even for the most strange and unexpected of observations, for which theoretical physicists have yet to offer any explanation. As telescopes continue to look at the universe in ever greater levels of detail, the software could be an important tool in ensuring that the reconstruction of increasingly strange and diverse astronomical observations can be truly reliable.

Personal Response

What particular images do you think will yield particularly interesting results after Bayesian sparse reconstruction?

The technique is particularly useful when noise and multiple overlapping sources make it difficult to determine how many astronomical objects are present in the image and what each looks like. A really exciting application is to the most distant photographs ever taken by the Hubble Space Telescope, which show a jumble of galaxies near the edge of the observable universe. Bayesian sparse reconstruction can be used to work out how many separate overlapping galaxies are present in such images, and what shape the individual galaxies are.


Article by channel:

Read more articles tagged: Machine Learning