Teaching Machines to See the Distant Universe

May 1, 2026

With help from thousands of citizen scientists, a new AI model is learning to spot real galaxies hidden in noisy HETDEX data.

Teaching AI to find galaxies in HETDEX data. Dark Energy Explorers volunteers classify small spectral cutouts as likely real galaxy signals or likely artifacts. Those human classifications help train an AI model to recognize faint Lyα-emitting galaxies hidden in noise. Image credit: Shiro Mukae / Erin Mentuch Cooper / OpenAI

Some of the most exciting galaxies in HETDEX start out looking like almost nothing: a faint spot of light in a small patch of noisy data.

But that tiny signal may be a distant Lyα-emitting galaxy whose light has traveled for billions of years before reaching the Hobby–Eberly Telescope.

Finding these galaxies is at the heart of HETDEX. The survey collects spectra without first choosing targets from images, which means it can reveal galaxies we did not already know were there. But it also creates a huge challenge: among hundreds of millions of spectra, not every detection is real.

A new paper led by visiting Research Scientist Shiro Mukae of the University of Texas at Austin tackles this problem with artificial intelligence—and help from real people. The team trained a convolutional neural network to identify likely real galaxy signals and filter out noise, artifacts, and sky-subtraction residuals in HETDEX data.

Citizen scientists helped train the AI

Through our Zooniverse project, Dark Energy Explorers, volunteers inspect HETDEX spectral cutouts and decide whether each signal looks real or likely spurious. “You can even classify galaxies 8–10 billion light-years away on your phone. You can swipe right for a real galaxy and left for noise,” says HETDEX Research Scientist Erin Mentuch Cooper, a co-author on the paper.

Those classifications are incredibly valuable. People are excellent at spotting subtle patterns: whether a signal has the right shape, location, and structure to be believable. In this work, those human judgments helped teach the AI what “real” looks like.

This is not AI replacing people. It is AI learning from people.

Why this matters

At high signal-to-noise ratio (S/N), many galaxy candidates are easy to identify. At lower S/N, the difference between a real galaxy and a false detection can be subtle, even for experts.

That low-S/N regime matters because it contains many potential galaxies. If we can recover more real sources while rejecting false detections, HETDEX can build cleaner, larger galaxy samples for mapping the Universe.

The new model helps HETDEX push deeper into noisy data while keeping contamination under control.

Real people, real discoveries

Dark Energy Explorers volunteers helped create the training set behind this work. Their classifications are now helping an AI model search through a dataset far too large for any one person, or even any one team, to inspect by eye. “I am deeply grateful to the Dark Energy Explorers, whose careful classifications made this work possible and showed how human insight and AI can work together to uncover faint galaxy signals from the distant universe,” says Mukae.

Some galaxies are hidden in noise. Thanks to thousands of citizen scientists, we are getting better at finding them.

Read more and get involved

Read the paper: Mukae et al. 2026, “Enhancing Lyα Emitter Identification in HETDEX with a Convolutional Neural Network”.

And if you’d like to help classify HETDEX sources yourself, you can join Dark Energy Explorers on Zooniverse:
https://www.zooniverse.org/projects/erinmc/dark-energy-explorers

Every classification helps us better understand the Universe.

©2026 McDonald Observatory, The University of Texas at Austin