Introduction to SPHEREx Spectral Images#
1. Learning Goals#
Query for SPHEREx Spectral Image Multi-Extension FITS files (MEFs) that overlap a given coordinate on the sky.
Read in a SPHEREx Spectral Image MEF using Astropy and understand its structure.
Interactively visualize a SPHEREx Spectral Image MEF using Firefly.
Explore each extension. For the Spectral Image extension, this includes understanding the astrometric and spectral WCS systems.
2. SPHEREx Overview#
SPHEREx is a NASA Astrophysics Medium Explorer mission that launched in March 2025. During its planned two-year mission, SPHEREx will obtain 0.75-5 micron spectroscopy over the entire sky, with deeper data in the SPHEREx Deep Fields. SPHEREx data will be used to:
constrain the physics of inflation by measuring its imprints on the three-dimensional large-scale distribution of matter,
trace the history of galactic light production through a deep multi-band measurement of large-scale clustering,
investigate the abundance and composition of water and biogenic ices in the early phases of star and planetary disk formation.
The community will also mine SPHEREx data and combine it with synergistic data sets to address a variety of additional topics in astrophysics.
More information is available in the SPHEREx Explanatory Supplement.
3. Requirements#
The following packages must be installed to run this notebook. Comment out the following lines if they are already installed.
# !pip install numpy matplotlib astropy pyvo firefly-client
4. Imports#
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
from astropy.wcs import WCS
from astropy.table import Table
from astropy import units as u
from astropy.coordinates import SkyCoord
import pyvo
from pyvo.dal.adhoc import DatalinkResults
from firefly_client import FireflyClient
5. Search for SPHEREx Spectral Image MEFs that overlap coordinates you are interested in#
Define some coordinates of interest.
ra_deg = 304.693508808
dec_deg = 42.4436872991
Query IRSA for a list of Spectral Image MEFs that overlap this position.
# Define the TAP service URL for IRSA
tap_url = "https://irsa.ipac.caltech.edu/TAP"
# Connect to the TAP service
service = pyvo.dal.TAPService(tap_url)
# Define your ADQL query
query = "SELECT * FROM spherex.obscore WHERE CONTAINS(POINT('ICRS',"+str(ra_deg)+","+str(dec_deg)+"), s_region)=1"
# Submit the asynchronous query
job = service.submit_job(query)
# Run the job (starts the query execution on the server)
job.run()
# Wait for the job to complete (polling)
job.wait(phases=["COMPLETED", "ERROR", "ABORTED"], timeout=300)
# Capture the results
results = job.fetch_result()
Each row of the results of your query represents a different spectral image. Because SPHEREx data will be released on a weekly basis, the number of rows returned will change depending on when you submit the query. Let’s see how many images are returned today.
len(results)
71
The query results provide a lot of metadata about each spectral image. These columns have standard names as defined by the IVOA. Let’s list them:
list(results.fieldnames)
['obs_id',
's_ra',
's_dec',
'energy_bandpassname',
'em_min',
'em_max',
't_min',
't_max',
't_exptime',
'access_url',
'access_format',
'dataproduct_type',
'calib_level',
'target_name',
'obs_title',
'obs_collection',
'facility_name',
'instrument_name',
'obs_creation_date',
'obs_creator_name',
'obs_release_date',
'obs_publisher_did',
'publisher_id',
'bib_reference',
'data_rights',
's_fov',
's_region',
's_resolution',
's_xel1',
's_xel2',
's_ucd',
's_unit',
's_resolution_min',
's_resolution_max',
's_calib_status',
's_pixel_scale',
't_xel',
't_resolution',
'em_xel',
'em_calib_status',
'em_res_power',
'pol_xel',
'proposal_id',
'energy_emband',
'position_timedependent',
'obs_intent',
'algorithm_name',
'facility_keywords',
'instrument_keywords',
'environment_photometric',
'proposal_project',
'target_type',
'target_standard',
'target_moving',
'target_keywords',
'proposal_pi',
'access_estsize',
'o_ucd',
'ipac_gid',
'ipac_pub_date',
'pt',
'poly']
The ‘access_url’ column is particularly important because it tells you how to access the data. Let’s look at the ‘access_url’ value for the first row:
results['access_url'][0]
'https://irsa.ipac.caltech.edu/datalink/links/spherex?ID=ivo://irsa.ipac/spherex_qr?2025W19_1B_0166_3/D2'
Examining this URL, it does not provide direct access to the SPHEREx spectral image. Rather, it returns a file that lists all the data products and services associated with this spectral image. For the SPHEREx Quick Release products, this includes:
(1) the primary product, a Spectral Image MEF (‘semantics’ column is #this); and
(2) a cutout service (‘semantics’ is #cutout).
Most users will be interested in the primary (#this) product. Here’s how you get the URL to download it for the first row:
datalink_url = results['access_url'][0]
datalink_content = DatalinkResults.from_result_url(datalink_url)
spectral_image_url = next(datalink_content.bysemantics("#this")).access_url
spectral_image_url
'https://irsa.ipac.caltech.edu/ibe/data/spherex/qr/level2/2025W19_1B/l2b-v11-2025-162/2/level2_2025W19_1B_0166_3D2_spx_l2b-v11-2025-162.fits'
You can put this URL into a browser to download the file. Or you can work with it in Python, as shown below.
6. Examine the header of one of the SPHEREx Spectral Image MEFs#
Use Astropy to examine the header of the URL from the previous step.
hdulist = fits.open(spectral_image_url)
hdulist.info()
Filename: /home/runner/.astropy/cache/download/url/d731960493e5d787cd87171288e173ea/contents
No. Name Ver Type Cards Dimensions Format
0 PRIMARY 1 PrimaryHDU 5 ()
1 IMAGE 1 ImageHDU 218 (2040, 2040) float32
2 FLAGS 1 ImageHDU 146 (2040, 2040) int32
3 VARIANCE 1 ImageHDU 119 (2040, 2040) float32
4 ZODI 1 ImageHDU 119 (2040, 2040) float32
5 PSF 1 ImageHDU 509 (101, 101, 121) float32
6 WCS-WAVE 1 BinTableHDU 18 1R x 3C [9J, 9J, 162E]
You can see that the Level 2 Spectral Image files are multi-extension FITS files (MEFs) with the following extensions:
IMAGE: Calibrated fluxes for a detector array in scientific units of MJy/sr.
FLAGS: A bitmap with per-pixel status and processing flags.
VARIANCE: A per-pixel estimate of the variance.
ZODI: An model of the zodiacal dust background signal. Note that this is not subtracted from the IMAGE extension.
PSF: An image cube (3D array) where each plane represents a Point Spread Function (PSF) for a cutout in over-sampled pixel space.
WCS-WAVE: Spectral WCS lookup table that maps pixel coordinates to the central wavelength and bandwidth of each pixel.
We will tour each extension in the sections below.
7. Visualize a SPHEREx Spectral Image MEF using the Firefly Python Client.#
We will use the open-source astronomy data visualization software Firefly visualize SPHEREx data. Firefly has a Python client and it understands the alternative WCS coordinates of SPHEREx, allowing users to see how the wavelength and bandwidth vary with spectral image pixels.
Open a Firefly viewer in a separate browser tab and initialize it.
fc = FireflyClient.make_client(url="https://irsa.ipac.caltech.edu/irsaviewer")
fc.reinit_viewer()
{'success': True}
Visualize a spectral image MEF by sending its URL to the viewer.
fc.show_fits(url=spectral_image_url,
plot_id="spectral_image",
Title="Spectral Image"
)
{'success': True}
Try use the interactive tools in the viewer to explore the data.
8. Explore the first extension: IMAGE#
The first extension of the MEF is the the calibrated surface brightness flux density in units of MJy/sr, stored as a 2040 x 2040 image. No zodiacal light subtraction is applied.
The SPHEREx focal plane is split with a dichroic to three short-wavelength and three long-wavelength detector arrays. Two focal plane assemblies (FPAs) simultaneously image the sky through a dichroic beam splitter. Each FPA contains three 2K x 2K detector arrays placed behind a set of linear variable filters (LVFs), providing narrow-band response with a band center that varies along one axis of the array. SPHEREx obtains spectra through multiple exposures, placing a given source at multiple positions in the field of view, where it is measured at multiple wavelengths by repointing the spacecraft.
Band 1: λ= 0.75 - 1.09 µm; R=39
Band 2: λ= 1.10 - 1.62 µm; R=41
Band 3: λ= 1.63 - 2.41 µm; R=41
Band 4: λ= 2.42 - 3.82 µm; R=35
Band 5: λ= 3.83 - 4.41 µm; R=112
Band 6: λ= 4.42 - 5.00 µm; R=128
Examine the header of the first extension, printing out select keywords.
spectral_image_header = hdulist[1].header
keywords_to_print = ['EXTNAME', 'NAXIS1', 'NAXIS2', 'BUNIT', 'DETECTOR', 'OBSID', 'DATE', 'PSF_FWHM']
for keyword in keywords_to_print:
value = spectral_image_header.get(keyword, 'Keyword not found')
print(f"{keyword}: {value}")
EXTNAME: IMAGE
NAXIS1: 2040
NAXIS2: 2040
BUNIT: MJy / sr
DETECTOR: 2
OBSID: 2025W19_1B_0166_3
DATE: 2025-06-11T21:23:48.435
PSF_FWHM: 4.8428833067869315
We can see that this image was taken with Detector 2, so we can expect the wavelength to vary from 1.10 - 1.62 µm. Try examining the Firefly visualization to confirm this.
Notice that there is more than one WCS!
spectral_image_header['WCSNAME*']
WCSNAMEA= '0-based active pixel coordinates' / Coordinate system title
WCSNAMEW= 'Approximated per-pixel wavelength and bandpass' / WCS label
The main WCS describes the astrometric registration of the image, including optical distortion parameters, based on the FITS pixel convention starting with 1.
There are two alternative WCS systems:
WCSNAMEA describes zero-based pixel coordinates.
WCSNAMEW describes spectral coordinates ‘Wavelength’ and ‘Bandpass’. This WCS contains a reference to the lookup table in the ‘WCS-WAVE’ extension.
8a. Spatial WCS#
Load the Spatial WCS
wcs = WCS(spectral_image_header)
wcs
WCS Keywords
Number of WCS axes: 2
CTYPE : 'RA---TAN-SIP' 'DEC--TAN-SIP'
CRVAL : 303.461869122 42.4452278192
CRPIX : 1020.5 1020.5
PC1_1 PC1_2 : -0.00154336213576 -0.000724559607874
PC2_1 PC2_2 : -0.000722187019271 0.0015448687457
CDELT : 1.0 1.0
NAXIS : 2040 2040
Use standard Astropy methods to resolve the central ra and dec (given by crval1 and crval2 in the header) into image pixel coordinates.
ra, dec = wcs.wcs.crval
x, y = wcs.world_to_pixel(SkyCoord(ra=ra, dec=dec, unit="deg"))
print(ra, dec, x, y)
303.461869122 42.4452278192 1019.5121461655142 1020.606089905646
8b. Spectral WCS#
The use of Linear Variable Filters (LVFs) is a key component of SPHEREx imaging. Each pixel of the detector corresponds to a slightly different wavelength and bandwidth due to the LVF’s gradual variation in spectral transmission across its surface. Contours of constant wavelength are curved due to the method of filter fabrication, so the wavelength-vs-pixel function is inherently two-dimensional.
A compact, approximate representation of the wavelength and bandwidth per pixel is included in each image using the WAVE-TAB lookup-table mechanism defined in the FITS standard.
Below we illustrate how to use Spectral WCS to find an approximate wavelength at each pixel.
# Load the Spectral WCS from the header.
# Note that we need to provide a reference to HDU List, which contains a lookup table.
spectral_wcs = WCS(header=hdulist["IMAGE"].header, fobj=hdulist, key="W")
INFO:
Inconsistent SIP distortion information is present in the FITS header and the WCS object:
SIP coefficients were detected, but CTYPE is missing a "-SIP" suffix.
astropy.wcs is using the SIP distortion coefficients,
therefore the coordinates calculated here might be incorrect.
If you do not want to apply the SIP distortion coefficients,
please remove the SIP coefficients from the FITS header or the
WCS object. As an example, if the image is already distortion-corrected
(e.g., drizzled) then distortion components should not apply and the SIP
coefficients should be removed.
While the SIP distortion coefficients are being applied here, if that was indeed the intent,
for consistency please append "-SIP" to the CTYPE in the FITS header or the WCS object.
[astropy.wcs.wcs]
Note: The previous line triggers an Astropy INFO printout, which implies that the SIP distortion coefficients from the main WCS are preserved in the alternative WCS. This is because the SIP convention, not formally part of the FITS standard, is ambiguous as to whether it is meant to apply to ‘alternative’ (lettered) WCSes in addition to the primary WCS. See astropy/astropy#13105.)
The wavelength per pixel is a property of the detector-filter combination and is independent of optical distortion in the telescope, and is modeled accordingly in WCS ‘W’, so we turn the SIP distortion off for this WCS.
# SIP distortions must be turned off for the spectral WCS.
spectral_wcs.sip = None
# The standard Astropy methods for converting pixel coordinates to world coordinates can also be used to obtain spectral coordinates.
# Take the pixel coordinates that we determined for the image center and resolve them to the wavelength and bandpass for that pixel
wl, bp = spectral_wcs.pixel_to_world(x, y)
wl, bp
(<SpectralCoord 1.33597036 um>, <Quantity 0.03226981 um>)
8c. How does wavelength vary across the detector?#
# Read in the spectral image data.
spectral_image = hdulist["IMAGE"]
spectral_image_data = spectral_image.data
# Get arrays of pixel coordinates for the image.
(y, x) = np.indices(spectral_image.shape)
# Use the spectral WCS to convert these pixel coordinates to spectral coordinates.
spectral_coords = spectral_wcs.pixel_to_world(x, y)
# Break out the two spectral coordinates (wavelength and bandpass) from the spectral coordinates, and print the results.
wavelength, bandpass = spectral_coords
print("Wavelength: \n", wavelength)
print("Bandpass: \n", bandpass)
Wavelength:
[[1.66610098 1.66607023 1.66603947 ... 1.66646121 1.66649239 1.66652358]
[1.66577058 1.66573983 1.66570908 ... 1.6661311 1.66616228 1.66619346]
[1.66544017 1.66540943 1.66537868 ... 1.66580098 1.66583216 1.66586334]
...
[1.10041056 1.10038529 1.10036002 ... 1.10077924 1.10080494 1.10083064]
[1.10018568 1.10016041 1.10013514 ... 1.10055408 1.10057977 1.10060547]
[1.0999608 1.09993554 1.09991027 ... 1.10032891 1.10035461 1.1003803 ]] um
Bandpass:
[[0.04024398 0.04024324 0.0402425 ... 0.04025268 0.04025344 0.04025419]
[0.040236 0.04023526 0.04023452 ... 0.04024471 0.04024546 0.04024622]
[0.04022802 0.04022728 0.04022654 ... 0.04023674 0.04023749 0.04023824]
...
[0.02657996 0.02657935 0.02657874 ... 0.02658887 0.02658949 0.02659011]
[0.02657453 0.02657392 0.02657331 ... 0.02658343 0.02658405 0.02658467]
[0.0265691 0.02656849 0.02656788 ... 0.02657799 0.02657861 0.02657923]] um
# Plot the wavelength as a greyscale color map across the image pixels.
plt.imshow(wavelength.value, origin='lower', cmap='gray')
plt.colorbar(label=f"Wavelength [{wavelength.unit}]")
detector = hdulist["IMAGE"].header["DETECTOR"]
plt.title(f"Wavelength - Detector {detector}")
plt.show()

As expected, the wavelengths range from approximately 1.1=16 micron for Detector 2. You can see that the longest wavelengths are at the bottom and the shortest wavelengths are at the top. You can verify this is the case in the Firefly visualization, as well.
8d. Number of flagged pixels in this image#
Now let’s take a look at some header keywords that provide information about how many pixels have been flagged during processing.
spectral_image_header['L2 N_*']
HIERARCH L2 N_NONLINEAR = 0 / number of pixels flagged nonlinear
HIERARCH L2 N_PERSIST = 36875 / number of pixels flagged persist
HIERARCH L2 N_HOT = 3 / number of pixels flagged hot
HIERARCH L2 N_COLD = 767 / number of pixels flagged cold
HIERARCH L2 N_SOURCE = 4128273 / number of source flagged pixels
HIERARCH L2 N_OUTLIER = 5960 / number of flagged outlier pixels
HIERARCH L2 N_TRANSIENT = 79275 / number of flagged transient pixels
HIERARCH L2 N_OVERFLOW = 4115 / number of flagged overflow pixels
HIERARCH L2 N_SUR_ERROR = 24699 / number of flagged sur error pixels
HIERARCH L2 N_NONFUNC = 1789 / number of flagged nonfunctional pixels
HIERARCH L2 N_DICHROIC = 0 / number of flagged pixels due to low efficiency of d
HIERARCH L2 N_MISSING = 0 / number of flagged pixels with missing data
HIERARCH L2 N_FULLSAMPLE = 240 / number of flagged full-sample pixels
HIERARCH L2 N_PHANMISS = 0 / number of flagged phanmiss pixels
There are 14 flags in total. Typically, most of the pixels are identified as SOURCE pixels, which are pixels mapped to a known source. The remaining flags are described in Table 8 of the SPHEREx Explanatory Supplement.
9. Explore the second extension: FLAGS#
The second extension (FLAGS) provides a bitmap of per-pixel status and processing flags, stored as a 2040 x 2040 image.
Let’s take a look at the header of the FLAGS extension, and print out some header keywords of interest.
flags_header = hdulist[2].header
keywords_to_print = ['EXTNAME', 'NAXIS1', 'NAXIS2', 'BUNIT']
for keyword in keywords_to_print:
value = flags_header.get(keyword, 'Keyword not found')
print(f"{keyword}: {value}")
EXTNAME: FLAGS
NAXIS1: 2040
NAXIS2: 2040
BUNIT:
Conveniently, the definitions of the flags are also provided in the FLAGS header.
flags_header['MP*']
HIERARCH MP_TRANSIENT = 0 / Transient detected during SUR
HIERARCH MP_OVERFLOW = 1 / Overflow detected during SUR
HIERARCH MP_SUR_ERROR = 2 / Error in onboard processing
HIERARCH MP_PHANTOM = 4 / Phantom pixel
HIERARCH MP_REFERENCE = 5 / Reference pixel
HIERARCH MP_NONFUNC = 6 / Permanently unusable
HIERARCH MP_DICHROIC = 7 / Low efficiency due to dichroic
HIERARCH MP_MISSING_DATA = 9 / Onboard data lost
MP_HOT = 10 / Hot pixel
MP_COLD = 11 / Anomalously low signal
HIERARCH MP_FULLSAMPLE = 12 / Pixel full sample history is available
HIERARCH MP_PHANMISS = 14 / Phantom correction was not applied
HIERARCH MP_NONLINEAR = 15 / Pixel for which a reliable nonlinearity correction
HIERARCH MP_PERSIST = 17 / Persistent charge above threshold
HIERARCH MP_OUTLIER = 19 / Pixel flagged by Detect Outliers
HIERARCH MP_SOURCE = 21 / Pixel mapped to a known source
Tip: In the Firefly visualization, if you are looking at the first extension (IMAGE), you can open up the layers icon (top right) to enable overlaying a visualization of the flags.
10. Explore the third extension: VARIANCE#
The third extension of the MEF is the variance of the calibrated surface brightness flux in units of (MJy/sr)^2, stored as a 2,040 x 2,040 image. Let’s look at some specific header keywords:
variance_header = hdulist[3].header
keywords_to_print = ['EXTNAME', 'NAXIS1', 'NAXIS2', 'BUNIT']
for keyword in keywords_to_print:
value = variance_header.get(keyword, 'Keyword not found')
print(f"{keyword}: {value}")
EXTNAME: VARIANCE
NAXIS1: 2040
NAXIS2: 2040
BUNIT: MJy2 / sr2
11. Explore the fourth extension: ZODI#
The fourth extension of the MEF is the modeled zodiacal light background flux in units of MJy/sr, stored as a 2040 x 2040 image. This has not been subtracted from the IMAGE extension. Let’s examine some header keywords:
zodi_header = hdulist[4].header
keywords_to_print = ['EXTNAME', 'NAXIS1', 'NAXIS2', 'BUNIT']
for keyword in keywords_to_print:
value = zodi_header.get(keyword, 'Keyword not found')
print(f"{keyword}: {value}")
EXTNAME: ZODI
NAXIS1: 2040
NAXIS2: 2040
BUNIT: MJy / sr
12. Explore the fifth extension: PSF#
The fifth extension of the MEF contains 121 Point-spread functions (PSFs); each PSF is represented as a 101 x 101 image and all 121 are assembled together into a cube. Each of the 121 layers represents a “super-resolution” PSF estimate in a different region (defined by an 11x11 grid) of the detector. Each PSF is a two-dimensional array with size of 101 × 101 pixels. The PSFs are oversampled such that 10 PSF pixels cover the same spatial extent as one spectral image pixel (0.615 arcsec). Let’s look at some specific header keywords for the PSF:
psf_header = hdulist[5].header
keywords_to_print = ['EXTNAME', 'NAXIS1', 'NAXIS2', 'NAXIS3', 'BUNIT']
for keyword in keywords_to_print:
value = psf_header.get(keyword, 'Keyword not found')
print(f"{keyword}: {value}")
EXTNAME: PSF
NAXIS1: 101
NAXIS2: 101
NAXIS3: 121
BUNIT:
13. Explore the sixth extension: WCS-WAVE#
The sixth extension is a FITS-compliant spectral World Coordinate System (WCS) lookup table that maps spectral image pixel coordinates to central wavelengths and bandwidths. The lookup table consists of 1 row with 3 columns (X, Y, VALUES). X and Y are each arrays defining a grid of control points in spectral image pixel space. For each (X, Y) control point, VALUES defines a two-element array containing the central wavelength and the corresponding bandwidth. Originally adopted to support the unique nature of the SPHEREx LVF filters, this rarely-used part of the FITS standard has yet to be implemented by all readers. The Firefly client we use in this notebook does correctly interpret this lookup table.
Let’s look at the header of the WCS-WAVE extension:
wcs_wave_header = hdulist[6].header
wcs_wave_header
XTENSION= 'BINTABLE' / binary table extension
BITPIX = 8 / array data type
NAXIS = 2 / number of array dimensions
NAXIS1 = 720 / length of dimension 1
NAXIS2 = 1 / length of dimension 2
PCOUNT = 0 / number of group parameters
GCOUNT = 1 / number of groups
TFIELDS = 3 / number of table fields
EXTNAME = 'WCS-WAVE'
TTYPE1 = 'X '
TFORM1 = '9J '
TDIM1 = '(9) '
TTYPE2 = 'Y '
TFORM2 = '9J '
TDIM2 = '(9) '
TTYPE3 = 'VALUES '
TFORM3 = '162E '
TDIM3 = '(2,9,9) '
Unlike the other extensions, this is a binary table. Let’s read it iinto an Astropy table.
wcs_wave_table = Table(hdulist[6].data)
print("Number of rows:", len(wcs_wave_table))
print("Column names:", wcs_wave_table.colnames)
Number of rows: 1
Column names: ['X', 'Y', 'VALUES']
This table consists of just 1 row with three columns. Let’s inspect the columns:
x_array = wcs_wave_table["X"][0]
y_array = wcs_wave_table["Y"][0]
wavelengths = wcs_wave_table["VALUES"][0]
print("X:", x_array)
print("Y:",y_array)
print("Dimensions of VALUES array:", wavelengths.shape)
X: [ 1 256 511 766 1021 1276 1531 1786 2040]
Y: [ 1 256 511 766 1021 1276 1531 1786 2040]
Dimensions of VALUES array: (9, 9, 2)
Acknowledgements#
About this notebook#
Authors: IPAC Science Platform Team, including Jessica Krick, Troy Raen, Brigitta Sipőcz, Jaladh Singhal, Andreas Faisst, Shoubaneh Hemmati, Vandana Desai
Contact: IRSA Helpdesk with questions or problems.
Updated: June 2025
Runtime: approximately 30 seconds