Searching for Stellar Spectra

In this notebook we will search the CFHT ESPaDOnS archive for B-type stars in the Bright Star Catalogue with vsini values less than 20 km/s. A plot of the intensity spectrum from 6525Ã… to 6700Ã… will then be provided for each object found in the archive. To fetch and query data, we will use the astroquery package, particularly the CADC and Vizier modules.


To launch this notebook interactively, click the Binder button. Binder

Warning: Binder instances have limited computational resources. A notebook will shutdown after 10 minutes of inactivity and has a maximum of 2 GB of RAM. Additionally, each repository has a maximum of 100 simultaneous users.


Table of Contents

1. Introduction

The Bright Star Catalog (BSC) is an open data collection with basic astronomical information of bright stars.

2. Setup

This tutorial will go through some of the basic functionalities of the CADC module in the astroquery package. The module can be installed in two ways:

2.1 Using pip

The CADC module is only available with the 'bleeding edge' master version of the astroquery module, and can be installed using the command:

    pip install https://github.com/astropy/astroquery/archive/master.zip

2.2 From source

Alternatively, you can clone and install from the source:

    # If you have a github account:
    git clone git@github.com:astropy/astroquery.git
    # If you do not:
    git clone https://github.com/astropy/astroquery.git
    cd astroquery
    python setup.py install

Note that these commands can also be done in a Jupyter notebook by either declaring the code cell a bash cell by pasting %%bash at the top of the cell, or preceding each line with a !. More information about astroquery can be found at the astroquery github repository.

3. Querying

In order to get the stars that we want to look at and the spectra of those stars, we will have to do two queries: one of the Bright Star Catalog and the other of the CADC database.

3.1 Querying the VizieR Catalog

In order to query the BSC, we will use the VizieR database which is a library of published astronomical catalogs. Through VizieR, we can access and query the 5th Revised Ed. of the BSC. In this tutorial, we want stars of spectral type 'B' with rotation velocity less than 20 km/s.

In [1]:
from astroquery.vizier import Vizier

# Get the catalog name of the 5th Revised Ed. of the Bright Star Catalog
catalog_list = Vizier.find_catalogs('Bright Star Catalogue, 5th Revised Ed.')
print({k: v.description for k, v in catalog_list.items()})
catalog_key = list(catalog_list.keys())[0]
catalog = Vizier.get_catalogs(catalog_key)
catalog_name = catalog.keys()[0]

# Increase the row limit and query the catalog
Vizier.ROW_LIMIT = 200
result = Vizier.query_constraints(catalog=catalog_name,
                                  SpType='B*',
                                  RotVel='<20')
bright_star_results = result[catalog_name]
bright_star_results
{'V/50': 'Bright Star Catalogue, 5th Revised Ed. (Hoffleit+, 1991)'}
Out[1]:
Table masked=True length=85
HRNameHDADSVarIDRAJ2000DEJ2000VmagB-VSpTypeNoteFlag
"h:m:s""d:m:s"magmag
int16bytes10int32bytes5bytes9bytes10bytes9float32float32bytes20bytes1
3988Gam Peg886Gam Peg00 13 14.2+15 11 012.83-0.23B2IV*
62127900 17 09.1+47 56 515.89-0.09B7III
15317Zet Cas336022500 36 58.3+53 53 493.66-0.20B2IV*
20823 Cas438200 47 46.1+74 50 515.41-0.08B8III*
280Alp Scl573735900 58 36.4-29 21 274.31-0.16B7IIIp*
4659996GY And01 38 31.7+45 24 006.360.04B9pCrEu*
54245Eps Cas1141565201 54 23.7+63 40 123.38-0.15B3III*
5621190501 57 56.4+41 41 406.78-0.06B8III*
6771427202 19 37.3+39 50 066.63-0.09B8V
.................................
787819642620 37 18.3+00 05 496.22-0.08B8IIIp
809420143314682V389 Cyg21 08 38.9+30 12 215.59-0.10B9VpSi*
82262047541374521 28 52.7+55 25 076.120.14B8III
8349207857V1619 Cyg21 51 04.9+39 32 126.17-0.08B9pHgMn*
840421 Peg20945922 03 19.0+11 23 115.80-0.07B9.5V
845238 Aqr21042422 10 37.5-11 33 545.46-0.12B7III
8663Xi Oct21557322 50 22.9-80 07 275.35-0.15B6IV
870474 Aqr216494HI Aqr22 53 28.7-11 37 005.80-0.08B9III*
87062165381434622 53 11.3+40 10 026.34-0.08B7III-IV
887321992723 19 27.4+34 47 366.32-0.08B8III

3.2 Defining Search Parameters

We want to search the CADC data for products that:

  • Have energy bounds that overlap a wavelength range of 6675Ã… to 6682Ã… (so the cutouts we do later will provide profiles of the He I 6678Ã… line)
  • Have a release date before today, so we can access the data
  • Overlap the RA and Dec of the stars in our results table above within a radius of 0.01 degrees
  • Come from the ESPaDOnS spectropolarimeter at the Canada-France-Hawaii Telescope (CFHT)
  • Have a processed intensity spectrum available (meaning the archive product ID should match the pattern *i)
In [2]:
import datetime
from astropy.coordinates import SkyCoord
from astropy import units as u

# Define energy bounds in Angstrom and convert to metres
def ang_to_m(x):
    return x * 1e-10


energy_bounds_ang = (6675, 6682)
energy_bounds_m = (ang_to_m(energy_bounds_ang[0]),
                   ang_to_m(energy_bounds_ang[1]))

# Grab todays date and time
today = datetime.datetime.now().strftime("%Y-%m-%d %X")

# Get RA, Dec, HD, and Spectral Type of each star
ra_list = bright_star_results['RAJ2000']
dec_list = bright_star_results['DEJ2000']
hd_list = bright_star_results['HD']
sp_type_list = bright_star_results['SpType']

# Define radius of circle to search for intersection in degrees
radius = 0.01

# Build a list of skycoords from star RA and Dec
coords = [
    SkyCoord(ra, dec, frame='icrs', unit=(u.hourangle, u.deg))
    for ra, dec in zip(ra_list, dec_list)
]

# Define the instrument name, collection, and the product id string (using the wildcard operator for ADQL's LIKE)
instrument_name = 'ESPaDOnS'
collection = 'CFHT'
product_id = '%i'

3.3 Querying CADC using ADQL

To query the CADC database, we will upload the bright_star_results table to CADC and perform a join on the uploaded table to the Plane and Observation tables, joining on the RA and Dec. Then we will execute the WHERE statements to refine our search results.

In [3]:
from astropy.table import Column
from astropy.io import votable

# Uploading the table does not accept the '-' character so rename column B-V to remove it
bright_star_results.rename_column('B-V', 'B_V')

# Add RA and Dec columns that use the proper format for ADQL geometrical functions
ra_column = Column([coord.ra.degree for coord in coords], name='RA')
dec_column = Column([coord.dec.degree for coord in coords], name='DEC')
bright_star_results.add_column(ra_column)
bright_star_results.add_column(dec_column)
In [4]:
%%time
from astropy.table import unique
from astroquery.cadc import Cadc

cadc = Cadc()

column_list = ', '.join([
    'star_results.HR AS HR', 
    'star_results.HD AS HD',
    'star_results.SpType AS SpType', 
    'Observation.target_name AS target_name',
    'Plane.productID AS productID',
    'Plane.energy_bounds_samples AS energy_bounds_samples',
    'Plane.dataRelease AS dataRelease', 
    'Plane.publisherID AS caomPublisherID'
])

# Now we build our dictionary of query parameters
query_params = {
    'instrument_name': instrument_name,
    'radius': radius,
    'collection': collection,
    'prod_id': product_id,
    'wavelength_lower': energy_bounds_m[0],
    'wavelength_upper': energy_bounds_m[1],
    'data_release': today,
    'column_list': column_list,
}

query = '''SELECT {column_list} FROM tap_upload.bright_star_results as star_results 
JOIN caom2.Plane AS Plane
ON (INTERSECTS( CIRCLE('ICRS', star_results.RA, star_results.DEC, {radius}), Plane.position_bounds ) = 1)
JOIN caom2.Observation AS Observation ON Plane.obsID = Observation.obsID
WHERE  ( Observation.instrument_name = '{instrument_name}'
    AND Observation.collection = '{collection}' 
    AND Plane.productID LIKE '{prod_id}' 
    AND INTERSECTS( INTERVAL( {wavelength_lower}, {wavelength_upper} ), Plane.energy_bounds_samples ) = 1 
    AND Plane.dataRelease <= '{data_release}'
    AND ( Plane.quality_flag IS NULL OR Plane.quality_flag != 'junk' ) )
'''.format(column_list, **query_params)

espadons_results = cadc.exec_sync(
    query, uploads={'bright_star_results': bright_star_results})
espadons_results
CPU times: user 444 ms, sys: 51.6 ms, total: 495 ms
Wall time: 1min 56s

Now that the CADC query is done, we will return the unique rows on the HR identifier. Additionally, we will add a column to the result that contains a link to a CADC Advanced Search query result searching on the HD target. This will link to other ESPaDOnS data products on the same target.

In [5]:
import pandas as pd
from IPython.display import HTML


def make_clickable(val):
    # Target _blank to open new window
    return '<a target="_blank" href="{0}">{1}</a>'.format(val, 'Spectra link')


# Get unique results on HR and convert to pandas dataframe
unique_espadons_results = unique(espadons_results, keys='HR').to_pandas()
str_cols = [
    'SpType', 'target_name', 'productID', 'dataRelease', 'caomPublisherID'
]
str_results = unique_espadons_results[str_cols].stack().str.decode(
    'utf-8').unstack()
for col in str_results:
    unique_espadons_results[col] = str_results[col]

# Build the spectra query column
spectra_query_url = (
    'http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/en/search/?activateMAQ=true&'
    'Observation.intent=science&Plane.position.bounds@Shape1Resolver.value=ALL&'
    'Plane.position.bounds=HD%20{}&Plane.energy.bounds.samples=6675..6682A&'
    'Observation.collection=CFHT&Observation.instrument.name=ESPaDOnS&'
    'Plane.calibrationLevel=2')

unique_espadons_results['Spectra'] = [
    spectra_query_url.format(hd) for hd in unique_espadons_results['HD']
]

print("Number of results before filtering on unique HR: {}".format(
    len(espadons_results)))
print("Number of results after filtering on unique HR: {}".format(
    len(unique_espadons_results)))

unique_espadons_results.head().style.format({'Spectra': make_clickable})
Number of results before filtering on unique HR: 862
Number of results after filtering on unique HR: 21
Out[5]:
HR HD SpType target_name productID energy_bounds_samples dataRelease caomPublisherID Spectra
0 39 886 B2IV hd 886 1048298i [[3.6899011230500005e-07 1.04803515625e-06]] 2009-04-30T00:00:00.000 ivo://cadc.nrc.ca/CFHT?1048298/1048298i Spectra link
1 153 3360 B2IV hd 3360 1050186i [[3.6908309936500003e-07 1.04806433105e-06]] 2009-04-30T00:00:00.000 ivo://cadc.nrc.ca/CFHT?1050186/1050186i Spectra link
2 542 11415 B3III HD 11415 1168490i [[3.69754608154e-07 1.04804418945e-06]] 2011-08-31T00:00:00.000 ivo://cadc.nrc.ca/CFHT?1168490/1168490i Spectra link
3 779 16582 B2IV hd 16582 1048322i [[3.6899841308600005e-07 1.0480587158200001e-06]] 2009-04-30T00:00:00.000 ivo://cadc.nrc.ca/CFHT?1048322/1048322i Spectra link
4 811 17081 B7V HD 17081 781666i [[3.69087402344e-07 1.0480628662100002e-06]] 2006-08-31T00:00:00.000 ivo://cadc.nrc.ca/CFHT?781666/781666i Spectra link

4. Fetching Data

The purpose of the next block of code is to build the spectral cutout data access url for each of the data products. To do this, we use the datalink service, find the cutout definition, grab the access url, and append the cutout parameters to it. The result is a list of access urls with cutouts. We use the pyvo library to access the Datalink service. This library is made for use with IVOA services.

In [6]:
from pyvo.dal.adhoc import DatalinkResults
from six.moves.urllib.parse import urlencode

cutout_urls = []
cutout_params = {
    'BAND': '{} {}'.format(energy_bounds_m[0], energy_bounds_m[1])
}
datalink_url = 'https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink'
publisher_ids = unique_espadons_results['caomPublisherID']

for pid in publisher_ids:

    # Get datalink results
    datalink_results = DatalinkResults.from_result_url('{}?{}'.format(
        datalink_url, urlencode({'ID': pid})))

    # Get the first cutout service definition
    service_def = next(datalink_results.bysemantics('#cutout'))

    access_url = service_def.access_url.decode('ascii')
    service_params = service_def.input_params
    input_params = next({param.name: param.value} for param in service_params
                        if param.name == 'ID')
    input_params.update(cutout_params)
    cutout_urls.append('{}?{}'.format(access_url, urlencode(input_params)))

print('Sample cutout url: {}'.format(cutout_urls[0]))
Sample cutout url: https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync?ID=ad%3ACFHT%2F1048298i.fits.gz&BAND=6.675000000000001e-07+6.682e-07

Now we use the access urls to access the fits files and retrieve the data.

In [7]:
from astropy.io import fits

data_list = []
verbose = False

for url in cutout_urls:
    with fits.open(url) as hdulist:
        if verbose:
            hdulist.info()
        data = hdulist[0].data
        data_list.append(data)

5. Plotting Results

Now with access to the data, we can begin to plot the spectra.

In [8]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import math

line_identifications = [(r'H$\alpha$', 6562.8), ('C II', 6578.05), ('C II', 6582.88),
                        ('N II', 6610.56), ('He I', 6678.1517)]


def plot_spectra(data_list, hd_list, sp_list):
    cols = 2
    rows = math.ceil(len(data_list) / cols)
    ylim = (0.2, 1.2)
    xlim = (6525, 6700)
    fig, axes = plt.subplots(rows,
                             cols,
                             figsize=(15, 20),
                             sharex=False,
                             sharey=True)
    plt.setp(axes, ylim=ylim, xlim=xlim)
    for data, hd_name, sp_type, ax in zip(data_list, hd_list, sp_list,
                                          axes.flatten()):
        ax.plot(data[0] * 10, data[1])

        for element, wavelength in line_identifications:
            ax.axvline(x=wavelength, c='black', ls='--', alpha=0.5)
            ax.text(wavelength + 3, ylim[0] + 0.11, element, rotation=75)

        ax.set_title('Name: HD {}     Spec Type: {}'.format(hd_name, sp_type))
        ax.set_xlabel(r'Wavelength ($\AA$)')
        ax.set_ylabel('Relative Intensity')

    fig.suptitle('Intensity Spectrum', y=0.92, fontsize=18)
    plt.subplots_adjust(hspace=0.5)
    plt.show()


nsubplots = 10
hd_list = unique_espadons_results['HD']
sp_list = unique_espadons_results['SpType']
plot_spectra(data_list[:nsubplots], hd_list[:nsubplots], sp_list[:nsubplots])
In [ ]: