General Tutorial

by D. Ilić, N. Rakić, May 2023 (last modified)

Open source code FANTASY (Fully Automated pythoN tool for AGN Spectra analYsis) is a python based code for simultaneous multi-component fitting of AGN spectra, optimized for the optical rest-frame band (3600-8000A), but also applicable to the UV range (2000-3600A) and NIR range (8000-11000A).

The code fits simultaneously the underlying broken power-law continuum and sets of defined emission line lists. There is an option to import and fit together the Fe II model, to fully account for the rich emission of complex Fe II ion which can produce thousands of line transitions, both in the range of Hg and Hb, but also Ha line. The Fe II model is based on the atomic parameters of Fe II, and builds upon the models presented in Kovacevic et al. 2010, Shapovalova et al. 2012, and assumes that different Fe II lines are grouped according to the same lower energy level in the transition with line ratios connected through the line transition oscillatory strengths. The following multiplet in the range from 3800 to 11000A are used: e4D, e6D, z4F, z4D, z4P, a4G, a4H, b2H, b4D, b4F, b4P, a6S, a2D2, y4G, b4G, x4D, y4P (Ilic et al, 2023, submitted).

The code is flexible in the selection of different groups of lines, either already predefined lists (e.g., standard narrow lines, Hydrogen lines, Helium lines, other broad lines, forbidden Fe II lines, coronal lines, etc.), but gives full flexibility to the user to merge predefined line lists, create customized line list, etc. The code can be used either for fitting a single spectrum, as well as for the sample of spectra, which is illustrated in several given examples. It is an open source, easy to use code, available for installation through pip install option. It is based on Sherpa Python package (Burke et al. 2022).

[1]:
#Before starting, we call some of the standard python packages, such as matplotlib, pandas, numpy, etc.
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib.ticker import (MultipleLocator, FormatStrFormatter, AutoMinorLocator)

import numpy as np
import pandas as pd

import glob
  1. Reading data

First step involves reading a spectrum or set of spectra. We designed integrated three spectral classes: - read_sdss - designedto read SDSS fits (https://www.sdss.org) - read_gama_fits - designed to read GAMA fits (http://www.gama-survey.org) - read_text - designed to read any ASCII files (e.g., wavelength, flux, flux_error) - make_spec - designed to make Fantasy spectral class from any given three arrays for wavelength, flux, flux_error; user can also set the redshift and coordinates, which is needed in the later analysis: make_spec(wave, flux, err, z=, ra=, dec=)

The user has the possible to read any other file format using prefered method.

[2]:
# Below command import the above mentioned reading commands
from fantasy_agn.tools import read_sdss, read_text, read_gama_fits

# Below command import the necessary commands, which will be described later
from fantasy_agn.models import create_input_folder

from fantasy_agn.models import create_feii_model, create_model, create_tied_model, continuum, create_line, create_fixed_model
[3]:
# This command reads the spectrum with the listed name, from the folder of this notebook or
# from the given path e.g.,'/path/to/files/spec*.txt'

s=read_sdss('spec-2770-54510-0433.fits')

  1. Preparation of the spectra

Before performing the fittings, the FANTASY code provide several preprocessing options to prepare the spectrum, namely: - Setting wavelength frame from vacuum to air wavelengths - vac_to_air() - Galactic extinction correction - DeRedden(), based on Schlegel, Finkbeiner, Davis (1998) dust maps. - redshift correction - CorRed() - croping/scaling of spectra - crop(xmin,xmax)

[4]:
# We use for input models selected line lists, for which we provide air wavelengths; if your spectrum is in vacuum
# wavelengths, this command transform them to air wavelengths: s.vac_to_air()
# NOTE that read_sdss() transform to air wavelengths by default.
[5]:
# DeRedden() command corrects for the Galactic extinction, based on coordinates of the object provided in the fits,
# which will be automatically derived from dust map data from Schlegel, Finkbeiner, Davis (1998).
# If coordinates are not available you could manually insert them using simple commands s.ra=xxx.xxx, s.dec=xxx.xxx

# s.ra=29.519807539582 #example input of arbitrary rightascension
# s.dec=-0.872742349310271 #example input of arbitrary declination

s.DeRedden()
[6]:
# CorRed() corrects for the cosmological redshift, based on redshift of the object provided in the fits.
# If coordinates are not available you could manually insert them using simple command s.z=x.xxx

# s.z=0.0804 #example input of arbitrary redshift

s.CorRed()
[7]:
# Useful tip is to avoid using very small flux units.
# e.g., SDSS spectra are given in 1e-17 erg/s/cm2/A and these are already scaled within read_sdss()

#s.flux=s.flux*10**17
  1. Host-galaxy subtraction

Command fit_host_sdss() fits the observed spectrum using the approach already tested and used for SDSS spectra, which shows that the AGN spectra can be reconstructed as linear combinations of galaxy and quasar eigenspectra. Using the Principle Component Analysis, Yip et al. (2004a,b) constructed from 170,000 galaxy SDSS spectra and 16,707 quasar SDSS spectra, a set of galaxy and quasar eigenspectra. Vanden Berk et al. (2006) showed that majority of spectra can be reconstructed using the linear combination of 10 quasar eigenspectra from Yip et al. (2004b) and 5 galaxy eigenspectra from Yip et al. (2004a) . For more details check the above references.

Command fit_host() fits the observed spectrum with PCA which uses all available eigenvectors (10 eignevectors for the galaxy (stellar) and 15 QSO components). In this case the code will test for different number of components until reaching best result based on the chi2 parameter.

In both cases, the weighthed fit is used in order to avoid accounting for strong emission lines. If the host-galaxy contribution is negligible, i.e., the fitting results in a negative host-galaxy, the code will fail to find the host and inform the user.

#IMPORTANT: Code will by default subtract the host galaxy from the observed spectrum; if you need the original spectrum for some analysis, you can restore the original spectrum before the host fitting with the command restore().

  • host-galaxy fitting and subtraction - fit_host_sdss()

  • host-galaxy fitting and subtraction - fit_host()

TIP: Do not crop the spectrum before fitting for the host galaxy contribution, as the fittings strongly depends on the stellar absorption features that are more likely to be detected in a wider spectral range, i.e. the range in which the eigenvectors are provided (3600-8000A).

[8]:
#s.restore() #command which restor to the original spectrum, before host-galaxy removal;
s.fit_host_sdss()
Host contribution is negliglable

One possibility is to mask the narrow emission lines while performing the host galaxy fitting. This is done with setting the option mask_host=True, which by default masks all well known nebular lines ([O III], [N II], [S II]), but customized list of emission lines can be also given.

[9]:
s.restore() #command to restore the spectrum before host galaxy fitting and substraction
s.fit_host_sdss(mask_host=True, custom=False)
Host contribution is negliglable
[10]:
#Let's plot the spectrum for visual inspection.

plt.style.context(['nature', 'notebook'])
plt.figure(figsize=(12,6))
plt.plot(s.wave, s.flux, color="#929591", label='Obs', lw=2)
plt.legend(loc='upper right',  prop={'size': 20}, frameon=False, ncol=2)

# It is obvious, e.g. based on the strong blue continuum and absence of absorption line, that the stellar contribution
# from the host galaxy can be negligible.
[10]:
<matplotlib.legend.Legend at 0x7fb4563201c0>
_images/Tutorial_19_1.png
[11]:
#Let us try another example spectrum, e.g. G09_Y1_GS1_080.fit. now from the GAMMA database, which is
# a low-luminosity host-dominated AGN at z=0.05488.

# TIP: The spectrum is very noisy in the blue part, therefore we crop the spectrum to 3900-8000A wavelength range,
# to avoid the noise, which is impossible to reproduce with the PCA, but leaving the strong stellar absorption
# features below 4000 A, i.e., Ca II resonance lines H(3968) and K(3933).

g=read_gama_fits('G09_Y1_GS1_080.fit')
g.DeRedden()
g.CorRed()
g.crop(3900,8000)
g.fit_host_sdss()
_images/Tutorial_20_0.png
[12]:
# Let us save the above host, and try fitting with the masked emission lines.
g.host_no_mask=g.host
g.restore()
g.fit_host_sdss(mask_host=True, custom=False)
_images/Tutorial_21_0.png
[13]:
#Let's compare two host-fitting results.

plt.style.context(['nature', 'notebook'])
plt.figure(figsize=(12,6))
plt.plot(g.wave, g.flux+g.host, color="#929591", label='Obs GAMMA spectrum' , lw=1.5)
plt.plot(g.wave, g.host_no_mask, color="#F10C45", label='Host no mask', lw=2)
plt.plot(g.wave, g.host, color="green", label='Host masked em.line', lw=2)
plt.legend(loc='upper right',  prop={'size': 16}, frameon=False)


[13]:
<matplotlib.legend.Legend at 0x7fb4567bcdf0>
_images/Tutorial_22_1.png
  1. Creation of the predefined lists of emission lines

Within FANTASY there are some standard lists of AGN emission lines, such as: Hydrogen lines (e.g., Ha, Hb, Hg, Hd, Pa, Pb, Pg, Pd), Helium lines (both He I and He II), basic narrow lines ([OIII], [NII]), other AGN narrow lines, other AGN broad lines, Fe II lines, coronal lines, etc.

For Fe II line we identified several sets: Fe II model (Kovacevic et al. 2010, Shapovalova et al. 2012), list (feii_IZw1.csv) of specific I Zw 1 lines, identified by Kovacevic et al. (2010) template, which all originate from the same region. In case of very string Fe II emitters, such as NLSy1 galaxies, sometimes it is necessary to include additional transitions from forbidden Fe II (see Veron-Cetty et al. 2004)

The command create_input_folder(xmin,xmax,path_to_folder) creates a folder in the specified path, copies the predefined line lists within the specified wavelength range.

IMPORTANT: If the folder already exsists, it will be overwritten, which can be avoided by using the parameter overwrite=False.

TIP: Users can extend/shorten these line lists, as well as create completly new ones.

[18]:
create_input_folder(xmin=4000,xmax=8000, path_to_folder='liness/')
#create_input_folder(xmin=4000,xmax=8000, path_to_folder='lines/',overwrite=False)
Directory  liness/  Created
[19]:
# Crop function cuts the spectrum in a set wavelength range, e.g., if you want to fit just
# one emission line or a certain wavelength range

s.crop(4000, 8000)
print(s.wave) # simple examine of the wavelength range
[4000.8545 4001.775  4002.6953 ... 7695.817  7697.593  7699.3633]
  1. Defining the fitting model

Basic and simple model for AGN spectra fittings, should contain: - underlying continuum (broken power-law, power-law, streight line) - emission line (narrow, broad)

Depending on the wavelength range, spectral quality (S/N ratio, spectra resolution), and object type, the fitting model can be customized to contain many and complex emission features. One example for a typical type 1 AGN is given here.

Below we describe the fitting components: - continuum() - calls BrokenPowerlaw() model for a given breaking point (parameter refer indicate the breaking wavelength) in the form of:

\[\begin{split} f(\lambda) =\begin{cases} amplitude \cdot \left(\frac{\lambda}{refer}\right)^{index1} & \lambda > refer \\ amplitude \cdot \left(\frac{\lambda}{refer}\right)^{index1+index2} & \lambda < refer \end{cases}\end{split}\]

This means that the power law of index1 is fitting the part of the spectrum with wavelengths larger than the breaking point, and that of (index1 + index2) is fitting the part of the spectra with wavelengths smaller than than the breaking point; the continuum amplitude is then automatically set based on the flux value at the breaking point;

  • create_fixed_model - calls all lines from the indicated list have the same width and shift); user can specify common name for these lines (parameter name);

  • create_tide_model - calls all lines are having width and shift tided to the reference line (typically [OIII] 5007 line); user can specify common name for these lines (parameter prefix);

  • create_feii_model - calls for the FeII model, in which all iron lines have the same width and shift;

  • create_line - calls single line at the given wavelength (parameter position), with possibility to set width (fwhm), shift (offset) and intensity (ampl);

  • create_model - calls all lines from the indicated list, but no parameter is constrained;

For multi-line models, user can call a single line-list and many line-lists using the option files=[‘list1.csv’,’list2.csv’].

There are many options to limit the fitting parameters (as illustrated in the example below), such as max_fwhm set the maximum of the line width to a given value, min_offset fix the minimum line shift, etc.

[20]:
cont=continuum(s,min_refer=5690, refer=5700, max_refer=5710)
broad=create_fixed_model(['hydrogen.csv'], name='br')
he=create_fixed_model(['helium.csv'], name='he',fwhm=3000, min_fwhm=1000, max_fwhm=5000)
narrow=create_tied_model(name='OIII5007',files=['narrow_basic.csv','hydrogen.csv'],prefix='nr', fwhm=1000,min_offset=0, max_offset=300, min_fwhm=900, max_fwhm=1200,fix_oiii_ratio=True, position=5006.803341, included=True,min_amplitude=0.2)
fe=create_feii_model(name='feii', fwhm=1800, min_fwhm=1000, max_fwhm=2000, offset=0, min_offset=-3000, max_offset=3000)
#fe.amp_b4p.min=10 #An example how to force the amplitude of a selected FeII multiplet.
[21]:
# Code fits simultaneously all features.
model = cont+broad+narrow+fe+he
  1. Fittings

Command s.fit() perform fittings of the spectrum, which is based on Python Sherpa package, with above defined model. Code offers the possibility to set the number of consequent fittings (option ntrial) to increase the convergence, given with the chi2, which is listed.

[22]:
s.fit(model, ntrial=2)
stati 403048.00119953405
1 iter stat:  409.47219105855584
2 iter stat:  237.2725785790044

Congratulations, you are DONE!

  1. Plotting the results

You can plot the results using your favorite plotting style, an example is given below.

[23]:
plt.style.context(['nature', 'notebook'])
plt.figure(figsize=(18,8))
plt.plot(s.wave, s.flux, color="#929591", label='Obs', lw=2)
plt.plot(s.wave, model(s.wave), color="#F10C45",label='Model',lw=3)
plt.plot(s.wave, model(s.wave)-s.flux-70, '-',color="#929591", label='Residual', lw=2)
plt.axhline(y=-70, color='deepskyblue', linestyle='--', lw=2)

plt.plot(s.wave, cont(s.wave),'--',color="#042E60",label='Continuum', lw=3)
plt.plot(s.wave, narrow(s.wave),label='Narrow',color="#25A36F",lw=3)
plt.plot(s.wave, broad(s.wave), label='Broad H', lw=3, color="#2E5A88")
plt.plot(s.wave, he(s.wave), label='Broad He I', lw=3, color="orange")
plt.plot(s.wave, fe(s.wave),'-',color="#CB416B",label='Fe II model', lw=3)

plt.xlabel('Rest Wavelength',fontsize=20)
plt.ylabel('Flux',fontsize=20)
plt.xlim(4000,7000)
plt.ylim(-150,900)
plt.tick_params(which='both', direction="in")
plt.yticks(fontsize=20)
plt.xticks(np.arange(4000, 7000, step=500),fontsize=20)
plt.legend(loc='upper center',  prop={'size': 22}, frameon=False, ncol=2)

plt.savefig('fantasy_fit.pdf')
_images/Tutorial_38_0.png
  1. Inspecting/saving the fitting results

To examine the fitting results, simply use the command model, which will display all parameters for each emission line. Note that for complex model, for which the name/prefix has been indicated when defining the model.

[24]:
model
[24]:
<BinaryOpModel model instance '((((brokenpowerlaw + br) + (((((((((((((nr_OIII5007 + nr_OIII4958) + nr_NII6584) + nr_NIII6548) + nr_[OIII]_4363) + nr_[OI]_6300) + nr_[OI]_6363) + nr_[SII]_6716) + nr_[SII]_6730) + nr_[OII]_7330) + nr_Hd_4101) + nr_Hg_4340) + nr_Hb_4861) + nr_Ha_6562)) + feii) + he)'>
[25]:
# To see the fitting results, one can use standard outputs from Sherpa package, such as:
# gres - to list all fitting results
# gres.parnames - to list all parameters
# save_json() - to save the results

print(s.gres.format())
s.save_json() #saving parameters
Method                = levmar
Statistic             = chi2
Initial fit statistic = 14619.2
Final fit statistic   = 14381.9 at function evaluation 996
Data points           = 2844
Degrees of freedom    = 2792
Probability [Q-value] = 0
Reduced statistic     = 5.15113
Change in statistic   = 237.273
   brokenpowerlaw.refer   5710         +/- 40.6391
   brokenpowerlaw.ampl   130.357      +/- 1.12651
   brokenpowerlaw.index1   -1.35925     +/- 0.00765496
   brokenpowerlaw.index2   0.276544     +/- 0.0119874
   br.amp_Hd_4102   4.60536      +/- 1.14993
   br.amp_Hg_4340   36.8139      +/- 2.23026
   br.amp_Hb_4861   92.1445      +/- 0.808518
   br.amp_Ha_6563   156.895      +/- 1.46635
   br.offs_kms    228.142      +/- 9.24466
   br.fwhm        5880.9       +/- 29.2334
   nr_OIII5007.ampl   102.981      +/- 1.0002
   nr_OIII5007.offs_kms   0            +/- 3.03093
   nr_OIII5007.fwhm   1200         +/- 8.2344
   nr_NII6584.ampl   273.676      +/- 4.23755
   nr_[OIII]_4363.ampl   17.4238      +/- 2.1886
   nr_[OI]_6300.ampl   12.6055      +/- 0.736598
   nr_[OI]_6363.ampl   10.8107      +/- 0.795209
   nr_[SII]_6716.ampl   10.3531      +/- 0.894605
   nr_[SII]_6730.ampl   13.3024      +/- 0.869021
   nr_[OII]_7330.ampl   8.26561      +/- 0.900323
   nr_Hd_4101.ampl   31.4485      +/- 1.6448
   nr_Hg_4340.ampl   58.8953      +/- 2.38438
   nr_Hb_4861.ampl   168.875      +/- 2.04169
   nr_Ha_6562.ampl   500          +/- 4.42197
   feii.amp_a2D2   8.12698      +/- 1.12333
   feii.amp_a4G   40.6947      +/- 0.464262
   feii.amp_a4H   14.2896      +/- 1.38531
   feii.amp_a6S   27.974       +/- 0.450547
   feii.amp_b2H   10.8641      +/- 0.650544
   feii.amp_b4D   10.0885      +/- 0.704465
   feii.amp_b4F   24.7389      +/- 0.343631
   feii.amp_b4P   17.3997      +/- 0.897384
   feii.amp_e4D   299.448      +/- 26.893
   feii.amp_e6D   2            +/- 0
   feii.amp_z4D   16.7204      +/- 0.489071
   feii.amp_z4F   56.6395      +/- 5.32144
   feii.amp_z4P   3.97033      +/- 0.497694
   feii.amp_y4G   4.99779      +/- 0.87021
   feii.amp_b4G   5.37724      +/- 0.488185
   feii.amp_x4D   16.9158      +/- 0.355557
   feii.amp_y4P   20.3782      +/- 0.61299
   feii.offs_kms   311.357      +/- 10.579
   feii.fwhm      2000         +/- 17.7577
   he.amp_HeI_4144   0            +/- 0.806795
   he.amp_HeI_4471   16.2932      +/- 1.0176
   he.amp_HeII_4686   10.7961      +/- 0.607136
   he.amp_HeI_5877   12.0898      +/- 0.451236
   he.amp_HeI_7065   0            +/- 0.518887
   he.amp_HeI_7281   9.96514      +/- 0.470948
   he.amp_HeI_7816   127.359      +/- 194.424
   he.offs_kms    263.368      +/- 43.4104
   he.fwhm        3535.06      +/- 116.028
WARNING: parameter value brokenpowerlaw.refer is at its maximum boundary 5710.0
WARNING: parameter value nr_OIII5007.offs_kms is at its minimum boundary 0.0
WARNING: parameter value nr_OIII5007.fwhm is at its maximum boundary 1200.0
WARNING: parameter value nr_Ha_6562.ampl is at its maximum boundary 500.0
WARNING: parameter value feii.fwhm is at its maximum boundary 2000.0
WARNING: parameter value he.amp_HeI_4144 is at its minimum boundary 0.0
WARNING: parameter value he.amp_HeI_7065 is at its minimum boundary 0.0
  1. Simple analysis

Here we demonstrate how to measure the fluxes of a model component.

[30]:
# Integrate total FeII model,
flux_feII=np.sum(fe(s.wave))
print("FeII total flux=",flux_feII)

# Mask the wavelength range of interess (e.g. Ha line) and integrate broad component.
x=s.wave
mask_ha=(x>6300)&(x<6700)
Ha_broad=np.sum(broad(s.wave)[mask_ha])
print("Ha_broad=",Ha_broad)

FeII total flux= 23647.127
Ha_broad= 14127.08

References

  1. Rakic, N., 2022, MNRAS, 516, 1624

  2. Ilic, D. et al. 2020, A&A, 638, 13

  3. Kovacevic, J. et al. 2010, ApJS, 189, 15

  4. Shapovalova, A. I. et al. 2012, ApJS, 202, 10

  5. Ilic, D., Rakic, N., Popovic, L. C., 2023, ApJS, in press