Friday 30 May 2014

FITS, not so fit

In my previous post, I introduced the FITS format, which can be used to store spectral data. One of the aims of my GSoC project is to implement a reader for this format. The FITS format can hold a very diverse range of information. It has also been around for more than 30 years and has evolved a lot in that time. The basic keywords in this format are well defined, but the newer keywords for storing spectral data are not. This has made the format really messy: different keywords can imply the same thing, and the same keyword can imply different things. The documentation is incomplete, examples are scarce, and there is no concrete definition to resolve the ambiguities.

To tackle this issue, we decided to work through the format one specification at a time. The first target is one-dimensional spectra from long-slit, multi-object or echelle spectrographs. The reader being implemented follows the Image Reduction and Analysis Facility (IRAF) standard described here. Basically, FITS files can store mappings from pixel coordinates to wavelengths (or dispersion). These mappings are defined using functions. To save space, many of these one-dimensional mappings are stored together in one FITS file. For each sequence of coordinates, the mapping function is defined in the header. This kind of format is called multispec. There are eight possible dispersion function types that can be defined in the multispec format:

  1. Linear
  2. Log-Linear
  3. Legendre polynomial
  4. Chebyshev polynomial
  5. Linear Spline
  6. Cubic Spline
  7. Pixel array
  8. Sampled array
The first three were already implemented in the specutils package. I added the fourth, and I am currently implementing linear and cubic splines. I have searched everywhere for files that store data in these formats, but have found none. Still, for completeness I am trying to decipher the format specification and implement a reader in case someone has such files. If you do have files in these formats, please get in touch with us!
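
To make the first four dispersion types in the list concrete, here is a minimal sketch in plain Python of how a pixel coordinate could be turned into a wavelength. It is not the specutils code: the function names and argument layout are made up for illustration, and the conventions (pixels counted from 1, pixel range rescaled to [-1, 1] for the polynomials, log-linear meaning base 10) are my reading of the IRAF description and should be treated as assumptions.

```python
# Illustrative sketch only: names and signatures are hypothetical,
# not the specutils API.

def normalize(p, pmin, pmax):
    """Rescale pixel coordinate p from [pmin, pmax] onto [-1, 1]."""
    return (2.0 * p - (pmax + pmin)) / (pmax - pmin)


def linear(p, w1, dw):
    """Linear dispersion: w1 is the wavelength at pixel 1, dw the step per pixel."""
    return w1 + dw * (p - 1)


def log_linear(p, w1, dw):
    """Log-linear dispersion: the linear function gives log10 of the wavelength."""
    return 10.0 ** linear(p, w1, dw)


def chebyshev_basis(n, order):
    """First `order` Chebyshev polynomials at n: T0=1, T1=n, Ti=2*n*T(i-1)-T(i-2)."""
    basis = [1.0, n][:order]
    for i in range(2, order):
        basis.append(2.0 * n * basis[i - 1] - basis[i - 2])
    return basis


def legendre_basis(n, order):
    """First `order` Legendre polynomials at n, built with Bonnet's recurrence."""
    basis = [1.0, n][:order]
    for i in range(2, order):
        basis.append(((2 * i - 1) * n * basis[i - 1] - (i - 1) * basis[i - 2]) / i)
    return basis


def polynomial_dispersion(p, pmin, pmax, coeffs, basis_func):
    """Wavelength at pixel p as a weighted sum of basis polynomials."""
    n = normalize(p, pmin, pmax)
    basis = basis_func(n, len(coeffs))
    return sum(c * b for c, b in zip(coeffs, basis))


if __name__ == "__main__":
    coeffs = [6000.0, 500.0, 10.0]  # made-up coefficients
    for pixel in (1, 51, 101):
        print(pixel,
              polynomial_dispersion(pixel, 1, 101, coeffs, chebyshev_basis),
              polynomial_dispersion(pixel, 1, 101, coeffs, legendre_basis))
```

The two polynomial types differ only in the recurrence that generates the basis, which is why the sketch passes the basis function in as an argument.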

Understanding these mathematical models has been very challenging, but also fulfilling and interesting. Hopefully, with contributions from other developers, the reader will be ready by the midterms! Until next time...

Sunday 18 May 2014

Google Summer of Code - Revising my passion

I clearly remember my pre-university days, when I was a member of a non-profit organisation called S.P.A.C.E in India. They were educating students from all over India on astronomy, and their way of teaching was most interesting. I was part of their practical sessions, including star-gazing sessions from remote parts of India and experiments to measure the circumference of the earth. They also provided their students with kits which included a magic stick to measure shadows, a sky map and a small telescope as well! I also won a nation-wide contest organised by them, which led to me being part of the team that observed the total solar eclipse in Turkey in 2006. When I entered university, I got lost in Computer Science and my interest in astronomy slowly dwindled. As my university life comes to an end, I have got the opportunity to rekindle my passion via the Google Summer of Code program. I will be contributing to Astropy under the umbrella of the Python Software Foundation.

Astropy is a software package written in Python, intended to assist astronomy-related computations. I will be contributing to the specutils package. Spectral data has been collected over centuries in various formats. This package will enable reading spectral data into easy-to-use data structures, manipulating the data using utility functions and writing the data out in various formats. The most common format for storing spectral data is FITS (Flexible Image Transport System). This format is endorsed by both NASA and the IAU, and supporting it will be a major goal of this project. There are various spectral mappings that can be defined in a FITS file. These mappings define functions that give the wavelength (or energy or frequency) at a particular point in the data array. I will go into the details of these mappings and how they are defined in FITS in subsequent blog posts.

Coding officially starts tomorrow, 19th May 2014. I am looking forward to a successful project!