Papers | Galaxy Zoo morphology improves photometric redshifts in the Sloan Digital Sky Survey

May 5, 2011

Submitted by Dr. Michael J. Way, NASA Goddard Institute for Space Studies.

With the recent release of Galaxy Zoo Data Release 1, researchers can begin to explore the many ways to use the largest and most accurate database of galaxy morphology ever compiled. To that end, we have combined galaxy photometry and redshift information from the Sloan Digital Sky Survey (SDSS) with precise knowledge of galaxy morphology from the Galaxy Zoo project to calculate photometric redshifts using Gaussian process regression.

We are primarily interested in obtaining accurate photometric redshifts for a subset of SDSS galaxies called the luminous red galaxies. These galaxies are normally found in denser regions of the local universe. They are interesting because they tend to be accurate tracers of the large-scale structure of the universe and have been used to measure the Baryonic Acoustic Oscillation signal, thus putting better constraints on present-day cosmological models.

The Galaxy Zoo database is used to separate the elliptical galaxies from the spirals (we focus on the former). We then obtain a variety of derived primary and secondary isophotal shape estimates from the Sloan Digital Sky Survey imaging catalog (e.g. the amount of light within the 50% Petrosian radius). Using these shape estimates in combination with the five-bandpass photometry of elliptical galaxies that have SDSS redshifts, we use a non-linear regression method based on a training set (Gaussian process regression) to estimate their photometric redshifts. The root mean square error for luminous red galaxies classified as ellipticals is as low as 0.0118, which is nearly a factor of 2 lower than typical estimates for galaxies in the SDSS (see figure).
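To make the regression step concrete, here is a minimal sketch of Gaussian process regression with a squared-exponential kernel, written in plain Python. It is not the code used in the paper: the single colour feature, the toy redshift values, the kernel length scale, the noise variance, and all function names are assumptions chosen for illustration. The paper itself works with five-band photometry plus shape estimates.

```python
import math

def rbf_kernel(a, b, length_scale):
    """Squared-exponential covariance between two feature vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def cho_solve(L, b):
    """Solve (L L^T) x = b by forward and then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(X, y, length_scale, noise_var):
    """Return the weights alpha = (K + noise_var * I)^-1 (y - mean) and the mean."""
    mean_y = sum(y) / len(y)
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], length_scale) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = cho_solve(cholesky(K), [yi - mean_y for yi in y])
    return alpha, mean_y

def gp_predict(x_new, X, alpha, mean_y, length_scale):
    """Posterior mean of the GP at x_new."""
    return mean_y + sum(a * rbf_kernel(x_new, xi, length_scale)
                        for a, xi in zip(alpha, X))

def rmse(pred, truth):
    """Root mean square error between predictions and reference values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

# Toy training set (invented numbers): one colour per galaxy and its
# spectroscopic redshift.
X_train = [(0.6 + 0.2 * i,) for i in range(8)]
z_train = [0.08, 0.11, 0.15, 0.20, 0.26, 0.31, 0.35, 0.38]

LENGTH_SCALE = 0.2   # assumed kernel width
NOISE_VAR = 1e-4     # assumed noise variance on the redshifts

alpha, z_mean = gp_fit(X_train, z_train, LENGTH_SCALE, NOISE_VAR)
z_fit = [gp_predict(x, X_train, alpha, z_mean, LENGTH_SCALE) for x in X_train]
train_rmse = rmse(z_fit, z_train)
z_new = gp_predict((1.3,), X_train, alpha, z_mean, LENGTH_SCALE)
print("training RMSE:", train_rmse)
print("photometric redshift estimate at colour 1.3:", z_new)
```

The training RMSE here only shows that the fit reproduces its own inputs. A quoted figure such as 0.0118 would be measured on galaxies held out from training.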

One can see in the lower left panel that the errors in the photometric redshift estimates are lowest for the luminous red galaxies classified as ellipticals. The best results are obtained when using their 5-band photometry and a variety of isophotal shape estimates, denoted as B. See the paper on arXiv for more details.

The next step will be to use classification techniques from the machine learning literature to classify all of the elliptical galaxies in the ~350 million object database of the SDSS. This has already been attempted by one group using the ~900,000 Galaxy Zoo morphologies and isophotal shape estimates as training samples. One would expect to be able to classify approximately 50-100 million luminous red galaxies as ellipticals. These in turn can be used as the most accurate probes thus far for estimating Baryonic Acoustic Oscillations at unprecedented depth.


Surveys | RAVE: The Radial Velocity Experiment

May 5, 2011

Editor’s note: The following was kindly submitted by Dr. Arnaud Siebert of the Centre de Données de Strasbourg (CDS) and the Observatoire Astronomique de Strasbourg. RAVE has just made public its third data release, as described in greater detail in a paper available on the arXiv, to appear in the Astrophysical Journal.

Heliocentric radial velocity of stars measured by RAVE projected on to the night sky. The smooth change in color (radial velocity) is due to the motion of the Sun around our galaxy.

The RAVE (RAdial Velocity Experiment) project is a multi-fiber spectroscopic survey of stars in the Milky Way using the 1.2-m UK Schmidt Telescope of the Anglo-Australian Observatory (AAO). The RAVE collaboration consists of researchers from over 20 institutions around the world and is coordinated by the Astrophysical Institute Potsdam.

As a southern hemisphere survey covering 20,000 square degrees of the sky, RAVE’s primary aim is to derive the radial velocity of stars from the observed spectra. Additional information is also derived such as effective temperature, surface gravity, metallicity, photometric parallax and elemental abundance data for the stars. The survey represents a giant leap forward in our understanding of our own Milky Way galaxy; with RAVE’s vast stellar kinematic database the structure, formation and evolution of our Galaxy can be studied.
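As a rough illustration of how a radial velocity follows from a spectrum, the sketch below applies the non-relativistic Doppler relation v = c (λ_obs − λ_rest) / λ_rest to a single spectral line. This is not the RAVE pipeline, which is not described here. The function name is hypothetical, and the rest wavelength (a Ca II line near 8542 Å) and the observed wavelength are illustrative values only.

```python
SPEED_OF_LIGHT_KMS = 299792.458  # speed of light in km/s

def radial_velocity_kms(observed_wavelength, rest_wavelength):
    """Non-relativistic Doppler velocity from the shift of one spectral line.

    Both wavelengths must be in the same units. A positive result means the
    line is redshifted, i.e. the star is receding from the observer.
    """
    shift = observed_wavelength - rest_wavelength
    return SPEED_OF_LIGHT_KMS * shift / rest_wavelength

# Illustrative numbers: a line with an assumed rest wavelength of 8542.09
# angstroms observed 1 angstrom redward of that position.
v = radial_velocity_kms(8543.09, 8542.09)
print("radial velocity in km/s:", v)
```

In practice a velocity would be fitted to many lines at once and then corrected to the heliocentric frame. The single-line form only shows the underlying relation.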

RAVE began observing in 2003 and had obtained 465,000 observations of stars by the end of 2010. It is expected to run to the end of 2012. In April 2011, RAVE released its third catalog, containing more than 80,000 radial velocity measurements and atmospheric parameters for nearly 40,000 stars. A full description of the project can be found on the RAVE project website.


Papers | QSO Selection Algorithm Using Time Variability and Machine Learning: Selection of 1,620 QSO Candidates from MACHO LMC Database

May 1, 2011

This research synopsis was submitted by Dae-Won Kim, Harvard-Smithsonian Center for Astrophysics, Cambridge, MA, USA. A preprint of this paper, which is to appear in the Astrophysical Journal, is available on the arXiv preprint server.

Modern astronomy is entering a completely new era driven by an immense amount of observational data. For instance, ongoing and future large-scale surveys such as Pan-STARRS and LSST will produce more than several terabytes of data per night. Wide-field mapping of the sky will open a new paradigm of astronomy in both its scientific and data-handling aspects. In particular, it will be practically impossible to examine all the data manually in order to discover scientifically meaningful information. In other words, innovative algorithms that can automatically analyze the data with minimal human intervention, and that can deliver only the meaningful information to astronomers, are becoming more and more important.

This paper introduces such an algorithm to select QSOs (Quasi-Stellar Objects), which typically show strong non-periodic or pseudo-periodic variability. In the absence of spectroscopic data, such an algorithm will be a very powerful tool for selecting QSOs. In particular, for the future large-scale surveys (Pan-STARRS and LSST), spectroscopic observation will be very expensive because of their wide fields of view and limiting magnitudes.

We introduced 11 time series features that quantify different variability characteristics of light curves, and confirmed that they are practical for separating QSOs from other types of variable sources (e.g. Cepheids, RR Lyraes, eclipsing binaries, Be stars, micro-lensing events, long-period variables, etc.) and from non-varying stars. Figure 1 shows an example scatter plot of two time series features. As the figure shows, the two features are useful for separating the different types of variable stars. We then employed a supervised machine learning technique called a Support Vector Machine, which can train a classification model in a feature space of any dimension. We claim that hyperplane cuts derived in the 11-D space (i.e. the space of the 11 time series features) separate QSOs much better than conventional 2-D hard cuts.
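The idea of a hyperplane cut learned from labelled examples can be sketched with a small linear Support Vector Machine. This is a simplified stand-in, not the classifier from the paper: it is linear, has no bias term (the features are assumed to be centred), and is trained by plain subgradient descent on the regularised hinge loss. The two features, their values, and the function names are all invented for illustration. The same code works unchanged for 11 features.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * |w|^2 + mean(hinge loss) by full-batch subgradient descent.

    X is a list of feature vectors, y a list of labels in {+1, -1}.
    Returns the weight vector w defining the hyperplane w . x = 0.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        # Gradient of the regularisation term.
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:
                # Point is inside the margin: add the hinge subgradient.
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    """Return +1 (QSO candidate) or -1 (other source) for feature vector x."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if score >= 0 else -1

# Toy, centred variability features (invented values).
qso_features = [(1.2, 0.8), (0.9, 1.5), (1.6, 1.1), (0.7, 0.6)]
other_features = [(-1.0, -0.9), (-1.4, -0.5), (-0.6, -1.3), (-0.8, -0.7)]

X_train = qso_features + other_features
y_train = [1] * len(qso_features) + [-1] * len(other_features)

w = train_linear_svm(X_train, y_train)
labels = [predict(w, x) for x in X_train]
print("hyperplane weights:", w)
print("training labels recovered:", labels == y_train)
```

A hard cut on one feature at a time corresponds to an axis-aligned boundary, whereas the learned weights allow a boundary at any orientation in the feature space.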

We applied the algorithm to the MACHO database, which consists of 40 million light curves, and found 1,620 QSO candidates. Analyzing the whole dataset on the Harvard Odyssey cluster took about two days. We cross-matched the identified candidates with mid-IR and X-ray catalogs and confirmed that the majority of them are very strong QSO candidates.

Figure 1. Scatter plot of two time series features. Each axis is a different time series feature. Different symbols and colors denote different variable sources (gray dots: non-variables, black x’s: eclipsing binaries, magenta crosses: micro-lensing, yellow x’s: RR Lyraes, green x’s: Cepheids, cyan crosses: long-period variables, blue crosses: Be stars, red squares: QSOs). Most of the variable types are grouped in different regions.


Papers | Synthetic Milky Way Galaxies

January 24, 2011
Artist's conception of the Milky Way galaxy.

Image via Wikipedia

Future surveys such as the LSST and GAIA will create object catalogs of staggering size. But beyond such qualitative statements, how do astronomers anticipate the scientific return on these projects? To some extent, you can extrapolate from past surveys, taking into account anticipated advances in instrumentation. “Survey X imaged to magnitude M over P percent of the sky. Survey X’ will image to magnitude M’ over P’ percent of the sky, therefore….” But there are a great number of variables to consider. In order to anticipate the productivity and capabilities of future surveys, one approach is to generate synthetic astrometric catalogs based upon models for the density and distribution of stars throughout the galaxy. As explained in a recent paper by Bland-Hawthorn, Johnston, and Binney (“Galaxia: A Code to Generate a Synthetic Survey of the Milky Way”), such synthetic catalogs are useful for:

a. Interpreting observational data
b. Testing theories on which the models are based, and
c. Testing the capabilities of different instruments and for defining strategies to reduce measurement errors (BJB, 2011).

Using a program they developed, known as Galaxia, the authors implemented a complex model of the “stellar content” of the Galaxy as a function of position, velocity, age, metallicity, and mass. Different components of the Milky Way (thin/thick disc, stellar halo, galactic bulge) are modeled separately.

A fragment of the complex model encoded within Galaxia
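To give a feel for what generating a synthetic catalog involves, the following toy sketch draws stars from a single exponential disc and keeps those brighter than a survey magnitude limit. It is not Galaxia's algorithm or model. The scale length, scale height, solar radius, and flat luminosity function are assumed values, the function names are hypothetical, and extinction is ignored.

```python
import math
import random

R_SUN_KPC = 8.0               # assumed distance of the Sun from the Galactic centre
DISC_SCALE_LENGTH_KPC = 2.5   # assumed radial scale length of the disc
DISC_SCALE_HEIGHT_KPC = 0.3   # assumed vertical scale height of the disc

def sample_disc_star(rng):
    """Draw one star from a density proportional to exp(-R/Rd) * exp(-|z|/hz)."""
    # Stars per unit radius scale as R * exp(-R/Rd): a gamma law with shape 2.
    R = rng.gammavariate(2.0, DISC_SCALE_LENGTH_KPC)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    z = rng.expovariate(1.0 / DISC_SCALE_HEIGHT_KPC) * rng.choice((-1.0, 1.0))
    abs_mag = rng.uniform(-2.0, 12.0)  # toy flat luminosity function
    return R * math.cos(phi), R * math.sin(phi), z, abs_mag

def apparent_magnitude(x, y, z, abs_mag):
    """Distance modulus for an observer at the Sun, with no dust extinction."""
    d_kpc = math.sqrt((x - R_SUN_KPC) ** 2 + y ** 2 + z ** 2)
    return abs_mag + 5.0 * math.log10(d_kpc) + 10.0

def synthetic_survey(n_stars, mag_limit, seed=0):
    """Generate n_stars model stars and keep those a survey would detect."""
    rng = random.Random(seed)
    catalog = []
    for _ in range(n_stars):
        x, y, z, abs_mag = sample_disc_star(rng)
        m = apparent_magnitude(x, y, z, abs_mag)
        if m <= mag_limit:
            catalog.append({"x": x, "y": y, "z": z, "app_mag": m})
    return catalog

# Compare a shallow and a deep survey of the same underlying model stars.
shallow = synthetic_survey(2000, mag_limit=12.0)
deep = synthetic_survey(2000, mag_limit=20.0)
print("stars detected (shallow, deep):", len(shallow), len(deep))
```

Changing the magnitude limit while keeping the seed fixed mimics the “Survey X versus Survey X’” comparison: the same model galaxy is observed with different instrumental depth.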

In order to consider how future surveys might perform, you also have to take into account factors such as extinction due to interstellar dust, which itself requires a 3D model for the distribution of dust in the galaxy. The figure below shows an impressive correlation between Hipparcos observations and those that would be anticipated based upon the models encoded within Galaxia.

