I admit it. I’m not on Facebook. And I’ve been holding off on tweeting as well. Sometimes I think the association with all things Facebook and Twitter is like a virus. Do I really want to follow brand X on Facebook? Really? But I’ve decided that Twitter in particular can serve a very useful role in bringing attention to news, events, research papers, and websites dedicated to astronomical data mining. I can’t tell you how many times I’ve come across a story that I wanted to share on this site but didn’t have the time to write about. So I’m going to keep trying to make AstroDataMining.net a useful resource for those interested in learning about this emerging field. I think there is a significant opportunity, and a need, for more fellow computer scientists to become involved. At the same time, I’ve started a Twitter account for those frequent occasions when some new development occurs in astronomical data mining. So… follow Data-Mining for Astronomy on Twitter!
Perspectives | The growing importance of data curation
August 29, 2011
Excerpt:
With the world awash in information, curating all the scientifically relevant bits and bytes is an important task, especially given digital data’s increasing importance as the raw materials for new scientific discoveries, an expert in information science at the University of Illinois says. Carole L. Palmer, a professor of library and information science, says that data curation, the active and ongoing management of data through their lifecycle of interest to science, is now understood to be an important part of supporting and advancing research… The Center for Informatics Research in Science and Scholarship at Illinois will receive about $2.9 million as a partner on the Data Conservancy project, a $20 million initiative led by Sayeed Choudhury at the Johns Hopkins University Sheridan Libraries. The five-year award, one of the first two in the National Science Foundation’s DataNet program, will fund developing infrastructure for the management of the ever-increasing amounts of digital research data.
Ref: University of Illinois at Urbana-Champaign. “Deluge of scientific data needs to be curated for long-term use.” ScienceDaily, 25 Feb. 2010. Web. 29 Aug. 2011.
Papers | Galaxy Zoo morphology improves photometric redshifts in the Sloan Digital Sky Survey
May 5, 2011
Submitted by Dr. Michael J. Way, NASA Goddard Institute for Space Studies.
Given the recent release of Galaxy Zoo Data Release 1, researchers can begin to explore the myriad ways that one can use the most accurate and numerous database of galaxy morphology ever compiled. To that end we have used galaxy photometry and redshift information from the Sloan Digital Sky Survey (SDSS), in combination with precise knowledge of galaxy morphology via the Galaxy Zoo project, to calculate photometric redshifts using Gaussian process regression.
We are primarily interested in obtaining accurate photometric redshifts for a subset of SDSS galaxies called the luminous red galaxies. These galaxies are normally found in denser regions of the local universe. They are interesting because they tend to be accurate tracers of the large-scale structure in the universe and have been used for measuring the Baryonic Acoustic Oscillation signal, thus putting better constraints on present-day cosmological models.
The Galaxy Zoo database is used to segregate the elliptical galaxies from the spirals (we focus on the former). Then we obtain a variety of derived primary and secondary isophotal shape estimates from the Sloan Digital Sky Survey imaging catalog (e.g. the amount of light within the 50% Petrosian radius). Using these shape estimates in combination with the five-bandpass photometry of elliptical galaxies that have redshifts from the SDSS, we use a non-linear regression method trained on that set (Gaussian process regression) to estimate their photometric redshifts. The root mean square error for luminous red galaxies classified as ellipticals is as low as 0.0118, which is nearly a factor of 2 lower than typical estimates for galaxies in the SDSS (see figure).

One can see in the lower left panel that the errors on the photometric redshift estimates are lowest for the luminous red galaxies classified as ellipticals. The best results are obtained when using their 5-band photometry and a variety of isophotal shape estimates denoted as B. See the paper on arXiv for more details.
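For readers coming from computer science, here is a minimal sketch of the kind of Gaussian process regression described above. It is not the code used in the paper. It fits a GP posterior mean with a squared-exponential kernel to a toy training set and reports the root mean square error on held-out objects. Everything specific in it is an assumption made for illustration: the two "colour" features, the `toy_redshift` relation, the kernel length scale, and the noise level are all hypothetical, and no SDSS or Galaxy Zoo data are involved.

```python
import math
import random


def rbf_kernel(a, b, length_scale):
    """Squared-exponential covariance between two feature vectors."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def cholesky(matrix):
    """Lower-triangular L with L * L^T == matrix (symmetric, positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def cholesky_solve(lower, rhs):
    """Solve (L L^T) x = rhs by forward, then backward, substitution."""
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


def gp_fit(train_x, train_y, length_scale=1.0, noise_var=1e-4):
    """Return the weights and mean offset needed for GP mean predictions."""
    offset = sum(train_y) / len(train_y)
    n = len(train_x)
    gram = [[rbf_kernel(train_x[i], train_x[j], length_scale)
             + (noise_var if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    weights = cholesky_solve(cholesky(gram), [y - offset for y in train_y])
    return weights, offset


def gp_predict(train_x, weights, offset, x_new, length_scale=1.0):
    """Posterior mean at x_new: offset + k(x_new, X) . weights."""
    return offset + sum(w * rbf_kernel(x, x_new, length_scale)
                        for w, x in zip(weights, train_x))


def rmse(predicted, truth):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth))


def toy_redshift(colours):
    """Hypothetical smooth colour-redshift relation, for illustration only."""
    return 0.1 + 0.15 * colours[0] + 0.05 * colours[1] ** 2


def demo(seed=0):
    """Train on 60 synthetic objects, then score 40 held-out ones."""
    rng = random.Random(seed)
    train_x = [(rng.uniform(0.0, 2.0), rng.uniform(0.0, 2.0)) for _ in range(60)]
    train_z = [toy_redshift(x) + rng.gauss(0.0, 0.01) for x in train_x]
    test_x = [(rng.uniform(0.0, 2.0), rng.uniform(0.0, 2.0)) for _ in range(40)]
    test_z = [toy_redshift(x) for x in test_x]
    weights, offset = gp_fit(train_x, train_z)
    predicted = [gp_predict(train_x, weights, offset, x) for x in test_x]
    baseline = [offset] * len(test_x)  # always guess the training mean
    return rmse(predicted, test_z), rmse(baseline, test_z)


if __name__ == "__main__":
    gp_error, baseline_error = demo()
    print(f"GP RMSE: {gp_error:.4f}  mean-only baseline RMSE: {baseline_error:.4f}")
```

The exact solve here costs on the order of n³ operations for n training objects, which is fine for a toy example but is one reason why training sets of real survey size generally need approximations.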
The next step will be to use classification techniques from the Machine Learning literature to classify all of the elliptical galaxies in the ~350 million object database of the SDSS. This has already been attempted by one group using the ~900,000 Galaxy Zoo morphologies and isophotal shape estimates as training samples. One would expect to be able to classify approximately 50-100 million luminous red galaxies as ellipticals. These in turn can be used as the most accurate probes thus far in estimating Baryonic Acoustic Oscillations at unprecedented depth.
Surveys | RAVE: The Radial Velocity Experiment
May 5, 2011
Editor’s note: The following was kindly submitted by Dr. Arnaud Siebert of the Centre de Données de Strasbourg (CDS) and the Observatoire Astronomique de Strasbourg. RAVE has just made public its third data release, as described in greater detail in a paper available on the arXiv, to appear in the Astrophysical Journal.

Heliocentric radial velocity of stars measured by RAVE projected on to the night sky. The smooth change in color (radial velocity) is due to the motion of the Sun around our galaxy.
The RAVE (RAdial Velocity Experiment) project is a multi-fiber spectroscopic survey of stars in the Milky Way using the 1.2-m UK Schmidt Telescope of the Anglo-Australian Observatory (AAO). The RAVE collaboration consists of researchers from over 20 institutions around the world and is coordinated by the Astrophysical Institute Potsdam.
As a southern hemisphere survey covering 20,000 square degrees of the sky, RAVE’s primary aim is to derive the radial velocity of stars from the observed spectra. Additional information is also derived such as effective temperature, surface gravity, metallicity, photometric parallax and elemental abundance data for the stars. The survey represents a giant leap forward in our understanding of our own Milky Way galaxy; with RAVE’s vast stellar kinematic database the structure, formation and evolution of our Galaxy can be studied.
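As a rough illustration of how a radial velocity follows from a spectrum, the sketch below applies the non-relativistic Doppler formula v = c (λ_observed − λ_rest) / λ_rest to a single spectral line. This is a simplification, not the RAVE pipeline: survey pipelines typically use whole spectra rather than one line, and the result here is not corrected to the heliocentric frame. The function name and the rest wavelength (a Ca II line near 8542 Å) are assumptions chosen for the example; the text above does not say which lines RAVE uses.

```python
SPEED_OF_LIGHT_KMS = 299792.458  # exact, by definition of the metre


def radial_velocity_kms(observed_wavelength, rest_wavelength):
    """Non-relativistic Doppler velocity in km/s (positive means receding).

    Both wavelengths must be in the same unit. Valid only for v << c.
    """
    shift = (observed_wavelength - rest_wavelength) / rest_wavelength
    return SPEED_OF_LIGHT_KMS * shift


if __name__ == "__main__":
    rest = 8542.09      # assumed rest wavelength in angstroms (illustrative)
    observed = 8543.00  # hypothetical measured line centre in angstroms
    print(f"v_r = {radial_velocity_kms(observed, rest):.1f} km/s")
```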
RAVE began observing in 2003 and had obtained 465,000 observations of stars by the end of 2010. It is expected to run to the end of 2012. In April 2011 RAVE released its third catalog, containing more than 80,000 radial velocity measurements and atmospheric parameters for nearly 40,000 stars. A full description of the project can be found on the RAVE project website.
Posted by John Rachlin