Research Project

Crossmatching Astronomy Catalogs

Probabilistic cross-identification of astronomical sources across observations, instruments, and telescopes is at the heart of multi-wavelength and time-domain analyses.

Astronomy has entered a new era of surveys. Dedicated telescopes observe the night sky tirelessly and deliver huge volumes of data every day. Sometimes we want to see the sky in multiple wavelength ranges; at other times we look for changes on various timescales. These independent observations can differ completely in their measured properties. The sky, however, provides a fixed reference frame, and the positions of the sources, when measured accurately, serve as adequate handles on their identities. Cross-identification methods therefore rely primarily on the positions of the detections. One frequently used traditional method is to associate the closest detections. Such pragmatic approaches raise conceptual problems. For example, what should be done if the astrometric uncertainty varies from object to object? How can other measurements be included in addition to the positions? And how does the method generalize to more than two datasets?
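To make the traditional approach concrete, here is a minimal sketch of closest-detection matching in Python. It is only an illustration of the pragmatic method described above, not of the probabilistic framework; the function names, the brute-force search, and the fixed search radius `max_sep` are all assumptions made for this example.

```python
import math

ARCSEC = math.radians(1.0 / 3600.0)  # one arcsecond in radians


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in radians (haversine formula); inputs in radians."""
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec**2 + math.cos(dec1) * math.cos(dec2) * sin_dra**2
    return 2.0 * math.asin(min(1.0, math.sqrt(h)))


def nearest_match(source, catalog, max_sep):
    """Return the index of the closest catalog entry within max_sep, or None.

    source is an (ra, dec) pair and catalog a list of such pairs, in radians.
    Brute-force search, for illustration only.
    """
    best_idx, best_sep = None, max_sep
    for i, (ra, dec) in enumerate(catalog):
        sep = angular_separation(source[0], source[1], ra, dec)
        if sep <= best_sep:
            best_idx, best_sep = i, sep
    return best_idx


# Hypothetical coordinates, chosen only to exercise the functions.
ra0, dec0 = math.radians(10.0), math.radians(20.0)
catalog = [(ra0, dec0), (ra0 + 0.5 * ARCSEC, dec0)]
source = (ra0, dec0 + 0.1 * ARCSEC)
match = nearest_match(source, catalog, 1.0 * ARCSEC)
```

Note that the single radius `max_sep` applies to every source alike, so the sketch has no natural way to account for astrometric uncertainties that vary from object to object.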

Our probabilistic framework is built on Bayesian hypothesis testing: we ask whether the detections are consistent with belonging to a single object at an unknown position on the sky. To a first approximation, the problem is analytically tractable for Gaussian and Fisher likelihoods, which give identical answers in the limit of high accuracy. Consider, for example, two catalogs. The directional uncertainties of the detections, denoted by $\sigma_1$ and $\sigma_2$, determine both the maximum value of the Bayes factor $B$ and its falloff rate as a function of the observed angular separation $\psi$ (all angles in radians):

$B = \frac{2}{\sigma^2_1 + \sigma^2_2} \exp \left\{-\frac{\psi^2}{2(\sigma^2_1 + \sigma^2_2)} \right\}$
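As a rough numerical illustration, the formula above can be evaluated directly. The sketch below assumes all angles are in radians; the function name `bayes_factor` and the example uncertainties of 0.1 and 0.2 arcseconds are hypothetical choices for this example.

```python
import math

ARCSEC = math.radians(1.0 / 3600.0)  # one arcsecond in radians


def bayes_factor(psi, sigma1, sigma2):
    """Two-catalog Bayes factor B from the formula above; all angles in radians."""
    var = sigma1**2 + sigma2**2
    return 2.0 / var * math.exp(-psi**2 / (2.0 * var))


# Hypothetical astrometric uncertainties for the two detections.
s1, s2 = 0.1 * ARCSEC, 0.2 * ARCSEC

# B peaks at zero separation and falls off as a Gaussian in psi.
for sep_arcsec in (0.0, 0.3, 1.0):
    b = bayes_factor(sep_arcsec * ARCSEC, s1, s2)
    print(f"psi = {sep_arcsec:.1f} arcsec  ->  B = {b:.3e}")
```

The maximum, $2/(\sigma_1^2 + \sigma_2^2)$, is reached at $\psi = 0$, and the combined variance $\sigma_1^2 + \sigma_2^2$ sets how quickly the evidence for a match decays with separation.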

Various extensions have been developed that consider, for example, stars that move fast on the sky at an unknown velocity or extended objects with non-trivial shapes. This is an active research area that combines several branches of statistics and computational science with geometric and physical modeling.

Invited article on Probabilistic Record Linkage in Astronomy, to appear in the Annual Review of Statistics and Its Application (Budavári & Loredo, 2015, in press).

