VOClass: Classification and Learning in the Virtual Observatory
Classification and outlier detection are fundamental to all areas of the observational sciences. In astrophysics, the interrelations between observed parameters can give insight into the physical processes within the universe, and deviations from these relations can be used to identify new classes of sources or astrophysical phenomena. We propose to develop VOClass, an automated classification facility for the Virtual Observatory (VO) that will integrate state-of-the-art data-mining algorithms with the extensive set of VO-enabled tools now available. Classification will use a range of supervised and unsupervised techniques designed to scale to the size of current and future surveys. Using VOClass, astronomers will be able to upload images or catalogs, correlate their sources with other multifrequency data sets available through the VO or NASA data archives, and obtain classifications of these sources and identifications of unusual objects based on the measured and correlated properties.
The primary goals of VOClass are: (a) to enable fast classification of large, distributed databases; (b) to provide robust statistical descriptions of the interrelations between observables that scale to the size and dimensionality of current and future data sets; (c) to integrate the classification and visualization tools available through the Virtual Observatory; (d) to provide the ability to classify sources adaptively by learning from user-defined inputs, and thereby to develop new classification schemes; and (e) to interface these knowledge discovery tools with the Virtual Observatory through Web Services, which will provide modular access and enable scientists to build other classifiers optimized for a particular task.
VOClass will provide a comprehensive facility for source extraction, cross-matching, and classification. By combining classification with state-of-the-art VO tools, it will provide an environment for distributed analysis and knowledge discovery that is unprecedented in astronomy. It also has many other applications, both within astrophysics and in the context of the broader NASA mission. Development of VOClass will directly address the AISR program goal of providing "advances in information science and technology to increase life cycle effectiveness and efficiency of the Science Mission Directorate (SMD)". Beyond astrophysics, the techniques we propose to deliver have applications to telemetry data streams, quick-look observations, and verification of data from satellite images.