A High Performance Computing Framework for Cosmic Microwave Background Data Management
| The Cosmic Microwave Background (CMB) comprises the earliest possible photon-image of the Universe. Tiny anisotropies in the
CMB temperature and polarization encode a wealth of information, and offer a unique window onto cosmology and ultra-high energy
physics. However, the faintness of the CMB signals means that extremely large data sets must be gathered and reduced to detect them
above the instrument noise. Moreover, since we want to measure the correlations in the CMB signal across the celestial sphere, and
have to extract them from data whose noise is temporally correlated, simple divide-and-conquer approaches are insufficient and the
enormous CMB data sets have to be treated as single data objects. The size and complexity of CMB data sets have resulted in high
performance computing (HPC) becoming the cornerstone of CMB research. This proposal focuses on the challenges inherent in
managing and manipulating very large volumes of CMB data in an HPC environment, and in particular the need to optimize data flow,
from archival storage to spinning disk, and from disk to memory. These phases are treated separately, with the division reflecting both
their distinct issues and the cost-structure of national HPC facilities. Runtime data management is addressed with a data input
abstraction layer, while pre/post-processing data management is built around a database-driven set of put and fetch tools tuned to an
HPC environment. This project seeks to improve the design and operation of CMB missions, and the analysis of their data, by enabling
the efficient and effective use of limited national HPC resources, including forthcoming petascale systems. As such it supports NASA's
goals of advancing scientific knowledge and understanding the Universe. |