Directed Exploration of Complex Systems
We will mature and broaden the applicability of a directed exploration capability designed to maximize scientific return from
large-scale numerical simulators. Numerical simulators provide scientists with a valuable tool for examining massively complex
systems that would be impossible to study otherwise. However, in many simulations, long run times make a detailed, exhaustive study
of the ``input parameter landscape'' infeasible. The key idea we are advancing combines support vector machine (SVM) classification and
regression techniques with active learning to choose which regions of input parameter space to simulate next, improving the speed and
efficiency with which a set of simulations can provide scientifically useful results. In addition, Markov Chain Monte Carlo (MCMC)
sampling techniques will be introduced to aid in the exploration and visualization of these potentially high-dimensional input
parameter spaces. In proof-of-concept studies to date (including an AISR-funded seed project), we have successfully applied our
approach to two large-scale scientific simulations: (1) a smoothed particle hydrodynamics (SPH)/N-body gravitational simulation of
asteroid collisions, used to narrow the plausible initial conditions (impactor size, velocity, etc.) for the production of asteroid satellites and for the
generation of specific asteroid families (Emma, Karin, Baptistina), and (2) magnetospheric inversions driven by in situ observational
data from the IMAGE spacecraft. Having demonstrated the basic feasibility and utility of the concept (speedups from 2-fold to 100-fold
with fidelity comparable to exhaustive grid sampling), we seek to mature and broaden the approach so that it can be applied
to a variety of scientific investigations. At the recent Second NASA Data Mining Workshop, numerous scientists voiced concerns that
data mining outputs are rarely linked back to the underlying physical system and processes. A major strength of this proposal is that it
directly links data mining with models of physical systems that are currently used in projects at the cutting edge of scientific
research, and aims to increase knowledge of how the underlying parameter spaces affect observable quantities. Another significant
strength is that we team domain experts (scientists who are expert in particular simulators) with computer scientists who can bring
cutting-edge research in computing techniques and technology to bear. This work is intended to be applicable to a broad variety of
physical systems modeled with simulators. Our approach requires no modifications to the internals of a particular simulator; the
method can be deployed simply by writing glue code that lets the directed exploration capability call the simulator and by
providing one or more grading scripts that process the raw output of the simulator into the scientific quantities of interest for the
investigation. The primary asteroid collision applications directly address NASA strategic subgoal 3.C.1: ``Progress in learning how the
Sun's family of planets and minor bodies originated and evolved.'' Directed exploration fits well with several of the Applied
Information Systems Program goals, including developing and deploying tools that amplify the productivity of scientific users of
high-end computing resources and increase science return, and enabling qualitatively new science through information science
and technology. Our collaboration has previously succeeded in bringing low-TRL concepts through AISR (and similar programs)
into deployment in NASA science research and analysis programs.
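
The directed exploration loop described above (an SVM classifier plus active learning, driving a simulator through glue code and a grading script) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the project: `run_and_grade` is a hypothetical stand-in for the glue code and grading script, the toy rule inside it replaces a real simulator, and the 21 x 21 candidate grid and corner seeding are arbitrary choices. A small linear SVM trained by Pegasos-style subgradient descent stands in for whatever SVM implementation is actually used, so that the example needs only the Python standard library.

```python
import random


def run_and_grade(params):
    """Hypothetical stand-in for glue code plus a grading script.

    A real version would launch the simulator and reduce its raw output to a
    label; this toy rule just marks one side of a line as "of interest".
    """
    x, y = params
    return 1 if x + y > 1.03 else -1


def decision_value(w, params):
    """Signed score of a linear SVM; its sign is the predicted class."""
    return w[0] * params[0] + w[1] * params[1] + w[2]


def train_linear_svm(points, labels, lam=0.01, epochs=100, seed=0):
    """Fit a linear soft-margin SVM by Pegasos-style subgradient descent."""
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]  # two weights plus a (regularized) bias term
    order = list(range(len(points)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            feats = (points[i][0], points[i][1], 1.0)
            violated = labels[i] * decision_value(w, points[i]) < 1.0
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if violated:  # hinge-loss subgradient step
                w = [wj + eta * labels[i] * fj for wj, fj in zip(w, feats)]
    return w


def directed_exploration(candidates, seeds, n_queries=26):
    """Pool-based active learning: always simulate the least certain point."""
    pool = [p for p in candidates if p not in seeds]
    points = list(seeds)
    labels = [run_and_grade(p) for p in points]
    for _ in range(n_queries):
        w = train_linear_svm(points, labels)
        # The candidate closest to the current boundary is the least certain.
        query = min(pool, key=lambda p: abs(decision_value(w, p)))
        pool.remove(query)
        points.append(query)
        labels.append(run_and_grade(query))
    return train_linear_svm(points, labels), points, labels


# Assumed toy setup: a 21 x 21 grid of candidate inputs, seeded at the corners.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
model, simulated, outcomes = directed_exploration(grid, corners)
```

In this toy setup the loop calls the stand-in simulator 30 times (4 seeds plus 26 queries), against 441 calls for the exhaustive grid. How closely the learned boundary matches the true one depends on the toy rule and on the assumed settings, so the sketch should not be read as evidence for the speedups quoted above.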