Robust Grid Computing using Peer-to-Peer Services

We propose to build a scalable infrastructure for executing grid
applications on a distributed set of resources. Such infrastructure must
be decentralized, robust, highly available, and secure, while efficiently
mapping applications to available resources throughout the system.
Fortunately, these are precisely the characteristics promised by new
techniques and approaches in peer-to-peer (P2P) systems research.

Our system is composed of a loosely coupled set of distributed,
cooperating users (peers). All peers contribute resources to an ad-hoc
resource pool, and all peers can submit jobs that are executed using
available resources. The overall system, from the point of view of a user,
can be thought of as a combination of a centralized, Condor-like Grid
system for submitting and running arbitrary jobs, and a system such as
SETI@home for farming out jobs from a server to be run on a (potentially
very large) collection of machines in a completely distributed environment.

Such a P2P grid system will have clear advantages over current
state-of-the-art platforms for so-called "desktop grid computing", which are based
on a client-server architecture. In these systems, a trusted server
supplies jobs to a set of client machines distributed across the Internet.
The server controls job placement, and also
collects all results. The server must, therefore, maintain state
for all jobs in the system, and the entire resource pool is rendered
useless if the server fails. The server can also become a performance bottleneck.

This proposal addresses the crucial problem of job placement in a
completely decentralized manner. We describe a
distributed algorithm for submitting jobs and efficiently matching them to
available resources. We also address issues related to security and the
forming of separate ad-hoc networks for different communities of users. We
use P2P techniques for both load balancing and resilience; from our
preliminary analysis and simulations, we expect our scheme to scale with system size
and to be robust against peer failures and departures.
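
One common way to picture decentralized job placement is a consistent-hashing ring of the kind used in many DHT-based P2P systems. The sketch below illustrates the idea under that assumption; it is not the placement algorithm of this proposal, and the names `Ring`, `add_peer`, `remove_peer`, `owner`, and `ring_position` are hypothetical. Each job is owned by the first peer at or after its hash position, so no central server holds the job table, and a departing peer only hands off the jobs it owned.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a ring position using SHA-1 (an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hashing ring: a job belongs to its successor peer."""

    def __init__(self) -> None:
        self._positions = []  # sorted ring positions of live peers
        self._peers = {}      # ring position -> peer name

    def add_peer(self, name: str) -> None:
        pos = ring_position(name)
        if pos not in self._peers:
            bisect.insort(self._positions, pos)
            self._peers[pos] = name

    def remove_peer(self, name: str) -> None:
        pos = ring_position(name)
        if pos in self._peers:
            self._positions.remove(pos)
            del self._peers[pos]

    def owner(self, job_id: str) -> str:
        """Return the first peer at or after the job's position, wrapping around."""
        if not self._positions:
            raise LookupError("no peers in the ring")
        idx = bisect.bisect_left(self._positions, ring_position(job_id))
        return self._peers[self._positions[idx % len(self._positions)]]


if __name__ == "__main__":
    ring = Ring()
    for name in ("peer-a", "peer-b", "peer-c"):
        ring.add_peer(name)
    jobs = [f"job-{i}" for i in range(100)]
    before = {job: ring.owner(job) for job in jobs}
    ring.remove_peer("peer-b")  # simulate a peer departure
    moved = [job for job in jobs if ring.owner(job) != before[job]]
    print(f"{len(moved)} of {len(jobs)} jobs changed owner")
```

In this sketch, the jobs that change owner after a departure are exactly those the departed peer held, which is the property that makes ring-based placement attractive for resilience. Matching jobs to resource requirements, as the proposal intends, would need more than a plain hash.
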
The proposed work is a collaborative effort between computer scientists and
astronomers. We will design, implement, and validate the
system using a set of problems in computational astronomy, including ones
that are directly relevant to the Deep Impact mission. Measuring
performance and observing the behavior of the algorithms in a
realistic environment will validate the usefulness of basic peer-to-peer
services in the grid context.

Applications suited to the system combine large computational
requirements with low I/O requirements. We have identified
several problems with these characteristics from the computational
astronomy domain, including several relevant to data analysis and
theoretical modeling for the Deep Impact mission. For example, the
detailed chemical network modeling
required to match the observations involves a large parameter space search
that is ideally suited for the proposed P2P scheme. Currently there are
insufficient computational resources available to the Deep Impact team to
carry out this search in a practical fashion using a dedicated brute-force
approach. The proposed system will make it possible to use idle cycles on any
machines made available by collaborators and other willing participants,
without significant administrative burden or security concerns.
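
A parameter space search fits this model because every grid point can be evaluated independently. As a minimal sketch, assuming the search is a Cartesian grid over a few model parameters, the helper below splits the grid into fixed-size chunks that could each be submitted as one job. The function name `make_jobs` and the parameter names `param_a`, `param_b`, and `param_c` are placeholders, not the actual parameters of the chemical network models.

```python
import itertools


def make_jobs(grid: dict, chunk_size: int):
    """Yield lists of parameter dicts; each list is one independent job."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    names = list(grid)
    # Lazily enumerate every combination of parameter values.
    points = itertools.product(*(grid[name] for name in names))
    while True:
        chunk = [dict(zip(names, point))
                 for point in itertools.islice(points, chunk_size)]
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Placeholder grid: 3 * 2 * 2 = 12 points in total.
    grid = {
        "param_a": [1, 2, 3],
        "param_b": [10, 20],
        "param_c": [0.1, 0.2],
    }
    for job_number, job in enumerate(make_jobs(grid, chunk_size=5)):
        print(job_number, len(job), job[0])
```

Each job carries only a short list of parameter values, which matches the high-compute, low-I/O profile described above; the chunk size trades scheduling overhead against the work lost when a peer departs mid-job.
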