Simplifying Development and Deployment of High-Performance, Reliable Distributed Systems
Distributed systems form the foundation of our society’s infrastructure. Unfortunately, they suffer from a number of problems:
they are time-consuming to develop because it is difficult for the programmer to envision all possible deployment environments and design adaptation mechanisms that will achieve high performance in all scenarios;
their code is complex due to the numerous outcomes that have to be accounted for at development time and the need to reimplement state and network models;
they are unreliable because of the difficulties of programming a system that runs over an asynchronous network and handles all possible failure scenarios.
If left unchecked, these problems will continue to plague existing systems and hinder the development of a new generation of distributed services, notably those emerging in cloud computing.
We propose a radically new approach to simplifying development and deployment of high-performance, reliable distributed systems. The key insight is to create a new programming model and architecture that leverages the increases in per-node computational power, bandwidth, and storage. Instead of resolving difficult deployment choices at coding time, the programmer merely specifies the choices and the objectives that should be satisfied. The PROPHET runtime then resolves the choices during live execution so as to maximize the objectives. To accomplish this task, the runtime uses a combination of state-space exploration, simulation, behavior prediction, performance modeling, and program steering.
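To make the idea of exposing choices and objectives more concrete, the following is a minimal illustrative sketch in Python. It is not the PROPHET API: the names `Choice`, `resolve`, `make_predictor`, and `objective`, as well as the replica names and round-trip times, are hypothetical and chosen only for illustration. The sketch assumes that the runtime can predict the outcome of each option (here a toy latency lookup stands in for simulation and performance modeling) and then picks the option that maximizes the stated objective.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class Choice:
    """A deployment decision the programmer leaves open (hypothetical type)."""
    name: str
    options: Sequence[str]


Outcome = Dict[str, float]


def resolve(choice: Choice,
            predict: Callable[[str], Outcome],
            objective: Callable[[Outcome], float]) -> str:
    """Return the option whose predicted outcome scores highest."""
    return max(choice.options, key=lambda option: objective(predict(option)))


def make_predictor(rtt_ms: Dict[str, float]) -> Callable[[str], Outcome]:
    """Toy stand-in for simulation or performance modeling (assumption)."""
    def predict(option: str) -> Outcome:
        return {"latency_ms": rtt_ms[option]}
    return predict


def objective(outcome: Outcome) -> float:
    """Objective to maximize: lower predicted latency is better."""
    return -outcome["latency_ms"]


# The programmer states the choice and the objective, not the answer.
replica_choice = Choice("read_replica", ("eu-west", "us-east", "ap-south"))

# Hypothetical measurements gathered while the system is running.
observed_rtt = {"eu-west": 18.0, "us-east": 95.0, "ap-south": 140.0}

chosen = resolve(replica_choice, make_predictor(observed_rtt), objective)
print(chosen)
```

In this sketch the decision is re-evaluated simply by calling `resolve` again with fresh measurements, so the same program can pick a different replica when conditions change, without any change to the application code.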
We gratefully acknowledge funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement 259110. This grant provides 1.45 million EUR over five years (2011-2016).