Section: Overall Objectives
Structure of the team
Our ambitious goal is to provide the algorithmic foundations of large-scale dynamic distributed systems, ranging from abstractions to real deployment. This is reflected in two major research themes: Distributed computing models and abstractions, and User-centric distributed systems and applications.
Models and abstractions for large-scale distributed computing
A highly relevant challenge (perhaps a Holy Grail) lies in the definition of a computation model appropriate to dynamic systems. This is a fundamental question. As an example, there are many peer-to-peer protocols, but none of them is formally defined with respect to an underlying computing model. Just as Lamport did for “static” systems, a model has to be defined for dynamic systems. This theoretical research is a necessary condition for understanding the behavior of these systems. Since the aim of a theory is to codify knowledge so that it can be transmitted, the definition of a realistic model for dynamic systems is unavoidable whatever the aim we have in mind, be it teaching, research or engineering.
Distributed computability
Among the fundamental theoretical results of distributed computing is a list of problems (e.g., consensus or non-blocking atomic commit) that have been proved to have no deterministic solution in asynchronous distributed systems prone to failures. For such a problem to become solvable in an asynchronous distributed system, that system has to be enriched with an appropriate oracle (also called a failure detector). We have been deeply involved in this research and have designed optimal consensus algorithms suited to different kinds of oracles. This line of research paves the way to ranking distributed computing problems according to the “power” of the additional oracle they require (think of “additional oracle” as “additional assumptions”). The ultimate goal would be the statement of a distributed computing hierarchy, ranking problems according to the minimal assumptions needed to solve them (much as Chomsky's hierarchy ranks languages according to the type of automaton needed to recognize them).
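To make the role of such an oracle concrete, the sketch below shows a heartbeat-based, eventually-perfect failure detector written in Python. It is an illustration only, not one of our algorithms: the class and method names are ours, and the code assumes that the network layer pushes heartbeats in through on_heartbeat() and polls check() periodically. A consensus algorithm layered on top would query the oracle instead of relying on explicit timing assumptions, which is exactly what makes the oracle a convenient way to encapsulate the “additional assumptions” mentioned above.

    import time

    class EventuallyPerfectFailureDetector:
        # Heartbeat-based sketch of an eventually-perfect failure detector.
        # A process suspected by mistake is unsuspected when its heartbeat
        # finally arrives and its timeout is doubled, so that eventually
        # only crashed processes remain suspected.
        def __init__(self, peers, initial_timeout=1.0):
            self.timeout = {p: initial_timeout for p in peers}
            self.last_heartbeat = {p: time.time() for p in peers}
            self.suspected = set()

        def on_heartbeat(self, p):
            # Called by the network layer whenever a heartbeat from p arrives.
            self.last_heartbeat[p] = time.time()
            if p in self.suspected:
                self.suspected.discard(p)   # false suspicion: be more patient next time
                self.timeout[p] *= 2

        def check(self):
            # Called periodically; returns the current set of suspected processes.
            now = time.time()
            for p, last in self.last_heartbeat.items():
                if now - last > self.timeout[p]:
                    self.suspected.add(p)
            return self.suspected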
Distributed computing abstractions
Major advances in sequential computing came from machine-independent data abstractions such as sets and records, control abstractions such as while and if, and modular constructs such as functions and procedures. Today, we can no longer envisage programming without these abstractions. In the “static” distributed computing field, some abstractions have been promoted and proved useful: reliable broadcast, consensus and interactive consistency are examples. These abstractions have well-defined specifications, a large body of theoretical results (mainly decidability and lower bounds), and numerous implementations. No such equivalent exists yet for dynamic distributed systems.
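As an illustration of what such an abstraction looks like once specified and implemented, the following Python sketch implements an eager reliable broadcast for a crash-prone “static” system. The names and the send/deliver callbacks are illustrative assumptions, and reliable point-to-point channels between correct processes are taken for granted. The point is that the abstraction has a precise interface and a precise guarantee, namely that a message delivered by any correct process is eventually delivered by all correct processes, even if the original sender crashes mid-broadcast; it is this kind of precisely specified building block that is still missing for dynamic systems.

    class ReliableBroadcast:
        # Eager reliable broadcast: every process relays a message the first
        # time it receives it, so delivery never depends on the sender alone.
        def __init__(self, self_id, peers, send, deliver):
            self.self_id = self_id
            self.peers = peers          # identifiers of all processes (including self)
            self.send = send            # send(dest, msg): point-to-point send primitive
            self.deliver = deliver      # deliver(sender, payload): upcall to the application
            self.delivered = set()      # messages already delivered (payloads must be hashable)

        def broadcast(self, payload):
            # Deliver locally and relay to everybody.
            self.receive((self.self_id, payload))

        def receive(self, msg):
            # Called by the network layer on every incoming message.
            if msg not in self.delivered:
                self.delivered.add(msg)
                sender, payload = msg
                self.deliver(sender, payload)
                for p in self.peers:    # relay on first delivery
                    self.send(p, msg)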
User-centric fully-decentralized architectures
The Web is now centered around users. A variety of applications, ranging from social networks to recommendation systems, have changed the way users interact with information. Websites that used to resemble read-only data repositories have turned into fully read-write platforms in which users play a major role. Important news is continuously debated on Twitter, and Facebook has become a primary means for political propaganda and protest. As a result, the amount of information available on the Web grows by the second.
Existing paradigms for information retrieval have a hard time keeping up. Only a small portion of the Web can effectively be indexed by search engines, and much of this information is relevant only to relatively small communities. This is particularly true of short-lived information such as news. Real-time indexing is almost impossible, and even large companies resort to clustering users in an effort to provide seemingly personalized services, without being able to harness the entire wealth of available information.
Peer-to-peer meets data mining.
To tackle the challenges posed by the Web 2.0, we are combining our expertise in large-scale peer-to-peer overlays with techniques from the data-mining community. Historically, ASAP has devoted significant effort to the design of peer-to-peer systems and overlays that directly take application characteristics into account. Within the context of Anne-Marie Kermarrec's Gossple ERC grant, this task has evolved into the design of overlays and systems that gather not only network devices but also users and application objects. These themselves become peers (although obviously hosted on a physical computing entity), and their data directly influences the overlay links through similarity metrics or classification techniques.
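The toy Python simulation below gives a flavor of the resulting similarity-driven overlays. It is only a cartoon of a Gossple-like protocol, with names of our own choosing and a Jaccard similarity over user interest profiles standing in for the richer metrics and classification techniques used in practice: each peer keeps the k most similar profiles it has seen so far and refines this neighbourhood by gossiping with its current neighbours.

    import random

    def similarity(a, b):
        # Jaccard similarity between two users' interest profiles (sets of item ids).
        return len(a & b) / len(a | b) if (a or b) else 0.0

    class Peer:
        # One peer of a similarity-driven overlay: it keeps the k profiles most
        # similar to its own and refines this view by gossiping with neighbours.
        def __init__(self, pid, profile, k=5):
            self.pid, self.profile, self.k = pid, profile, k
            self.view = {}              # pid -> profile of the current neighbours

        def update_view(self, candidates):
            # Keep the k candidates whose profiles are most similar to ours.
            candidates = {p: prof for p, prof in candidates.items() if p != self.pid}
            ranked = sorted(candidates.items(),
                            key=lambda kv: similarity(self.profile, kv[1]),
                            reverse=True)
            self.view = dict(ranked[:self.k])

        def gossip_with(self, other):
            # Both peers merge each other's view (and profile) into their own.
            self.update_view({**self.view, **other.view, other.pid: other.profile})
            other.update_view({**other.view, **self.view, self.pid: self.profile})

    # Toy simulation: 20 peers with random interest profiles, bootstrapped with
    # random neighbours, gradually converge toward interest-based neighbourhoods.
    peers = [Peer(i, set(random.sample(range(50), 8))) for i in range(20)]
    for p in peers:
        others = random.sample([q for q in peers if q is not p], 5)
        p.update_view({q.pid: q.profile for q in others})
    for _ in range(10):
        for p in peers:
            partner_id = random.choice(list(p.view))
            p.gossip_with(peers[partner_id])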
Focus on real-world applications.
Beyond the definition of models and peer-to-peer architectures, our ambitious goal is to target real-world applications. This requires a focus on technical aspects such as personalization and privacy, as well as significant engineering effort to make applications usable in a real environment. This has been reflected in several of the team's activities and has resulted, for example, in software prototypes and platforms that facilitate the design of distributed applications. Our goal in this context is to involve users: evaluating technologies for the social web is only possible in the presence of the Web's social components, the users themselves.