Section: New Results
Scaling Methods
Participants : Davit Martirosyan, Philippe Robert, Wen Sun.
Large Unreliable Stochastic Networks
The reliability of a large distributed system is studied. The framework is a system where files are stored on servers. When one of these servers breaks down, all files on it are lost. We assume that these files could be retrieved immediately and re-allocated among other servers while the failed server restarts but empty. It is a reasonable assumption since the failure rate is quit small comparing to an effective recovery mechanism. It is also assumed that each server is connected with a subset of servers in the system. When it breaks down, files on it are re-allocated on the servers that in this subset, following a given policy. Our main interest is the influence on the loads due to two allocation algorithms: the “ Random Choice” (RC) policy and the “Power of Choices” (PoC) policy.
The asymptotic behaviors of these two policies are investigated through mean field models. We have show that when the number of servers getting large, the load of each server can be approached by a linear (resp. non-linear) Markov process for RC (resp. PoC) policy. The equilibrium distributions of these asymptotic processes are also given.
For the case and all the servers are connected, see the paper [15]. This is a joint work with Inria/UPMC Team Regal. For a generalized case, there is a paper in preparation.
Bandwidth Allocation in Large Data Center
We are investigating a problem of efficient resource allocation in a large data center. In our model, the following is assumed. Each job that should be treated arrives to an M/M/C queue and is placed in it if the latter is not exhausted. Otherwise, it is sent to another queue for the possible implementation with the help of a certain canal, whose size is finite. A mean-field or the so called chaoticity result is established. Informally speaking, we show that the stochastic process that describes the evolution of our system converges to a non-random limit. We then study the stability properties of this limiting process and prove that it has a unique equilibrium that attracts exponentially all solutions that are issued from its small neighborhood. Moreover, we also show that if the size of the canal is infinite (i.e., the jobs go freely to another queue when not served), the uniqueness for the fixed point problem is not guaranteed and, depending on some physical parameters, one can have no solution, a unique solution or two solutions. This phenomenon is quite surprising and it seems that is was not observed before. We also investigate the stability of equilibrium points. Some techniques used in our proofs come from theories developed in the context of PDEs.