Résumé

Parallel iterative applications often suffer from load imbalance, one of the most critical performance degradation factors. Hence, load balancing techniques are used to distribute the workload evenly to maximize performance. A key challenge is to know when to use load balancing techniques. In general, this is done through load balancing criteria, which trigger load balancing based on runtime application data and/or user-defined information. In the first part of this paper, we introduce a novel, automatic load balancing criterion derived from a simple mathematical model. In the second part, we propose a branch-and-bound algorithm to find the load balancing iterations that lead to the optimal application performance. This algorithm finds the optimal load balancing scenario in polynomial time while, to the best of our knowledge, it has never been addressed in less than an exponential time. Finally, we compare the performance of the scenarios produced by state-of-the-art load balancing criteria relative to the optimal load balancing scenario in synthetic benchmarks and parallel N-body simulations. In the synthetic benchmarks, we observe that the proposed criterion outperforms the other automatic criteria. In the numerical experiments, we show that our new criterion is, on average, 4.9% faster than state-of-the-art load balancing criteria and can outperform them by up to 17.6%. Moreover, we see in the numerical study that the state-of-the-art automatic criteria are at worst 26.43% slower than the optimum and at best 10% slower.

Détails

Actions

PDF