This page contains theoretical decription of local load balancing methods used in this simulator.
These methods can be divided into two basic approaches:
These approaches have one property in common, the strict locality of the communication and the control.
Assumptions:
- tasks are indivisible and identical,
- task numbers are integers.
Due to the assumption that tasks (i.e. load) are indivisible and task numbers are integers, it is recommended
to simulate load balancing process for many tasks.
1. The diffusion method
This section contains description of the diffusion method and two algorithms used in the simulator.
The diffusion algorithms:
The diffusion method is an iterative algorithm. In every step, a fixed fraction of the load
difference between two neighboring processors is exchanged. When such local operations are
used, the load distribution should converge to the globally equal, flat load distribution.
The efficiency of the diffusion method depends on a diffusion parameter
which determines the size of the transferred load fraction.
Notation:
-
- diffusion parameter,
- N - number of processors,
- i - processor number, i=1..N,
- R - vector of current load processors,
,
-
- current load of processor i,
-
- a set of directly neighbouring processors,
-
- a load transferred from processor i to j.
In every step, a load
is transferred from processor i
to every neighbour
with
For
, the load is transferred in the inverse direction i.e.
from processor j to i.
Every change of state of the
processor load
by synchronous load balancing can be described
by the following transition equation:
, where t
- the current step in time, t +1 - the next time moment |
(2) |
It is possible, that for big value of alfa parameter and big number of directly neighbouring
processors, after one step of load balancing, one of processors will has negative load, according to formula (2).
In this situation we propose two modifications of diffusion approach:
- 1.1. Diffusion (1)
- 1.2. Diffusion (2)
1.1. Diffusion (1)
In the Diffusion (1) algorithm, additional parameter
(parameter
modificator) is usued, where
.
In this connection, equation (1) changes to equation (3):
1.2. Diffusion (2)
The second modification of the diffusion algorithm consists in asynchronous load exchanges.
Each load exchange between the two neighbouring processors is performed independently of teh other
exchanges, and immediately modifies the loads of the two communicating processors.
2. The Nearest Neighbour Averaging
This section contains description of the nearest neighbour averaging method and two algorithms used in the simulator.
[Henrich D., „The Liquid Model Load Balancing Method”, Paper accepted for the
Journal of Parallel Algorithms and Applications, Special Issue on Algorithms for
Enhanced Mesh Architectures,
http://www.ubka.uni-karlsruhe.de/cgi-bin/psgunzip/1994/informatik/1/1.pdf
]
.
The nearest neighbour averaging algorithms:
The nearest-neighbor-averaging (NNA) completely local load balancing method. The idea is to change the load of each
processor such that it is equal to the mean load of the processor and its neighbors.
Notation:
- N - number of processors,
- i - processor number, i=1..N,
- R - vector of current load processors,
,
-
- current load of processor i,
-
- a set of directly neighbouring processors,
-
- a load transferred from processor i to j.
After one load balanvcing step, at time t+1, the processor i should have a load of
:
The NNA can be realized in two different ways:
- 2.1. The Nearest Neighbour Averaging (asyn)
- 2.2. The Nearest Neighbour Averaging (syn)
2.1. The Nearest Neighbour Averaging (asyn)
The Nearest Neighbour Averaging (asyn) is an asynchronous variant of NNA. In this variant, when a processor is highly
loaded then it transfers a portion of its load to all deficient neighbours. The amount of transfered load is proportional
to the difference of the mean load and the load of the neighbour.
Let the deficiency of each neighbouring processor
for the processor i be given by
and the total deficiency by
. Then, the asynchronous
NNA performs a load transfer
from processor i to each of its neighbours
with:
2.2. The Nearest Neighbour Averaging (syn)
In the synchronous variant of NNA, the load of every processor is divided into
portions of the same size. The processor itself and all of its neighbors receive one load portion.
The execution of each load balancing step satisfies the transition equation in (4). Let the processor i have a load
and a set of neighbors
. Then, by the synchronous NNA, a load
transfer
from processor i to each of its neighbors
is performed with: