Models of computation: in distributed setting
Positive - upper bounds
- Efficiently solved with distributed algorithms
Negative - lower bounds
- Can't efficiently solvable with distributed algorithms
Distributed Setting
The distributed network is modeled as a graph.
Nodes - computers
Edges - communication links
Communication between computers: message passing
Distributed VS Centralized
| Distributed Setting | Centralized Setting | |
|---|---|---|
| Interested in: | What can be solved efficiently in a large network of computers | Wat can be solved efficiently with a computer |
| Focus | What happens among computers | What happens inside a computer |
| Key Resources | Communication | Computation |
In distributed setting, we can even assume that each computer is very powerful (solve some NP-hard problems very fast)
Distributed Algorithm
Each node run the same distributed algorithm.
- Internal computation
- Sends message to neighbors
- Receives messages from neighbors
Synchronous and fault-free:
- machines do not fail, messages always arrive on time
- time proceeds at the same speed for all machines
2-coloring a Path
The graph is a path
2-vertex coloring:
- each nodes outputs either black or red
- the node of a color must be different from that of its neighbors
Solution
TRIVIAL
The node with no incoming edges fixes its color to be black, then it sends the message "black" to its successor
When a node receives "black", it fixes to be "red", and sends "red" to its successor.
When a node receives "red", it fixes to be "black", and sends "black" to its successor.
Deterministic Complexity
On a specific graph \(G\): The deterministic complexity \(T_A(G)\) of an algorithm \(A\) on \(G\) is the number of rounds until the last node terminates.
On a graph class \(\mathcal G\): (\(\mathcal G_n\) denotes the class of all \(n\)-node graphs) The deterministic complexity of an algorithm \(A\) on \(\mathcal G\) is the function $$ T_A(n)=\max_{G\in\mathcal G_n}T_A(G). $$
Complexity of a problem: The deterministic complexity of a problem \(P\) on some graph class \(\mathcal G\) is the function $$ T(n)=\min_{A\text{ solving }P}T_A(n), $$ i.e., the complexity of the optimal deterministic algorithm for that problem on \(\mathcal G\).
Classes of Graphs
Paths
Cycles
Trees
General Graphs
Other Parameters
Usually the complexity of algorithms and problems is the function of \(n\).
Some times it is the function of \(n\) and the maximum degree \(\Delta\).
Complexity of 2-coloring
The algorithm runs in \(n-1\) rounds, thus \(O(n)\).
It's possible to prove the problem is also \(\Omega(n)\), thus \(\Theta(n)\).
Unique Identifier
To break symmetry, we introduce unique identifier on each node.
LOCAL Model
Synchronous message passing
Unique ID from \(1\) to \(n^c\).
Unbounded internal computation
Unbounded size of messages
Initial knowledge of a node \(v\):
- \(ID(v)\): its own identifier
- \(deg(v)\): its own degree
- \(n\): the total number of nodes
- \(\Delta\): the maximum degree in the graph
CONGEST Model
Synchronous message passing
Unique ID from \(1\) to \(n^c\).
Unbounded internal computation
Bounded size of messages: \(O(\log n)\) bits
Vertex Coloring
Objective: Assign a color to each node s.t.
- Neighboring nodes get different colors
- The total number of colors is as small as possible
3-vertex Coloring on Cycles
Goal: Propose coloring of the nodes of a cycle using 3 colors
Algorithm: non-colored local minima choose a color that does not conflict with the neighbors.
Local minima is the node with smallest ID among the neighbors.
Observation: Local minima form an independent set
Running Time
Worst-case: sorted IDs, each round we have only one local minima, \(\Theta(n)\).
Notice: If we start from a \(C=O(1)\) coloring, then the runtime is \(O(1)\).
Using Randomness
Each non-colored node chooses uniformly at random in color in \(\{0,1,2\}\backslash X\), where \(X\) is the set of colors chosen by the neighbors that terminated.
Exchange the chosen color with neighbors. If node \(v\) gets a color that is different from the one chosen by the neighbors, then \(v\) keeps the color as its final one.
Runtime: with \(O(\log n)\) rounds, all nodes terminate with probability at least \(1-1/n\).
Let \(c\) be the color that \(v\) chooses uniformly at random, \(u,w\) are the neighbors of \(v\). $$ \Pr[v\text{ terminates}]=\Pr[u\text{ not }c]\cdot\Pr[w\text{ not c}]\geq\frac{1}{2}\cdot\frac{1}{2}=\frac{1}{4}, $$ Why \(1/2\) here? We want lower bound.
hence, $$ \Pr[v\text{ not terminates}]\leq1-\frac{1}{4}=\frac{3}{4}. $$ After \(T\) rounds, the probability of not terminating is \(\leq(3/4)^T\).
For \(n\) vertices, the probability is \(n(3/4)^T\leq 1/n\), which implies \(O(\log n)\) rounds.
Cole-Vishkin Algorithm
Fast 3-coloring of directed cycles
Main Ideas
- Consider each ID as a color. Colors are numbers. poly \(n\) colors in total.
- Design a color reduction scheme, that reduces the colors from poly \(n\) to \(6\) (fast).
- Use the local minima algorithm to reduce from \(6\) to \(3\). (\(O(1)\))
Color Reduction
Consider node \(u\) and its successor \(v\) with colors \(c_u\) and \(c_v\).
- \(x_u\) is the binary representation of \(c_u\).
- \(x_v\) is the binary representation of \(c_v\).
Define
- \(i_u\) to be the index of the first bit where \(x_u\) and \(x_v\) differ
- \(b_u\in\{0,1\}\) is the bit of \(x_u\) in position \(i_u\).
The new color of \(u\) is \(c'_u=2\cdot i_u+b_u\)
Claim: for any two neighbors \(u,v\), if \(c_u\neq c_v\) then \(c'_u\neq c'_v\).
Running Time Analysis
Each round, the color is reduced from \(c\) to \(\lceil\log_2c\rceil\).
So there are \(\log^*c\) rounds to reduce the color number to \(6\).
Total runtime: \(O(\log^*n)\).