
#### 5.1.4 Finding a shortest path from a single source in a weighted graph

The problem we discuss here is an everyday one. Say you are driving your car in an unknown town and want to find the shortest route from your current position to some restaurant other people have recommended. A mobile phone application can tell you the answer, but how? We will consider the mathematics behind this question. Note that if every street block were of the same length, then the BFS algorithm could find us the optimal route. However, this is usually not the case, and we need a more sophisticated, although somewhat similar, method.

Assume that we are given a graph G = (V, E) in which every edge vw has a weight l(v, w) ≥ 0. We extend l to paths in the natural way: if P is a path, then

l(P) = ∑_{e ∈ E(P)} l(e).

One more assumption for convenience: we set l(v, w) = ∞ for every vw ∉ E.

The following algorithm for finding the shortest paths from a source vertex u ∈ V to every other vertex was discovered by Dijkstra.

Dijkstra's algorithm

Input: an edge-weighted graph G = (V, E) and a starting vertex u ∈ V.

Step 1. Initialization: let S = {u}, t(u) = 0, and for every v ∈ V − S let t(v) = l(u, v)

Step 2. If S ≠ V

Step 2.a Choose a vertex v ∈ V − S such that t(v) = min{t(w) : w ∈ V − S}

Step 2.b If t(v) = ∞, then continue with Step 4

Step 2.c Let S = S + v

Step 2.d Let t(w) = min{t(w), t(v) + l(v, w)} for every w ∈ N(v) ∩ (V − S), and continue with Step 2

Step 3. Let l(u, v) = t(v) for every v ∈ V

Step 4. STOP

Let us give some explanation for the algorithm. First, the t(v) values are tentative distances from u to v. As the algorithm proceeds, in every step we determine the length of the shortest path from u to some vertex v, where v is the closest to u among the vertices in V − S. At the end the distances are given by the l(u, ·) values.
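The steps above can be sketched in Python as follows. The adjacency-list representation (adj[v] is a list of (neighbor, length) pairs) and the function name are our own choices, not part of the text; Step 1 is folded into the loop, since starting from S = ∅ with t(u) = 0, the first iteration selects u and performs the initial relaxation.

```python
import math

def dijkstra(n, adj, u):
    """Dijkstra's algorithm as in Steps 1-4: vertices are 0..n-1,
    adj[v] is a list of (w, length) pairs with length >= 0."""
    t = [math.inf] * n               # tentative distances t(v)
    t[u] = 0
    S = set()                        # the settled vertices
    while len(S) < n:                # Step 2: while S != V
        # Step 2.a: pick v in V - S with minimum t(v)
        v = min((x for x in range(n) if x not in S), key=lambda x: t[x])
        if t[v] == math.inf:         # Step 2.b: remaining vertices unreachable
            break
        S.add(v)                     # Step 2.c
        # Step 2.d: relax the edges leaving v
        for w, length in adj[v]:
            if w not in S and t[v] + length < t[w]:
                t[w] = t[v] + length
    return t                         # Step 3: t(v) = l(u, v)
```

Selecting the minimum by scanning V − S takes O(n) time per round; a priority queue would speed this up, but the scan matches the step-by-step description most directly.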

There are two reasons why the algorithm may stop: either S = V, or the algorithm recognizes that the graph is disconnected.

Theorem 5.6 (Dijkstra (1959)). Given an edge-weighted graph G and a starting vertex u, Dijkstra's algorithm correctly computes the l(u, v) values for every v ∈ V.

Proof. We will show that at every point in time t(v) = l(u, v) for every v ∈ S; moreover, for w ∉ S the value of t(w) is the length of the shortest path from u to w having internal vertices only from S. We do this by induction on the cardinality of S. When |S| = 1, then clearly t(u) = l(u, u) = 0.

Assume now that the theorem holds for |S| ≤ k for some natural number k ≥ 1.

First we show that if a vertex v ∈ V − S is chosen in Step 2.a, then t(v) equals the distance from u to v.

Assume that this is not the case: there is a vertex v, chosen at Step 2.a, such that t(v) is larger than the distance of u and v. Consider a shortest path P from u to v. By assumption l(P) < t(v). But then, by the induction hypothesis, there must be a vertex on P which does not belong to S. Let w be the first vertex from V − S on P. The portion of P from u to w has internal vertices only from S, so its length is at least t(w); moreover, t(w) ≥ t(v) by the choice of v in Step 2.a. Since the portion of P from w to v has non-negative length, we get l(P) ≥ t(w) ≥ t(v), contradicting the assumption.

Next we have to show that for every w ∈ V − S the value of t(w) is the length of the shortest path from u when we are only allowed to use internal vertices from S.

This holds for |S| ≤ k. After increasing S by a vertex v in Step 2.c we may change the value of t(w), but we only change it when t(v) + l(v, w) < t(w). So before the change t(w) was the smallest distance using internal vertices of S − v by the induction hypothesis, and after the update it is the smallest distance using internal vertices from S − v and v. This finishes the proof.

Remark 5.7. It is very easy to obtain Dijkstra's algorithm for directed graphs: the only change is that in Step 2.d we use N^{+}(v) instead of N(v).

Finally we mention that there is an algorithm that can work with negative edge weights as well, the Bellman–Ford algorithm. However, when there is a cycle with negative total weight, minimum weight walks do not exist between certain pairs of vertices, as one can wind around the cycle an arbitrary number of times, always decreasing the total weight.
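As an illustration of the remark (the details below are not from the text), a minimal Bellman–Ford sketch can both compute the distances and detect a reachable negative cycle; the edge-triple representation and function name are our own assumptions.

```python
import math

def bellman_ford(n, edges, u):
    """Bellman-Ford: edges is a list of directed (v, w, length) triples,
    where length may be negative. Returns (distances, has_negative_cycle)."""
    t = [math.inf] * n
    t[u] = 0
    # n - 1 rounds of relaxing every edge suffice when there is no
    # negative cycle: a shortest path has at most n - 1 edges.
    for _ in range(n - 1):
        for v, w, length in edges:
            if t[v] + length < t[w]:
                t[w] = t[v] + length
    # One more round: any further improvement certifies a negative cycle,
    # exactly the situation where no minimum weight walk exists.
    negative_cycle = any(t[v] + length < t[w] for v, w, length in edges)
    return t, negative_cycle
```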

### 5.2 The minimum spanning tree problem

In many real life problems one faces the question of connecting n points with “wires” of some length such that a signal can travel between any two points through wires, and the total wire length is minimal. In one of the earliest appearances of this problem, Otakar Boruvka, an engineer, formulated and solved it when devising the electrical network in some portion of Czechoslovakia. Boruvka published his result in 1926. Since then many solutions have been found; at present the fastest is due to Bernard Chazelle. Chazelle's algorithm is very sophisticated and complex, and theoretically very important, but perhaps not the one to be presented here. Instead we will discuss the algorithm by Joseph Bernard Kruskal below. The latter has a very clear formulation, and can be analyzed easily.

It is clear from the question that an optimal solution must be a tree on n vertices with minimum total wire length. First, it must contain every vertex, and second, if it had a cycle, then we could make the total length shorter by eliminating the longest edge of the cycle, still keeping the network connected.

Definition. Let G = (V, E) be a simple graph such that for every e ∈ E we have an edge weight w(e). We extend the edge weights to subgraphs of G in the natural way: if H ⊆ G is any subgraph, then

w(H) = ∑_{e ∈ E(H)} w(e).

In the minimum spanning tree problem the goal is to find a spanning tree T of minimum total weight w(T).

Kruskal’s algorithm (1956)

Assume that the edges of G are sorted according to their weight, ties broken arbitrarily: w(e_{1}) ≤ w(e_{2}) ≤ … ≤ w(e_{m}), where E = {e_{1}, e_{2}, …, e_{m}}.

Step 1. Initialization: let T be the empty graph, and i = 1

Step 2. If T + e_{i} is acyclic, then let T = T + e_{i}

Step 3. If i < m,then let i=i+ 1, and continue with Step 2.

Step 4. Output: minimum spanning tree T
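The steps above can be sketched in Python. The text only says “if T + e_{i} is acyclic”; the union–find structure below is one standard way to perform that test, and, like the edge-triple representation, it is our choice rather than part of the text.

```python
def kruskal(n, edges):
    """Kruskal's algorithm: vertices are 0..n-1, edges are
    (weight, a, b) triples. Returns the edges of the tree T."""
    parent = list(range(n))          # union-find: one root per component

    def find(x):
        """Root of x's component, with path halving."""
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    T = []
    for weight, a, b in sorted(edges):   # w(e1) <= w(e2) <= ... <= w(em)
        ra, rb = find(a), find(b)
        if ra != rb:                     # T + e_i is acyclic
            parent[ra] = rb              # merge the two components
            T.append((weight, a, b))
    return T
```

Adding an edge creates a cycle exactly when its two endpoints already lie in the same component, which is what the root comparison checks.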

We are going to prove that the above algorithm finds the optimal solution.

Theorem 5.8. Given a connected edge-weighted graph G = (V, E), Kruskal's algorithm finds a minimum weight spanning tree of it.

Proof. First we have to show that the output T is indeed a spanning tree of G. It is clear from the way we build T that it is acyclic. Assume that T is not connected: it has at least two components, C_{1} and C_{2}. Since G is connected, there is at least one edge e_{k} of G that goes between C_{1} and C_{2}. But then we must have added e_{k} to T, as T + e_{k} is acyclic! This shows that T is a spanning tree of G.

Now let us assume that G has another spanning tree T′ for which w(T′) < w(T).

Denote by e_{l} the first edge according to the ordering of edges that belongs to T′ but not to T. Adding e_{l} to T we create a cycle C. Since T′ is a tree, we must have at least one edge e_{t} ∈ C which belongs to T but not to T′. Moreover, for every edge e_{i} of C we have w(e_{i}) ≤ w(e_{l}), since otherwise we would have added e_{l} to T. Let T_{1} be the tree we obtain from T by adding e_{l} to it and deleting e_{t} from it. Clearly, w(T) ≤ w(T_{1}), and T_{1} has one more edge in common with T′ than T has.

If T′ has another edge e_{s} that does not belong to T, then we repeat the above procedure: add e_{s} to T_{1}, and delete an edge e_{q} that does not belong to T′. Call the new tree we obtain this way T_{2}. Then T_{2} is “closer” to T′ than T_{1} was: it has one more edge in common with it. We also have that w(T_{2}) ≥ w(T_{1}).

Repeating the above procedure at most n − 1 times we get a sequence of trees T, T_{1}, T_{2}, …, T_{p}, T′ such that w(T) ≤ w(T_{1}), in general w(T_{j}) ≤ w(T_{j+1}), and finally w(T_{p}) ≤ w(T′). This shows that T must be a minimum weight spanning tree.

Let us make a remark on the time complexity of Kruskal's algorithm. If the edges are given as a sequence already sorted by weight, Kruskal's algorithm examines each edge once and is in fact a greedy algorithm; with an appropriate data structure for the acyclicity test this takes nearly O(m) steps. Sorting the edges requires Θ(m log m) steps, and this is the most time-demanding part of the algorithm.

And a final remark: if one wants to find any spanning tree of a connected undirected graph, then one can use the BFS and DFS algorithms, as well as Kruskal's algorithm. For the latter one can pretend that the edges of the graph have weights, for example, every edge can have unit weight.

## Chapter 6 Matchings

### 6.1 Definitions

Definition. A set of edges M ⊆ E(G) in a multigraph G is called a matching, if no two edges of M have a common endpoint, and there are no loops in M.

Figure 6.1: A matching

For any S ⊆ E(G), the set of endpoints of edges in S is denoted by V(S), i.e.

V(S) = {v ∈ V(G) : v is incident to some edge of S}.

We note that M ⊆E(G) is a matching in G if and only if |V(M)|= 2|M|.

We review some terminology. For a matching M, we say that M covers (precisely) the vertices of V(M). We also call the vertices in V(M) the matched vertices of G, and the vertices in V(G) \ V(M) the unmatched vertices. Moreover, if uv ∈ M, then we say that M matches u to v.
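The definitions translate directly into code. The two helpers below are our own illustrative names; they represent a set of edges as vertex pairs and use the characterization |V(M)| = 2|M| from above, which also rules out loops.

```python
def is_matching(edges):
    """A set of edges (as vertex pairs) is a matching iff no vertex
    repeats among the endpoints and no edge is a loop, i.e.
    |V(M)| = 2|M|."""
    endpoints = [v for e in edges for v in e]
    return len(set(endpoints)) == 2 * len(edges)

def covered_vertices(edges):
    """V(M): the set of vertices covered (matched) by M."""
    return {v for e in edges for v in e}
```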

Definition. A maximum matching (or maximum-cardinality matching) is a matching that contains the largest possible number of edges. The size of a maximum matching of G is denoted by ν(G), i.e.

ν(G) = max{|M|:M is a matching in G}.

(Recall that |M|, the size of M, is the number of edges in M by definition.) The parameter ν(G) is sometimes called the matching number of G.

A perfect matching in G is a matching that covers all vertices of G.

Figure 6.2: A perfect matching in the Petersen graph

Of course, not all graphs have a perfect matching. For example, graphs with an odd number of vertices do not have a perfect matching, because every matching covers an even number of vertices. Among other things, the next sections present efficient algorithms and powerful theorems to decide whether a graph has a perfect matching.

It is evident that

ν(G) ≤ |V(G)|/2,    (6.1)

as

2|M| = |V(M)| ≤ |V(G)|, and so |M| ≤ |V(G)|/2

for any matching M of G. Equality occurs in (6.1) if and only if G has a perfect matching.
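For small graphs, ν(G) and the bound (6.1) can be checked by brute force; the function below, with its name and edge-list representation, is our own illustrative sketch, exponential in the number of edges and intended only for tiny examples (the efficient algorithms are the subject of the next sections).

```python
from itertools import combinations

def matching_number(edges):
    """nu(G): the size of a maximum matching, found by trying all
    subsets of the edge set from largest to smallest."""
    def is_matching(M):
        # |V(M)| = 2|M| characterizes matchings (no shared endpoints, no loops)
        endpoints = [v for e in M for v in e]
        return len(set(endpoints)) == 2 * len(M)

    for k in range(len(edges), 0, -1):
        if any(is_matching(M) for M in combinations(edges, k)):
            return k
    return 0
```

On a 4-cycle the result is 2 = |V(G)|/2, so equality holds in (6.1), matching the fact that C_4 has a perfect matching; on a path with four vertices ν = 2 as well, while on a triangle ν = 1 < 3/2.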