To find the connected components, start with a set unvisited of all vertices. Remove a vertex from the set and add it to a new set representing the next connected component. Perform a depth-first or breadth-first search from that vertex. All reachable nodes are removed from unvisited and added to the current connected component, until no more vertices can be reached. Repeat until unvisited is empty. Run time is O(|V| + |E|).
Removal from the unvisited set is O(1). The set can be represented as a doubly linked list or as an array/vector. If the set is represented as an array, each vertex needs a field holding its array index; to remove a vertex, swap it with the last element (updating that element's index field) and shrink the array by one.
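The procedure above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `connected_components`, the adjacency-dictionary input, and the `index_of` dictionary (standing in for the per-vertex index field) are all assumptions made for the example.

```python
from collections import deque


def connected_components(adjacency):
    """Return a list of vertex sets, one per connected component.

    `adjacency` maps each vertex to an iterable of its neighbours
    (undirected graph: every edge is listed in both directions).
    """
    # The unvisited set is an array plus an index map, so removal is O(1).
    unvisited = list(adjacency)
    index_of = {v: i for i, v in enumerate(unvisited)}

    def remove(v):
        # Overwrite v's slot with the last element, then shrink the array.
        i = index_of.pop(v)
        last = unvisited.pop()
        if last != v:
            unvisited[i] = last
            index_of[last] = i

    components = []
    while unvisited:
        start = unvisited[-1]
        remove(start)
        component = {start}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            u = queue.popleft()
            for v in adjacency[u]:
                if v in index_of:  # v is still unvisited
                    remove(v)
                    component.add(v)
                    queue.append(v)
        components.append(component)
    return components
```

Each vertex is removed from the unvisited set exactly once and each adjacency list is scanned once, which matches the O(|V| + |E|) bound.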
We have only scratched the surface of graph theory. There are numerous interesting graph problems and algorithms. Network flow is one such problem, with applications in computer-network data traffic, power transmission, pipelines, and shipping logistics.
In a network flow problem, there are two distinguished nodes, source and sink. Each edge has a numeric value, its capacity, and the flow along an edge cannot exceed that capacity. For every node other than the source and sink, the flow out must equal the flow in. The objective is to achieve the maximum flow from source to sink.
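To make the constraints concrete, here is a small sketch that checks whether a proposed flow is legal and computes its value. It does not find a maximum flow. The function names and the edge-dictionary representation are assumptions for illustration, and the check assumes the standard rule that the flow on an edge may not exceed its capacity.

```python
def is_valid_flow(capacity, flow, source, sink):
    """Check a flow against capacity and conservation constraints.

    `capacity` and `flow` map directed edges (u, v) to numbers.
    """
    # Capacity constraint: 0 <= flow <= capacity on every edge.
    for edge, f in flow.items():
        if f < 0 or f > capacity.get(edge, 0):
            return False

    # Conservation: flow in equals flow out at every interior node.
    balance = {}
    for (u, v), f in flow.items():
        balance[u] = balance.get(u, 0) - f
        balance[v] = balance.get(v, 0) + f
    return all(
        b == 0 for node, b in balance.items() if node not in (source, sink)
    )


def flow_value(flow, source):
    """Net flow leaving the source."""
    out_flow = sum(f for (u, _), f in flow.items() if u == source)
    in_flow = sum(f for (_, v), f in flow.items() if v == source)
    return out_flow - in_flow
```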
TODO: example diagram
In an unweighted graph, the shortest path between two vertices can be found via breadth-first search.
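A minimal sketch of the breadth-first approach, assuming the graph is given as a dictionary of adjacency lists. The name `bfs_shortest_path` and the `predecessor` dictionary are illustrative choices, not part of the text.

```python
from collections import deque


def bfs_shortest_path(adjacency, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    predecessor = {start: None}  # also serves as the "visited" set
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            # Walk the predecessor links back to the start, then reverse.
            path = []
            while u is not None:
                path.append(u)
                u = predecessor[u]
            path.reverse()
            return path
        for v in adjacency[u]:
            if v not in predecessor:
                predecessor[v] = u
                queue.append(v)
    return None  # goal is not reachable from start
```

Because breadth-first search visits vertices in order of their edge distance from the start, the first time the goal is dequeued its predecessor chain is a shortest path.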
A weighted graph associates a (non-negative) cost, or weight, with each edge. The weighted-graph shortest path problem sums the weights along the edges of the path and seeks the path between a pair of nodes with the minimum total weight. It is not necessarily the most direct path.
If we need to disambiguate, we can distinguish between shortest path and lowest-cost path.
TODO: diagram
Dijkstra's algorithm is one of many shortest-path algorithms and is an example of a hill-climbing, or greedy, algorithm.
This algorithm requires non-negative weights. Other algorithms may work in the presence of negative weights, but no algorithm will work on a graph that has a negative cycle (min-cost path would be undefined).
Two fields must be added to the vertex class: current_cost and predecessor (or come_from).
Initially, all nodes except the start node have infinite cost (INT_MAX is a useful proxy). The start node has initial cost 0. All nodes start with NULL predecessor fields.
If there exists a path from the start node to some node u and an edge (u,v) with weight w, then there exists a path from the start node to v with a cost cost(u) + w. This cost may be cheaper than the current known cost to v.
Define a function that takes two vertices connected by an edge and the weight of that edge.
relax(u, v, w):
    if v.current_cost > u.current_cost + w then
        v.current_cost = u.current_cost + w
        v.predecessor = u // or v.predecessor = (u,v)
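The vertex fields and the relax step might look like this in Python. This is a sketch under assumptions: the `Vertex` class, its `name` field, and the boolean return value are additions for illustration, and `math.inf` is used in place of INT_MAX.

```python
import math


class Vertex:
    def __init__(self, name):
        self.name = name
        self.current_cost = math.inf  # "infinite" until a path is found
        self.predecessor = None       # no known path yet


def relax(u, v, w):
    """Try to improve the known cost of v via the edge (u, v) of weight w.

    Returns True if v's cost was lowered (the return value is an
    addition for this example).
    """
    if v.current_cost > u.current_cost + w:
        v.current_cost = u.current_cost + w
        v.predecessor = u
        return True
    return False
```

One reason to prefer a floating-point infinity here: in a language with fixed-width integers, adding a weight to INT_MAX can overflow, so an INT_MAX proxy needs a guard before the addition.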
The algorithm works with a set of unvisited vertices (initially, all vertices are unvisited). We greedily select the lowest-cost unvisited vertex and remove it from the set. All successor nodes are relaxed. Proceed until the goal node is visited.
The predecessor field may be followed from the goal node back to the start node. This yields the path in reverse order; to get the path in forward order, reverse it.
Note that the predecessor field forms a tree from any node back to the start node (root). The predecessor field is a parent pointer (there are no child pointers).
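Putting the pieces together, the algorithm described above could be sketched as follows. The graph representation (a dictionary mapping each vertex to a dictionary of successors and weights), the function name `dijkstra`, and the use of dictionaries instead of vertex fields are assumptions for the example. The lowest-cost unvisited vertex is found by a simple linear scan.

```python
import math


def dijkstra(graph, start, goal):
    """Return (cost, path) for a lowest-cost path from start to goal.

    `graph` maps every vertex to a dict {successor: non-negative weight}.
    Returns (math.inf, None) if the goal is unreachable.
    """
    current_cost = {v: math.inf for v in graph}
    predecessor = {v: None for v in graph}
    current_cost[start] = 0
    unvisited = set(graph)

    while unvisited:
        # Greedy choice: the lowest-cost unvisited vertex (linear scan).
        u = min(unvisited, key=current_cost.__getitem__)
        unvisited.remove(u)
        if current_cost[u] == math.inf or u == goal:
            break  # the rest is unreachable, or the goal has been visited
        for v, w in graph[u].items():  # relax all successors of u
            if current_cost[v] > current_cost[u] + w:
                current_cost[v] = current_cost[u] + w
                predecessor[v] = u

    if current_cost[goal] == math.inf:
        return math.inf, None

    # Follow predecessor links from the goal, then reverse.
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = predecessor[node]
    path.reverse()
    return current_cost[goal], path
```

For example, with `graph = {"s": {"a": 1, "t": 10}, "a": {"b": 2}, "b": {"t": 3}, "t": {}}`, calling `dijkstra(graph, "s", "t")` returns the three-edge path through `a` and `b` at cost 6 rather than the direct edge at cost 10.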
TODO: argue why this works (greedy choice property).
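Analysis:
Let N = |V| (number of vertices) and M = |E| (number of edges).
The maximum number of edges is (N-1) + (N-2) + ... + 2 + 1 = O(N^2) for undirected graphs and N * (N-1) = O(N^2) for directed graphs.
A dense graph has O(N^2) edges; a sparse graph has O(N) edges.
Each edge is relaxed at most once, so relaxation costs O(M) in total.
Finding the lowest-cost unvisited vertex by scanning the set takes O(N) operations and is performed O(N) times.
The total run time is O(N^2) + O(M), which is O(N^2) + O(N^2) = O(N^2) for dense graphs.
For sparser graphs, for example M = O(N log N), the find-min term dominates, so we could find a cheaper find-min operation: with a priority queue such as a binary heap, the N find-min operations have O(N log N) total cost.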
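One common way to get a cheaper find-min is a binary heap. The sketch below uses Python's `heapq` module and is a variant of the algorithm, not necessarily the one the analysis has in mind: instead of updating an entry when a cost drops, it pushes a new entry and skips stale ones when they are popped. The name `dijkstra_heap` and the graph representation are assumptions for the example.

```python
import heapq
import math


def dijkstra_heap(graph, start):
    """Return {vertex: lowest cost from start}, using a binary heap.

    `graph` maps every vertex to a dict {successor: non-negative weight}.
    Vertices must be mutually comparable, because heap entries are
    (cost, vertex) tuples and ties on cost compare the vertices.
    """
    cost = {v: math.inf for v in graph}
    cost[start] = 0
    heap = [(0, start)]
    visited = set()

    while heap:
        c, u = heapq.heappop(heap)  # find-min in O(log n)
        if u in visited:
            continue  # stale entry: u was already visited at a lower cost
        visited.add(u)
        for v, w in graph[u].items():
            if c + w < cost[v]:
                cost[v] = c + w
                heapq.heappush(heap, (cost[v], v))
    return cost
```

With this variant the heap can hold up to one entry per edge, so the run time is roughly O((N + M) log N). That beats the O(N^2) scan on sparse graphs but not on dense ones.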
TODO: animation