Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Graph Traversals and Minimum Spanning Trees 15-211: Fundamental Data Structures and Algorithms Rose Hoberman April 8, 2003 Announcements Announcements Readings: • Chapter 14 HW5: • Due in less than one week! • Monday, April 14, 2003, 11:59pm 15-211: Fundamental Data Structures and Algorithms 3 Rose Hoberman April 8, 2003 Today • • • • • More Graph Terminology (some review) Topological sort Graph Traversals (BFS and DFS) Minimal Spanning Trees After Class... Before Recitation 15-211: Fundamental Data Structures and Algorithms 4 Rose Hoberman April 8, 2003 Graph Terminology Paths and cycles • A path is a sequence of nodes v1, v2, …, vN such that (vi,vi+1)E for 0<i<N – The length of the path is N-1. – Simple path: all vi are distinct, 0<i<N • A cycle is a path such that v1=vN – An acyclic graph has no cycles 15-211: Fundamental Data Structures and Algorithms 6 Rose Hoberman April 8, 2003 Cycles BOS SFO DTW PIT JFK LAX 15-211: Fundamental Data Structures and Algorithms 7 Rose Hoberman April 8, 2003 More useful definitions • In a directed graph: • The indegree of a node v is the number of distinct edges (w,v)E. • The outdegree of a node v is the number of distinct edges (v,w)E. • A node with indegree 0 is a root. 15-211: Fundamental Data Structures and Algorithms 8 Rose Hoberman April 8, 2003 Trees are graphs • A dag is a directed acyclic graph. • A tree is a connected acyclic undirected graph. • A forest is an acyclic undirected graph (not necessarily connected), i.e., each connected component is a tree. 15-211: Fundamental Data Structures and Algorithms 9 Rose Hoberman April 8, 2003 Example DAG Undershorts Socks Watch Shoes Pants Shirt Belt Tie a DAG implies an ordering on events Jacket 15-211: Fundamental Data Structures and Algorithms 10 Rose Hoberman April 8, 2003 Example DAG Undershorts Socks Watch Shoes Pants Shirt Belt Tie Jacket 15-211: Fundamental Data Structures and Algorithms In a complex DAG, it can be hard to find a schedule that obeys all the constraints. 11 Rose Hoberman April 8, 2003 Topological Sort Topological Sort • For a directed acyclic graph G = (V,E) • A topological sort is an ordering of all of G’s vertices v1, v2, …, vn such that... Formally: for every edge (vi,vk) in E, i<k. Visually: all arrows are pointing to the right 15-211: Fundamental Data Structures and Algorithms 13 Rose Hoberman April 8, 2003 Topological sort • There are often many possible topological sorts of a given DAG • Topological orders for this DAG : 1 • • • • 1,2,5,4,3,6,7 2,1,5,4,7,3,6 2,5,1,4,7,3,6 Etc. 3 2 4 6 5 7 • Each topological order is a feasible schedule. 15-211: Fundamental Data Structures and Algorithms 14 Rose Hoberman April 8, 2003 Topological Sorts for Cyclic Graphs? 1 2 Impossible! 3 • If v and w are two vertices on a cycle, there exist paths from v to w and from w to v. • Any ordering will contradict one of these paths 15-211: Fundamental Data Structures and Algorithms 15 Rose Hoberman April 8, 2003 Topological sort algorithm • Algorithm – Assume indegree is stored with each node. – Repeat until no nodes remain: • Choose a root and output it. • Remove the root and all its edges. • Performance – O(V2 + E), if linear search is used to find a root. 15-211: Fundamental Data Structures and Algorithms 16 Rose Hoberman April 8, 2003 Better topological sort • Algorithm: – Scan all nodes, pushing roots onto a stack. – Repeat until stack is empty: • Pop a root r from the stack and output it. • For all nodes n such that (r,n) is an edge, decrement n’s indegree. If 0 then push onto the stack. • O( V + E ), so still O(V2) in worst case, but better for sparse graphs. • Q: Why is this algorithm correct? 15-211: Fundamental Data Structures and Algorithms 17 Rose Hoberman April 8, 2003 Correctness • Clearly any ordering produced by this algorithm is a topological order But... • Does every DAG have a topological order, and if so, is this algorithm guaranteed to find one? 15-211: Fundamental Data Structures and Algorithms 18 Rose Hoberman April 8, 2003 Quiz Break Quiz • Prove: – This algorithm never gets stuck, i.e. if there are unvisited nodes then at least one of them has an indegree of zero. • Hint: – Prove that if at any point there are unseen vertices but none of them have an indegree of 0, a cycle must exist, contradicting our assumption of a DAG. 15-211: Fundamental Data Structures and Algorithms 20 Rose Hoberman April 8, 2003 Proof • See Weiss page 476. 15-211: Fundamental Data Structures and Algorithms 21 Rose Hoberman April 8, 2003 Graph Traversals Graph Traversals •Both take time: O(V+E) 15-211: Fundamental Data Structures and Algorithms 23 Rose Hoberman April 8, 2003 Use of a stack • It is very common to use a stack to keep track of: – nodes to be visited next, or – nodes that we have already visited. • Typically, use of a stack leads to a depth-first visit order. • Depth-first visit order is “aggressive” in the sense that it examines complete paths. 15-211: Fundamental Data Structures and Algorithms 24 Rose Hoberman April 8, 2003 Topological Sort as DFS • do a DFS of graph G • as each vertex v is “finished” (all of it’s children processed), insert it onto the front of a linked list • return the linked list of vertices • why is this correct? 15-211: Fundamental Data Structures and Algorithms 25 Rose Hoberman April 8, 2003 Use of a queue • It is very common to use a queue to keep track of: – nodes to be visited next, or – nodes that we have already visited. • Typically, use of a queue leads to a breadthfirst visit order. • Breadth-first visit order is “cautious” in the sense that it examines every path of length i before going on to paths of length i+1. 15-211: Fundamental Data Structures and Algorithms 26 Rose Hoberman April 8, 2003 Graph Searching ??? • Graph as state space (node = state, edge = action) • For example, game trees, mazes, ... • BFS and DFS each search the state space for a best move. If the search is exhaustive they will find the same solution, but if there is a time limit and the search space is large... • DFS explores a few possible moves, looking at the effects far in the future • BFS explores many solutions but only sees effects in the near future (often finds shorter solutions) 15-211: Fundamental Data Structures and Algorithms 27 Rose Hoberman April 8, 2003 Minimum Spanning Trees 15-211: Fundamental Data Structures and Algorithms 28 Rose Hoberman April 8, 2003 Problem: Laying Telephone Wire Central office 15-211: Fundamental Data Structures and Algorithms 29 Rose Hoberman April 8, 2003 Wiring: Naïve Approach Central office Expensive! 15-211: Fundamental Data Structures and Algorithms 30 Rose Hoberman April 8, 2003 Wiring: Better Approach Central office Minimize the total length of wire connecting the customers 15-211: Fundamental Data Structures and Algorithms 31 Rose Hoberman April 8, 2003 Minimum Spanning Tree (MST) (see Weiss, Section 24.2.2) A minimum spanning tree is a subgraph of an undirected weighted graph G, such that • it is a tree (i.e., it is acyclic) • it covers all the vertices V – contains |V| - 1 edges • the total cost associated with tree edges is the minimum among all possible spanning trees • not necessarily unique 15-211: Fundamental Data Structures and Algorithms 32 Rose Hoberman April 8, 2003 How Can We Generate a MST? 9 a 2 5 4 6 d c 15-211: Fundamental Data Structures and Algorithms 9 b 4 5 5 e a 2 5 4 c 34 b 6 d 4 5 5 e Rose Hoberman April 8, 2003 Prim’s Algorithm Initialization a. Pick a vertex r to be the root b. Set D(r) = 0, parent(r) = null c. For all vertices v V, v r, set D(v) = d. Insert all vertices into priority queue P, using distances as the keys 9 a 2 5 4 6 d c 15-211: Fundamental Data Structures and Algorithms b e a b c d 4 5 5 Vertex Parent e - 0 e 35 Rose Hoberman April 8, 2003 Prim’s Algorithm While P is not empty: 1. Select the next vertex u to add to the tree u = P.deleteMin() 2. Update the weight of each vertex w adjacent to u which is not in the tree (i.e., w P) If weight(u,w) < D(w), a. parent(w) = u b. D(w) = weight(u,w) c. Update the priority queue to reflect new distance for w 15-211: Fundamental Data Structures and Algorithms 36 Rose Hoberman April 8, 2003 Prim’s algorithm e d b c a 9 a 2 5 6 d 4 4 c b 0 Vertex Parent e b c d - 5 5 d b c a e 4 5 5 Vertex Parent e b e c e d e The MST initially consists of the vertex e, and we update the distances and parent for its adjacent vertices 15-211: Fundamental Data Structures and Algorithms 37 Rose Hoberman April 8, 2003 Prim’s algorithm d b c a 9 a 2 5 4 c 4 5 5 b 6 d 4 5 5 e a c b 2 4 5 15-211: Fundamental Data Structures and Algorithms Vertex Parent e b e c e d e 38 Vertex Parent e b e c d d e a d Rose Hoberman April 8, 2003 Prim’s algorithm a c b 2 4 5 9 a 2 5 4 c b 6 d 4 5 5 e c b 4 5 15-211: Fundamental Data Structures and Algorithms Vertex Parent e b e c d d e a d 39 Vertex Parent e b e c d d e a d Rose Hoberman April 8, 2003 Prim’s algorithm c b 9 a 2 5 4 c 4 5 b 6 d 4 5 5 e b 5 15-211: Fundamental Data Structures and Algorithms Vertex Parent e b e c d d e a d 40 Vertex Parent e b e c d d e a d Rose Hoberman April 8, 2003 Prim’s algorithm b 9 a 2 5 4 c 5 b 6 d 4 5 5 e The final minimum spanning tree 15-211: Fundamental Data Structures and Algorithms Vertex Parent e b e c d d e a d 41 Vertex Parent e b e c d d e a d Rose Hoberman April 8, 2003 Running time of Prim’s algorithm (without heaps) Initialization of priority queue (array): O(|V|) Update loop: |V| calls • Choosing vertex with minimum cost edge: O(|V|) • Updating distance values of unconnected vertices: each edge is considered only once during entire execution, for a total of O(|E|) updates Overall cost without heaps: O(|E| + |V| 2) When heaps are used, apply same analysis as for Dijkstra’s algorithm (p.469) (good exercise) 15-211: Fundamental Data Structures and Algorithms 42 Rose Hoberman April 8, 2003 Prim’s Algorithm Invariant • At each step, we add the edge (u,v) s.t. the weight of (u,v) is minimum among all edges where u is in the tree and v is not in the tree • Each step maintains a minimum spanning tree of the vertices that have been included thus far • When all vertices have been included, we have a MST for the graph! 15-211: Fundamental Data Structures and Algorithms 43 Rose Hoberman April 8, 2003 Correctness of Prim’s • This algorithm adds n-1 edges without creating a cycle, so clearly it creates a spanning tree of any connected graph (you should be able to prove this). But is this a minimum spanning tree? Suppose it wasn't. • There must be point at which it fails, and in particular there must a single edge whose insertion first prevented the spanning tree from being a minimum spanning tree. 15-211: Fundamental Data Structures and Algorithms 44 Rose Hoberman April 8, 2003 Correctness of Prim’s • Let G be a connected, undirected graph • Let S be the set of edges chosen by Prim’s algorithm before choosing an errorful edge (x,y) x y • Let V' be the vertices incident with edges in S • Let T be a MST of G containing all edges in S, but not (x,y). 15-211: Fundamental Data Structures and Algorithms 45 Rose Hoberman April 8, 2003 Correctness of Prim’s w v • Edge (x,y) is not in T, so there must be a path in T from x to y since T is connected. x y • Inserting edge (x,y) into T will create a cycle • There is exactly one edge on this cycle with exactly one vertex in V’, call this edge (v,w) 15-211: Fundamental Data Structures and Algorithms 46 Rose Hoberman April 8, 2003 Correctness of Prim’s • Since Prim’s chose (x,y) over (v,w), w(v,w) >= w(x,y). • We could form a new spanning tree T’ by swapping (x,y) for (v,w) in T (prove this is a spanning tree). • w(T’) is clearly no greater than w(T) • But that means T’ is a MST • And yet it contains all the edges in S, and also (x,y) ...Contradiction 15-211: Fundamental Data Structures and Algorithms 47 Rose Hoberman April 8, 2003 Another Approach • Create a forest of trees from the vertices • Repeatedly merge trees by adding “safe edges” until only one tree remains • A “safe edge” is an edge of minimum weight which does not create a cycle 9 a 2 5 6 d 4 4 c 15-211: Fundamental Data Structures and Algorithms b 5 5 forest: {a}, {b}, {c}, {d}, {e} e 48 Rose Hoberman April 8, 2003 Kruskal’s algorithm Initialization a. Create a set for each vertex v V b. Initialize the set of “safe edges” A comprising the MST to the empty set c. Sort edges by increasing weight 9 a 2 5 4 6 d c 15-211: Fundamental Data Structures and Algorithms b 4 5 5 e F = {a}, {b}, {c}, {d}, {e} A= E = {(a,d), (c,d), (d,e), (a,c), (b,e), (c,e), (b,d), (a,b)} 49 Rose Hoberman April 8, 2003 Kruskal’s algorithm For each edge (u,v) E in increasing order while more than one set remains: If u and v, belong to different sets U and V a. add edge (u,v) to the safe edge set A = A {(u,v)} b. merge the sets U and V F = F - U - V + (U V) Return A • Running time bounded by sorting (or findMin) • O(|E|log|E|), or equivalently, O(|E|log|V|) (why???) 15-211: Fundamental Data Structures and Algorithms 50 Rose Hoberman April 8, 2003 Kruskal’s algorithm 9 a 2 5 4 c b 6 d E = {(a,d), (c,d), (d,e), (a,c), 4 5 (b,e), (c,e), (b,d), (a,b)} 5 e Forest {a}, {b}, {c}, {d}, {e} {a,d}, {b}, {c}, {e} {a,d,c}, {b}, {e} {a,d,c,e}, {b} {a,d,c,e,b} 15-211: Fundamental Data Structures and Algorithms A {(a,d)} {(a,d), (c,d)} {(a,d), (c,d), (d,e)} {(a,d), (c,d), (d,e), (b,e)} 51 Rose Hoberman April 8, 2003 Kruskal’s Algorithm Invariant • After each iteration, every tree in the forest is a MST of the vertices it connects • Algorithm terminates when all vertices are connected into one tree 15-211: Fundamental Data Structures and Algorithms 52 Rose Hoberman April 8, 2003 Correctness of Kruskal’s • This algorithm adds n-1 edges without creating a cycle, so clearly it creates a spanning tree of any connected graph (you should be able to prove this). But is this a minimum spanning tree? Suppose it wasn't. • There must be point at which it fails, and in particular there must a single edge whose insertion first prevented the spanning tree from being a minimum spanning tree. 15-211: Fundamental Data Structures and Algorithms 53 Rose Hoberman April 8, 2003 Correctness of Kruskal’s T K S e • Let e be this first errorful edge. • Let K be the Kruskal spanning tree • Let S be the set of edges chosen by Kruskal’s algorithm before choosing e • Let T be a MST containing all edges in S, but not e. 15-211: Fundamental Data Structures and Algorithms 54 Rose Hoberman April 8, 2003 Correctness of Kruskal’s Lemma: w(e’) >= w(e) for all edges e’ in T - S Proof (by contradiction): • Assume there exists some edge e’ in T - S, w(e’) < w(e) • Kruskal’s must have e considered e’ before e • However, since e’ is not in K (why??), it must have been discarded because it caused a cycle with some of the other edges in S. • But e’ + S is a subgraph of T, which means it cannot form a cycle ...Contradiction T K S 15-211: Fundamental Data Structures and Algorithms 55 Rose Hoberman April 8, 2003 Correctness of Kruskal’s • Inserting edge e into T will create a cycle • There must be an edge on this cycle which is not in K (why??). Call this edge e’ • e’ must be in T - S, so (by our lemma) w(e’) >= w(e) • We could form a new spanning tree T’ by swapping e for e’ in T (prove this is a spanning tree). • w(T’) is clearly no greater than w(T) • But that means T’ is a MST • And yet it contains all the edges in S, and also e ...Contradiction 15-211: Fundamental Data Structures and Algorithms 56 Rose Hoberman April 8, 2003 Greedy Approach • Like Dijkstra’s algorithm, both Prim’s and Kruskal’s algorithms are greedy algorithms • The greedy approach works for the MST problem; however, it does not work for many other problems! 15-211: Fundamental Data Structures and Algorithms 57 Rose Hoberman April 8, 2003 That’s All!