CSE 101
Algorithm Design and Analysis
Miles Jones and Russell Impagliazzo
mej016@eng.ucsd.edu
russell@cs.ucsd.edu
Lecture 12: Implementing Kruskal’s
Algorithm Using Data Structures for Disjoint
Sets
ALGORITHMS WITH VERY LARGE OUTPUTS
Challenge: Think of a succinctly describable algorithm A (say, one that fits, with large writing, on an index card)
No system calls
On each integer input, A terminates and outputs an integer, never running forever
Goal: Make A(10) as big as possible
ACKERMANN FUNCTIONS
A_1(n, m) = n + m
A_{i+1}(n, m) = n if m = 1, and A_i(n, A_{i+1}(n, m-1)) if m > 1
So A_{i+1} does A_i to n ``m times''
Example: A_2(n, m) = n + A_2(n, m-1), so A_2(n, m) = n·m
A_3(n, m) = n·A_3(n, m-1), so A_3(n, m) = n^m
TOWER FUNCTION
OK, A_4(n, m) = n^{A_4(n, m-1)}, so A_4(n, m) = Tower(n, m) = n^{n^{·^{·^{n}}}}, a tower of n's of height m
THE TOWER FUNCTION GROWS PRETTY LARGE
T(2,2)= 2^2=4
T(2,3)= 2^4 = 16
T(2,4) = 2^{16} = 65,536
T(2,5) = 2^{65,536}, much larger than the number of particles times time quanta in the universe's history
T(2,6): even if you used each particle in each time quantum in the universe to store a bit, you couldn't write this number in binary
T(2,7): even if, in each time quantum, each quantum event split the universe into parallel universes, and you used each particle in each time quantum in each parallel universe to store a bit, you still couldn't write this number down: a pretty big number
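The tower function is easy to compute for small arguments; a minimal sketch in Python (the function name is my own):

```python
# Tower(n, m): a tower of n's of height m, i.e. Tower(n, 1) = n
# and Tower(n, m) = n ** Tower(n, m - 1).
def tower(n, m):
    result = n
    for _ in range(m - 1):
        result = n ** result
    return result

print(tower(2, 2))  # 4
print(tower(2, 3))  # 16
print(tower(2, 4))  # 65536
```

Past m = 4 with base 2, the values no longer fit in memory, which is the point of the slide.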
INVERSES OF LARGE FUNCTIONS
If we have a quickly growing function, its inverse defines a non-constant, but very slowly growing, function
Example: Exp(n) = 2^n, whose inverse is log(n)
The inverse of Tower(2, n) is called log* n, the number of times we take log before we get to 1 or below
log* n > 7 only for numbers n that are too big to be written using all particles in all parallel universes, so that doesn't come up too often
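A sketch of log* in Python (the name and the "down to 1 or below" stopping convention are mine; conventions vary slightly in the literature):

```python
import math

# log*(n): how many times we apply log2 before the value drops to 1 or below.
def log_star(n):
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

print(log_star(16))     # 3: 16 -> 4 -> 2 -> 1
print(log_star(65536))  # 4: 65536 -> 16 -> 4 -> 2 -> 1
```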
BUT THE TOWER FUNCTION IS JUST THE FOURTH ACKERMANN FUNCTION
A_5(2,2) = 4
A_5(2,3) is a tower of 2's 4 high: 2^{2^{2^2}} = 65,536
A_5(2,4) is a tower of 2's 65,536 high (start with that huge number, and exponentiate it another 65,532 times)
A_5(2,5) is a tower of 2's A_5(2,4) high… (I'll let you imagine)
THE ACTUAL ACKERMANN FUNCTION
Ack(i, n, m) = A_i(n, m) can be defined by the following simple recursion:
Ack(i, n, m):
  IF i = 1, return n + m
  IF m = 1, return n
  Return Ack(i-1, n, Ack(i, n, m-1))
Ack(n) = Ack(n, n, n)
Imagine what Ack(10) is. It’s pretty big
Of course, if there’s room left on the index card, we could keep going,
say, looking at Ack composed with itself n times….
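The recursion translates directly into code; a sketch (mine), safe only for tiny arguments since the values explode:

```python
# Ack(i, n, m): Ack(1, n, m) = n + m, Ack(i, n, 1) = n,
# and otherwise Ack(i, n, m) = Ack(i-1, n, Ack(i, n, m-1)).
def ack(i, n, m):
    if i == 1:
        return n + m
    if m == 1:
        return n
    return ack(i - 1, n, ack(i, n, m - 1))

print(ack(2, 2, 3))  # 6  (multiplication: 2 * 3)
print(ack(3, 2, 3))  # 8  (exponentiation: 2 ** 3)
print(ack(4, 2, 3))  # 16 (Tower(2, 3))
```

Even ack(5, 2, 4) is already far too large to compute.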
INVERSE ACKERMANN
α(n) = smallest j such that Ack(j) ≥ n.
Goes to infinity as n grows, but very, very slowly
HIGH-LEVEL KRUSKAL’S ALGORITHM
Instance: Undirected graph G, with edge weights w(e)
Output: A subset of edges X that form a spanning tree
Start X as the empty set of edges
Go through the edges from smallest weight to highest weight
For each edge e={u,v}, if u is not already connected to v in X, add e to
X
Return X
DATA STRUCTURE FOR DISJOINT SETS
Main complication: Want to check if u is connected to v efficiently
Tree T divides vertices into disjoint sets of connected components
u is connected to v if they are in the same set
Adding e to T merges the set containing u with the set containing v
So we need a data structure that:
Represents a partition of a set V into disjoint subsets. We'll pick one element L from each subset to be the ``leader'' of the subset, in order to give the subsets distinct names
Has an operation find(u) that returns the leader of u’s set
Has an operation union(u,v) that replaces the two sets containing u and
v with their union
KRUSKAL’S ALGORITHM USING DSDS
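A sketch of Kruskal's with the simplest disjoint-set structure, an array of leaders (the edge-list representation and function name are my choices):

```python
def kruskal(n, edges):
    """n vertices labeled 0..n-1; edges is a list of (weight, u, v)."""
    leader = list(range(n))          # leader[u] = leader of u's set
    X = []
    for w, u, v in sorted(edges):    # smallest weight to highest
        if leader[u] != leader[v]:   # u not already connected to v
            X.append((w, u, v))
            old, new = leader[v], leader[u]
            for i in range(n):       # O(n) union: relabel all of v's set
                if leader[i] == old:
                    leader[i] = new
    return X

mst = kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)])
print(sum(w for w, _, _ in mst))  # 6
```

The returned X always has n-1 edges on a connected graph, as the next slide notes.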
MAIN FACTORS IN TIME FOR KRUSKAL
Sorting all edges: O(m log m) = O(m log n)
2 find operations per edge: O(m) · Time_find
One union each time we add an edge to X, which happens n-1 times (because a tree with n vertices always has n-1 edges): O(n) · Time_union
SUBROUTINES OF KRUSKAL’S
DSDS VERSION 1
Keep an array Leader(u) indexed by element
In each array position, keep the leader of its set
Initialize to self
Find(u) : return Leader(u), O(1)
union(u,v) : For each array position, if it’s currently Leader(v), change
it to Leader(u). (O(n) time)
Total time: O(m log n) sort + O(1) · m finds + O(n) · (n-1) unions
= O(m log n + n^2) total
VERSION 2.0: LISTS
In addition to array, keep each set in a doubly linked list, so that once
we have one element, we can find all of the other elements in constant
time per element.
Find stays the same.
union: add links between the tail of Set(u) and the head of Set(v), where Set(u) is the set u is currently in, and change all of the array values in Set(v).
Time: O(|Set(v)|)
But if we, say, always add a single u to a growing Set(v), it's still order n^2 total time
VERSION 2.1: STILL LISTS
How should we avoid this?
DSDS 2.1 LISTS WITH SIZES
Keep the size of the list, Size(L) at the leader L
When we perform a union, if Size(u) < Size (v), swap
u and v, so that we are always updating pointers for the
smaller of the two lists.
Then add the two sizes and store at u.
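A sketch of version 2.1 (class and field names are mine): a member list and a size kept at each leader, with union always relabeling the smaller set.

```python
class ListDSDS:
    def __init__(self, n):
        self.leader = list(range(n))
        self.members = [[i] for i in range(n)]  # member list, kept at each leader
        self.size = [1] * n                     # Size(L), valid at leaders only

    def find(self, u):                          # O(1)
        return self.leader[u]

    def union(self, u, v):
        lu, lv = self.find(u), self.find(v)
        if lu == lv:
            return
        if self.size[lu] < self.size[lv]:       # swap so lu leads the bigger set
            lu, lv = lv, lu
        for x in self.members[lv]:              # relabel only the smaller set
            self.leader[x] = lu
        self.members[lu].extend(self.members[lv])
        self.size[lu] += self.size[lv]

d = ListDSDS(4)
d.union(0, 1); d.union(2, 3); d.union(0, 2)
print(d.find(3) == d.find(0))  # True
```

Each element's leader pointer changes only when its set at least doubles, which is the key to the amortized bound discussed next.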
WORST CASE TIME
Find is still constant time
If at least one set is small, merge becomes faster.
But in the worst case, we could be merging two sets of size n/2.
In that case, we'd still need to update n/2 pointers to the leader,
so the total worst-case time is still Ω(n)
BUT DOES WORST-CASE TIME GIVE US THE TOTAL PICTURE?
While we do want a bound on the worst-case time of our algorithm, the total time might be better than the worst-case time for a step times the number of steps. If only a few steps take their worst case, the sum of the times for all steps might be much less than that upper bound.
In this case, how many times can we have the worst-case behavior?
What must have happened in previous steps to work up to this worst case?
WHAT MIGHT HAVE HAPPENED
[Diagram: merging two sets of size n/2 into all n elements; each n/2 set was itself built from two sets of size n/4, and so on down.]
AMORTIZED ANALYSIS
Techniques for bounding the total cost of data structure operations
that improve over the worst-case
Simple example: We have a register keeping track of total deposits.
Bills get deposited: $1, $10, $100, $1000, … up to $10^n
We need to perform carries in our register.
So if we have $999,999 and one more dollar gets added, we need to perform 7 steps to make the new value $1,000,000
So the worst-case cost of a deposit is Ω(n)
But I claim any m deposits take at most O(n + m) time
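The O(n + m) claim is easy to check empirically; this sketch (mine) counts digit writes over 1,000 one-dollar deposits into a decimal register:

```python
def increment(digits):
    """Add 1 to a little-endian decimal register; return digit writes performed."""
    writes, i = 0, 0
    while i < len(digits) and digits[i] == 9:
        digits[i] = 0                  # carry: a 9 becomes a 0
        writes, i = writes + 1, i + 1
    if i == len(digits):
        digits.append(1)               # carry past the last digit
    else:
        digits[i] += 1
    return writes + 1

register, total = [0], 0
m = 1000
for _ in range(m):
    total += increment(register)
print(total)  # 1111, well under the worst-case bound of m times the number of digits
```

Only 111 of the 1,000 increments perform any carry work at all, which is why the total stays linear in m.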
AMORTIZED ANALYSIS, POTENTIAL METHOD
Worst case: k 9's become 0's and a 1 is carried into the next position. But once we create those 0's, it takes many deposits before we get back to the worst case.
We'll define P_t to be a ``potential function'', measuring how far along we are in the process of building up at time t.
AT_t = (time for t'th operation) + P_t - P_{t-1}
Then Σ_t AT_t = Σ_t (time for t'th operation) + (P_T - P_{T-1}) + (P_{T-1} - P_{T-2}) + … + (P_1 - P_0) = total time + P_T - P_0
In other words, since the potential is nonnegative, total time is at most total amortized time plus the initial potential
ACCOUNT SUM EXAMPLE
In the register problem, what should P_t be? How can we measure how close we are to possible cascades?
P_t = the number of 9's currently in the register (each 9 is one step away from carrying)
ACCOUNTING METHOD
Find total cost by dividing responsibility among data elements, rather
than operations.
Our example: Charge each position in the counter 1 each time there
is a carry from that position to the next, C_i be the total charge for
position i.
Observation: If we just look at the value of position i, it only increases when there is a deposit to it or a carry into it, and position i must be incremented 10 times for each carry out of it, so the total charge over m deposits is O(m)
A DATA STRUCTURE FOR KRUSKAL’S (DIRECTED TREES
WITH RANKS)
SUBROUTINES.
To save on runtime, we must keep the trees short.
So union points the root of smaller rank at the root of larger rank; that way, the tree stays the same height.
If the ranks are equal, we point one root at the other and increment the new root's rank. (This is the only way a rank can increase.)
SUBROUTINES.
makeset=O(1)
find=O(height of tree containing x)
union=O(find)
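A sketch of the directed-trees-with-ranks structure (class and field names are mine): find walks parent pointers to the root, and union points the root of smaller rank at the root of larger rank.

```python
class RankDSDS:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}  # makeset: each element is its own root
        self.rank = {x: 0 for x in elements}

    def find(self, x):                          # O(height of tree containing x)
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.rank[ru] < self.rank[rv]:       # point smaller rank at bigger rank
            ru, rv = rv, ru
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:      # equal ranks: the only way a rank grows
            self.rank[ru] += 1

d = RankDSDS("ABCDEFG")
for u, v in [("B","C"), ("E","G"), ("D","F"), ("A","B"), ("D","G"), ("A","F")]:
    d.union(u, v)
print(len({d.find(x) for x in "ABCDEFG"}))  # 1: all seven elements end up in one set
```

The driver code runs the union sequence from the second example below; the final root has rank 2, comfortably below log2(7).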
EXAMPLE
makeset({A,B,C,D,E,F,G})
union(A,D), union(B,E), union(B,F), union(A,G), union(D,G),union(B,D),
union(C,E)
EXAMPLE
makeset({A,B,C,D,E,F,G})
union(B,C), union(E,G), union(D,F), union(A,B), union(D,G),union(A,F)
HEIGHT OF TREE
ANCESTORS OF RANK K
Claim: any vertex has at most one ancestor of rank k
Proof: each vertex has only one outgoing pointer, and ranks strictly increase along paths, so each element has at most one ancestor of each rank.
NUMBER OF VERTICES OF A GIVEN RANK
HEIGHT OF TALLEST TREE (MAXIMUM RANK)
The maximum rank is log2(n)
Proof (sort of): a tree whose root has rank k contains at least 2^k vertices, so there can be at most n/2^k vertices of rank k.
How many vertices of rank log2(n) can there be? At most one, so no rank can exceed log2(n).
RUNTIME
makeset=O(1)
find=O(log(n))
union=O(log(n))
PRIM’S ALGORITHM
The cut property assures us that any algorithm that repeatedly adds the next lightest edge between two disjoint sets of vertices will work.
Prim's algorithm always takes the two disjoint sets to be the vertices already connected to the growing tree and those not yet connected.
PRIM’S ALGORITHM
On each iteration, the subtree grows by one edge
PRIM/DIJKSTRA’S
Prim's algorithm is just like Dijkstra's, where we use the value cost(v) instead of dist(v).
We can use the same algorithm, changing dist to cost, and we can use the same data structures.
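A heap-based sketch of Prim's, structured exactly like Dijkstra's (the adjacency-list representation and returning only the MST weight are my choices; it assumes a connected graph):

```python
import heapq

def prim(adj, start=0):
    """adj[u] = list of (v, w) neighbors. Returns the total weight of an MST."""
    n = len(adj)
    in_tree = [False] * n
    heap = [(0, start)]              # (cost(v), v): lightest known edge into the tree
    total = 0
    while heap:
        cost, u = heapq.heappop(heap)
        if in_tree[u]:
            continue                 # stale entry: u was already added more cheaply
        in_tree[u] = True
        total += cost
        for v, w in adj[u]:
            if not in_tree[v]:
                heapq.heappush(heap, (w, v))
    return total

adj = [[(1, 1), (2, 5), (3, 4)], [(0, 1), (2, 2)],
       [(0, 5), (1, 2), (3, 3)], [(0, 4), (2, 3)]]
print(prim(adj))  # 6: the tree uses the edges of weight 1, 2, and 3
```

The only change from Dijkstra's is the priority pushed onto the heap: the edge weight w alone, rather than dist(u) + w.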
PRIM’S EXAMPLE
RUNTIME OF PRIM'S: with a binary heap, O((n + m) log n), just like Dijkstra's
Runtime of Kruskal's: O(m log n) for the sort, plus the cost of the disjoint-set operations
SET COVER
Suppose you have a county with n towns. If you put a school in town x
then all towns within a 30 mile radius could send their kids to that school.
What is the smallest number of schools you could build to accommodate all towns?
SET COVER
Make a graph where the towns are vertices and two vertices are
connected if they are within 30 miles of each other.
SET COVER
Greedy approach: Pick the town that accommodates the most
number of other towns. Delete all towns it accommodates from the
graph and repeat until all towns are accommodated.
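The greedy approach in a few lines of Python (function name and example data are mine; it assumes the given sets really do cover all towns):

```python
def greedy_set_cover(universe, sets):
    """sets: dict name -> set of elements it covers. Returns names chosen greedily."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick the set that accommodates the most still-uncovered elements
        best = max(sets, key=lambda s: len(sets[s] & uncovered))
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

towns = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5}, "S4": {1, 5}}
print(greedy_set_cover({1, 2, 3, 4, 5}, towns))  # ['S1', 'S3']
```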
SET COVER
Is 4 the optimal solution?
HOW BAD IS THE GREEDY APPROACH?
Claim: Suppose B contains n elements and that the optimal solution consists of k sets. Then the greedy approach will use at most k ln(n) sets.
In our previous example, k = 3 and n = 11, so the greedy approach will not use more than 3 ln(11) ≈ 7.2, i.e., at most 7 sets.
Is it worth it to use the greedy approach?
PROOF OF CLAIM
Claim: Suppose B contains n elements and that the optimal solution
consists of k sets. Then the greedy approach will use at most kln(n)
sets.
Let n_t be the number of elements still uncovered after t sets. The optimal k sets cover all of them, so some set covers at least n_t/k of them, and greedy does at least as well: n_{t+1} ≤ n_t (1 - 1/k) ≤ n e^{-t/k}.
When t = k ln(n), n_t < 1, so no more elements remain to be covered.
IS GREEDY WORTH IT?
Is k ln(n) that much bigger than k?
The ratio between the greedy solution and the optimal solution is always less than ln(n).
It turns out that there does not exist a polynomial-time algorithm that can give you a better approximation!
Runtime of greedy algorithm: Based on the data structure used.
GREEDY ALGORITHM FOR SET COVER
for all v in V
makeset(v)