CS 4104 Algorithms Notes

Published on
183 min read––– views

Contents

Introduction

Notes from CS 4104 with exercises and derivations from "Algorithm Design" by Kleinberg and Tardos.

Glossary

TermDefinition
Matchingeach man is paired with ≤ 1 woman and vice versa
Perfect Matchingeach man is paired with exactly 1 woman and vice versa
Rogue Couplea man and a woman who are not matched, but prefer each other to their
Stable Matchinga perfect matching without any roogue couples
polynomial running timeif there exists constants c,d>0c,d > 0 such that n\forall n, the running time is bounded by cndcn^d
Asymptotic Upper Bounda function f(n)f(n) is O(g(n))O(g(n)) if there exists constants c>0,n00c > 0, n_0 \geq 0 s.t. nn0,f(n)cg(n)\forall n \geq n_0, \quad f(n) \leq cg(n)
Asymptotic Lower Bounda function f(n)f(n) is Ω(g(n))\Omega(g(n)) if there exist constants c>0,n00c > 0, n_0 \geq 0 s.t. nn0,f(n)cg(n)\forall n \geq n_0, \quad f(n) \geq cg(n)
Asymptotic Tight Bounda function f(n)f(n) is Θ(g(n))\Theta(g(n)) if f(n)f(n) is O(g(n))O(g(n)) and f(n)f(n) is Ω(g(n))\Omega(g(n))
Euler PathOnly possible if the number of nodes with odd degree is at most 2
A v1vkv_1 - v_k pathin an undirected graph G=(V,E)G = (V,E) is a sequence of PP nodes v1,v2,,vk1,vk;vkVv_1, v_2, \mathellipsis, v_{k-1}, v_k; v_k \in V s.t. every consecutive pair of nodes vi;vi+1,1i<kv_i; v_{i+1}, 1 \leq i < k is connected by an edge EE
simple pathall its nodes are distinct
cycleis a path where k>2k>2, the first k1k-1 nodes are distinct and v1=vkv_1 = v_k
connected undirected graphfor every pair of nodes u,vVu,v \in V, there is a path from uu to vv in GG
distance d(u,v)d(u,v)the minimum number of edges in any uvu-v path
connected component of GG containing ssthe set of all nodes uu such that there is an sus-u path in GG
Adjaceny Matrixn×nn \times n boolean matrix, where the entry in row ii, col jj is 1 iff the graph contains the edge (i,j)(i, j)
Adjaceny Listarray AdjAdj, where Adj[v]Adj[v] stores the list of all nodes adjacent to vv
bipartiteA graph G=(V,E)G = (V, E) is bipartite if VV can be partitioned into two subsets X,YX,Y such that every edge in EE has one endpoint in XX and one in YY
Greedy algorithmmakes the best current choice without looking back. Assume the choices made prior were perfect
inversionA schedule has an inversion if a job ii with a deadline d(i)d(i) is scheduled before another job jj with an earlier deadline d(j)d(j) i.e. d(j)<d(i),s(i)<s(j)d(j) < d(i), s(i) < s(j)
spanning treeA subset TofET of E is a spanning tree of GG if (V,T)(V, T) is a tree

01 - Stable Matching

Problem

  • Each man ranks all the women in order of preference

  • Each woman ranks all the men in order of preference

  • Each person uses all ranks from 1,...,n1, ... , n s.t. there are no ties or incomplete lists

Male preference matrix:

w1w_1w2w_2w3w_3w4w_4
m1m_11234
m2m_24312
m3m_34312
m4m_43142

Female preference matrix:

m1m_1m2m_2m3m_3m4m_4
w1w_11234
w2w_24132
w3w_34123
w4w_43124
  • A Matching: each man is paired with ≤ 1 woman and vice versa
  • A Perfect Matching: each man is paired with exactly 1 woman and vice versa
    • "perfect" means one-to-one mapping
  • A Rogue Couple: a man and a woman who are not matched, but prefer each other to their current partners
  • A Stable Matching: A perfect matching without any rogue couples

Solution: Gale-Shapley Algorithm

Initially, all men and all women are freeSet S of matched pairs is emptyWhile there is at least one free man who has not proposed to every womanChoose such a man mm proposes to his highest-ranked woman w to whom he has not yet proposedIf w is freeshe becomes engaged to m s.t. SS{m,w}Else if w is engaged to m and she prefers m to mshe becomes engaged to m s.t. SS{m,w}m becomes free s.t. SS\{m,w}Otherwise, m remains freeReturn set S of engaged pairs\boxed{ \begin{aligned} &\text{Initially, all men and all women are free} \\ &\text{Set } S \text{ of matched pairs is empty} \\ &\text{While there is at least one free man who has not proposed to every woman} \\ &\quad\text{Choose such a man } m \\ &\qquad m \text{ proposes to his highest-ranked woman } w \text{ to whom he has not yet proposed} \\ &\quad\text{If } w \text{ is free} \\ &\qquad\text{she becomes engaged to } m \text{ s.t. } S \leftarrow S \cup \{m, w\} \\ &\quad\text{Else if } w \text{ is engaged to } m' \text{ and she prefers } m \text{ to } m' \\ &\qquad\text{she becomes engaged to } m \text{ s.t. } S \leftarrow S \cup \{m, w\} \\ &\qquad m' \text{ becomes free s.t. } S \leftarrow S \backslash \{m', w\} \\ &\quad\text{Otherwise, } m \text{ remains free} \\ &\text{Return set } S \text{ of engaged pairs} \end{aligned} }

Proof of Perfection

  1. Suppose the set SS of pairs returned by Gale-Shapley algorithm is not perfect
  2. SS is a matching, therefore there must be at least one free man mm
  3. mm has proposed to all women (since the algorithm terminated)
  4. Therefore, each woman must be engaged (since she remains engaged after the first proposal to her)
  5. Therefore, all men must be engaged, contradicting ⚔️ the assumption that mm is free
  6. That matching is perfect since the program terminated QED.

Proof of Stability

  1. Suppose SS is not stable i.e. there are two pairs (m1,w1)(m_1, w_1), (m2,w2)S(m_2, w_2) \in S s.t m1m_1 prefers w2>w1w_2 > w_1 and w2w_2 prefers m1>m2m_1 > m_2
  2. m1m_1 must have proposed to w2w_2 prior to w1w_1 because at that stage w2w_2 must have rejected m1m_1; otherwise she would have been paired with w1w_1, which would in turn prevent the pairing of (m2,w2)(m_2, w_2) in a later iteration
  3. When w2w_2 rejected m1m_1, she must have been paired with some man m3>m1m_3 > m_1
  4. Since m2m_2 is paired with w2w_2 at termination, w2w_2 must prefer m2m_2 to m3m_3 or m2=m3m_2 = m_3
  5. This contradicts ⚔️ our conclusion (from instability) that w2w_2 prefers m1>m2m_1 > m_2 QED.

Observations

  • This algorithm computes a matching i.e. each woman gets pair with at most one man and vice versa
  • The man's status can alternate between being free and being engaged
  • The woman's status remains engaged after the first proposal
  • The ranking of a man's partner remains the same or goes down
  • The ranking of a woman's partner can never go down
  • The number of proposals made after kk iterations is the best indicator of progress
  • The max number of total proposals (or iterations) that can be made is n2n^2
  • Always produces the same matching in which each man is paired with his best, valid partner, but each woman is paired with her worst, valid partner

02 - Analysis of Algorithms

Polynomial Time

  • Essentially brute forcing
  • e.g. brute force sorting:
    • Given nn numbers, permute them so that they appear in increasing order
    • Try all n!n! permutations
    • For each permutation, check if it is sorted
    • O(nn!)O(nn!)
  • Desirable scaling property: when the input size doubles, the algorithm should only slow down by some constant cc
  • An algorithm has polynomial running time if there exists constants c,d>0c,d > 0 such that n\forall n, the running time is bounded by cndcn^d
  • an algorithm is said to be efficient if it has a polynomial running time

Upper and Lower Bounds

  • Asymptotic Upper Bound: a function f(n)f(n) is O(g(n))O(g(n)) if there exists constants c>0,n00c > 0, n_0 \geq 0 s.t. nn0,f(n)cg(n)\forall n \geq n_0, \quad f(n) \leq cg(n)
    • e.g. 100nlog2n100n \log_2n is O(n2)O(n^2) for c=100,n0=1c = 100, n_0 = 1
  • Asymptotic Lower Bound: a function f(n)f(n) is Ω(g(n))\Omega(g(n)) if there exist constants c>0,n00c > 0, n_0 \geq 0 s.t. nn0,f(n)cg(n)\forall n \geq n_0, \quad f(n) \geq cg(n)
    • e.g. n10log2n\frac{n}{10}\log_2 n is Ω(n)\Omega(n) for c=1,n0=1024c = 1, n_0 = 1024

In general, we attempt to find a sufficiently large n0n_0 such that our upper and lower bounds are satisfied.

  • Asymptotic Tight Bound: a function f(n)f(n) is Θ(g(n))\Theta(g(n)) if f(n)f(n) is O(g(n))O(g(n)) and f(n)f(n) is Ω(g(n))\Omega(g(n))

Transitively:

  • if f=O(g)f = O(g) and g=O(h)g = O(h), then f=O(h)f = O(h)
  • if f=Ω(g)f = \Omega(g) and g=Ω(h)g = \Omega(h), then f=Ω(h)f = \Omega(h)
  • if f=Θ(g)f = \Theta(g) and g=Θ(h)g = \Theta(h), then f=Θ(h)f = \Theta(h)

Additively:

  • if f=O(h)f = O(h) and g=O(h)g = O(h), then f+g=O(h)f + g = O(h)
  • Similar statements hold for lower and tight bounds
  • if kk is a constant and there are kk functions: fi=O(h),1ikf_i = O(h), 1\leq i \leq k
f1+f2+...+fk=O(h)\begin{aligned} f_1 + f_2 + ... + f_k = O(h) \end{aligned}
  • if f=O(g)f = O(g), then f+g=Θ(g)f + g = \Theta(g)

Examples

p,q,n>0p,q,n > 0

f(n)f(n)g(n)g(n) Reason
pn2+qn+rpn^2 + qn + rΘ(n2)\Theta(n^2)pn2+qn+r(p+q+r)n2pn^2 + qn + r \leq (p + q + r)n^2 , and we can ignore low order terms
pn2+qn+rpn^2 + qn + rO(n3)O(n^3)n2n3, if n1n^2 \leq n^3, \text{ if } n \geq 1
0idaini\displaystyle\sum_{0 \leq i \leq d} a_i n^iΘ(nd)\Theta(n^d)if d>0d > 0 is an integer constant and ad>0a_d > 0
O(n1.59)O(n^{1.59})polynomial timen1.59n^{1.59} is O(n2)O(n^2)
logan\log_a nO(logbn)O(\log_b n)log2(x)=log10(x)log10(2)a,b>1\log_2(x) = \frac{\log_{10}(x)}{\log_{10}(2)} \forall a,b > 1 therefore the base is irrelevant
  •  constant x>0,logn=O(nx)\forall \text{ constant } x > 0, \log n = O(n^x) e.g. logn=n0.00001\log n = n^{0.00001}
  •  constant r>1,d>0,nd=O(rn)\forall \text{ constant } r > 1, d >0, n^d = O(r^n) e.g. n3=O(1.1n)n^3 = O(1.1^n)

03 - Review of Graphs and Priority Queues

Priority Queues: Motivation

Sorting

  • Instance: Nonempty list [x1,x2,,xn][x_1, x_2, \mathellipsis, x_n] of integers
  • Solution: A permutation [y1,y2,,yn][y_1, y_2, \mathellipsis, y_n] of x1,x2,,xnx_1, x_2, \mathellipsis, x_n such that yiyi+1y_i \leq y_{i+1} for all $1 \leq i < n$$

Possible algorithm:

  • insert each number into some data structure DD
  • Repeatedly find the smallest number in DD, output it, and remove it
  • To get O(nlogn)O(n \log n) running time, each "find minimum" step and each remove step must additively take O(logn)O(\log n)

Priority Queues

  • Store a set SS of elements where each element vv has a priority value key(v)key(v)
  • smaller key values = higher priorities
  • Supported operations:
    • find element with smallest key
    • remove smallest element
    • insert an element
    • delete an element
    • update the key of an element
  • element deletion and key update require knowledge of the position of the element in the priority
  • combines benefits of both lists and sorted arrays
  • balanced binary tree
  • heap order. For every element vv at a node ii, the element ww at ii's parent satisfies key(w)key(v)key(w) \leq key(v)
  • Each node's key is at least as large as its parent's
  • storing nodes of the heap in an array:
    • Node at index ii has children at indices 2i2i and 2i+12i+1, with a parent at index i2\lfloor \frac{i}{2} \rfloor
    • index 1 is the root
    • if 2i>n2i > n, where nn is the current number of elements in the heap, the node at index ii is a leaf

Inserting an Element: Heapify-up

  • insert a new element at index n+1n+1
  • Fix heap order using Heapify-up(H, n+1)
Heapify-up(H, i)If i>1 thenlet j=parent(i) =i2If key[H[i]] < key[H[j]] then swap entries H[i], H[j]Heapify-up(H, j)\boxed{ \begin{aligned} &\text{Heapify-up(H, i)} \\ &\text{If } i > 1 \text{ then} \\ &\quad\text{let } j = \text{parent(i) } = \lfloor \frac{i}{2} \rfloor \\ &\quad\text{If key[H[i]] < key[H[j]] then } \\ &\qquad\text{swap entries H[i], H[j]} \\ &\qquad\text{Heapify-up(H, j)} \\ \end{aligned} }

Proof of Running Time Complexity for Heapify-up(i)

  • each invocation decreases the second argument by a factor of at least 2
  • after kk invocations, argument is at most i2k\frac{i}{2^k}
  • Therefore i2k1\frac{i}{2^k} \geq 1 which implies that klog2ik \leq \log_2 i therefore heapify up is O(logi)O(\log i)

Deleting an Element: Heapify-down

  • Suppose HH has n+1n+1 elements
  • Delete element H[i]H[i] moving element at H[n+1]H[n+1] to H[i]H[i]
  • If element at H[i]H[i] is too small, fix the heap order using Heapify-up(H, i)
  • If element at H[i]H[i] is too large, fix heap order using Heapify-down(H, i)
Heapify-down(H, i)let n=length(H)If 2i>n thenTerminate with H unchangedElse If 2i<n thenLet Left, Right =2i,2i+1Let j be the index that minimizes key[H[Left]] and key[H[Right]]Else 2i=n thenLet j=2iIf key[H[j]] < key[H[i]] then swap the entries H[i] and H[j]Heapify-down(h,j)\boxed{ \begin{aligned} &\text{Heapify-down(H, i)} \\ &\text{let } n = length(H) \\ &\text{If } 2i > n \text{ then} \\ &\quad\text{Terminate with } H \text { unchanged} \\ &\text{Else If } 2i < n \text{ then} \\ &\quad\text{Let Left, Right } = 2i, 2i+1 \\ &\quad\text{Let } j \text{ be the index that minimizes key[H[Left]] and key[H[Right]]} \\ &\text{Else } 2i = n \text{ then} \\ &\quad\text{Let } j = 2i \\ &\text{If key[H[j]] < key[H[i]] then } \\ &\quad\text{swap the entries H[i] and H[j]} \\ &\quad\text{Heapify-down(h,j)} \\ \end{aligned} }

Proof of Running Time Complexity for Heapify-down(i)

  • Each invocation increases is second argument by a factor of at least 2
  • after kk invocations arguments must be at least i2kni2^k \leq n, which implies that klog2nik \leq \log_2 \frac{n}{i}
  • Therefore running time is O(log2ni) O(\log_2 \frac{n}{i})

Sorting

  • Instance: Nonempty list [x1,x2,,xn][x_1, x_2, \mathellipsis, x_n] of integers
  • Solution: A permutation [y1,y2,,yn][y_1, y_2, \mathellipsis, y_n] of x1,x2,,xnx_1, x_2, \mathellipsis, x_n such that yiyi+1y_i \leq y_{i+1} for all $1 \leq i < n$$

Final algorithm:

  • Insert each number in a priority queue HH
  • Repeatedly find the smallest number in HH, output it, and delete it from HH

Thus, each insertion and deletion take O(logn)O(\log n) for a total running time of O(nlogn)O(n \log n)


Graphs: Motivation

  • Contact tracing hahaaaaa

Taxonomy of a Graph

  • comprised of vertices and (directed / undirected) edges, they can form face.

  • Euler Path: Only possible if the number of nodes with odd degree is at most 2

  • Formally, an Undirected graph G=(V,E)G = (V, E) where V,EV, E are sets of vertices and edges with EV×VE \subseteq V \times V

    • Elements of EE are unordered pairs
    • Edge(u,v)Edge(u, v) is incident on u,vu,v meaning u,vu,v are neighbors of one another
    • Exactly one edge between any pair of nodes
    • GG contains no self loops, i.e. no edges of the from (u,u)(u,u)
  • Formally, an directed graph G=(V,E)G = (V, E) where V,EV, E are sets of vertices and edges with EV×VE \subseteq V \times V

    • Elements of EE are ordered pairs
    • e=(u,v)e = (u, v) where uu is the tail of the edge ee, vv is the head, and we can say that ee is directed from uu to vv
    • A pair of nodes may be connected by two directed edges (u,v)(u,v) and (v,u)(v,u)
    • GG contains no self loops
  • A v1vkv_1 - v_k path in an undirected graph G=(V,E)G = (V,E) is a sequence of PP nodes v1,v2,,vk1,vk;vkVv_1, v_2, \mathellipsis, v_{k-1}, v_k; v_k \in V s.t. every consecutive pair of nodes vi;vi+1,1i<kv_i; v_{i+1}, 1 \leq i < k is connected by an edge EE

  • a path is simple if all its nodes are distinct

  • a cycle is a path where k>2k>2, the first k1k-1 nodes are distinct and v1=vkv_1 = v_k

  • An undirected graph GG is connected if, for every pair of nodes u,vVu,v \in V, there is a path from uu to vv in GG

  • The distance d(u,v)d(u,v) between nodes u,vu,v is the minimum number of edges in any uvu-v path

  • The connected component of GG containing ss is the set of all nodes uu such that there is an sus-u path in GG

Computing Connected Components

  • Rather than computing the connected component of GG that contains ss and check if tt is in that component, we can "explore" GG starting from ss, maintaining a set RR of visited nodes
Rwill consist of nodes to which shas a pathInitially R={s}While there is ad edge (u,v) where uR,vRAdd v to R\boxed{ \begin{aligned} &R \text{will consist of nodes to which } s \text{has a path} \\ &\text{Initially } R = \{s\} \\ &\text{While there is ad edge } (u,v) \text{ where } u \in R, v \notin R \\ &\quad\text{Add } v \text{ to } R \end{aligned} }
  • Explore GG starting at ss and going "outward" in all directions, adding nodes one "layer" at a time
  • Layer L0L_0 only contains ss
  • Layer L1L_1 contains ss and all its neighbors, etc. etc.
  • Layer L0,L1,,Lj,Lj+1L_0, L_1, \mathellipsis , L_j, L_{j+1} contains all nodes that:
    • do not belong to an earlier layer
    • are connected by an edge to a node in layer LjL_j
    • The shortest path from ss to each node contains jj edges
  • Claim: For each j1j \geq 1, layer LjL_j consists of all nodes exactly at distance jj from SS
  • A non-tree edge is an edge of GG that does not belong to the BFS tree TT

Proof

  • There is a path from ss to tt if an only iff tt is a member of some layer
  • Let vv be a node in layer Lj+1L_{j+1} and $$ubethe"first"nodeinbe the "first" node inL_jsuchthatsuch that(u,v)isanedgeinis an edge inG.Considerthegraph. Consider the graph Tformedbyalledges,directedfromformed by all edges, directed fromutotov$
    • Notice that TT is a tree because it is connected and the number of edges in TT is the number of nodes in all the laters minus one
    • TT is called the BFS Tree

Inductive Proof of BFS Distance Property

  • for every j0j \geq 0, for every node uLju \in L_j, d(s,u)=jd(s, u) = j
  • Basis: k=0k =0, d(s,s)=0d(s,s) = 0
  • Inductive Hypothesis: Assume the claim is true for every node vLkv \in L_k, d(s,v)=kd(s, v) = k
  • Inductive Step: Prove for every node xLk+1x \in L_{k+1}, d(s,x)=k+1d(s, x) = k + 1
    • d(s,x)=d(s,y)+1d(s,x) = d(s, y) + 1 if yy is a node in LkL_k
    • Therefore d(s,x)=k+1d(s,x) = k+1
  • Explore GG as if it were a maze: start from ss, traverse first edge out (to node vv), traverse first edge out of vv, . . . , reach a dead-end, backtrack, repeat
Mark uas explored and add it to the reachable setRFor each edge (u,v)incident to uIf v is not marked, invoke DFS(v)\boxed{ \begin{aligned} &\text{Mark } u \text{as explored and add it to the reachable set} R \\ &\text{For each edge } (u,v) \text{incident to } u \\ &\quad\text{If } v \text{ is not marked, invoke DFS(} v \text{)} \\ \end{aligned} }

Properties of each

  • BFS and DFS visit the same set of nodes, but in a different order

Representing Graphs

  • A Graph G=(V,E)G = (V,E) has two input parameters: V=n,E=m|V| =n, |E| = m, with n1m(n2) n-1 \leq m \leq \binom{n}{2}
    • Size of graph is defined to be m+nm + n
    • Strive for algorithms whose running time is linear in graph size i.e. )(m+n))(m + n)
  • We assume that V=1,2,,nV = {1, 2, \mathellipsis, n}
  • Use an Adjaceny Matrix: n×nn \times n boolean matrix, where the entry in row ii, col jj is 1 iff the graph contains the edge (i,j)(i, j)
    • the space used is Θ(n2)\Theta(n^2), which is optimal in the worst case
    • can check if there is an edge between nodes ii, jj in O(1)O(1) time
    • iterate over all the edges incident on node ii in Θ(n)\Theta(n) time
  • Use an Adjaceny List: array AdjAdj, where Adj[v]Adj[v] stores the list of all nodes adjacent to vv
    • an edge e=(u,v)e = (u,v) appears twice: Adj[u],Adj[u], Adj[v]$
    • nvn_v is the number of neighbors of node vv
    • space used is O(vGnv)=O(m+n)O(\displaystyle\sum_{v \in G}n_v) = O(m+n)
    • check if there is an adge between nodes u,vu, v in O(nu)O(n_u) time
    • Iterate over all the edges incident on node uu in Θ(nu)\Theta(n_u) time
    • Inserting an edge takes O(1)O(1) time
    • Deleting an edge takes O(n)O(n) time
Operation/SpaceAdj. MatrixAdj. List
Is (i,j)(i, j) an edge?O(1)O(1) timeO(ni)O(n_i) time
Iterate over edges incident on node iiO(n)O(n) timeO(ni)O(n_i) time
Space usedO(n2)O(n^2)O(m+n)O(m + n)

04 - Linear-Time Graph Algorithms

Bipartite Graphs

  • A graph G=(V,E)G = (V, E) is bipartite if VV can be partitioned into two subsets X,YX,Y such that every edge in EE has one endpoint in XX and one in YY

    • X×XE=X \times X \cap E = \emptyset and Y×YE=Y \times Y \cap E = \emptyset
    • Color the nodes in XX red and the nodes in YY blue. No edge in EE connects nodes of the same graph
  • no cycle with an odd number of nodes is bipartite, similarly all cycles of even length are bipartite

  • Therefore, if a graph is bipartite, then it cannot contain a cycle of odd length

Algorithm for Testing Bipartite

  • Assume GG is connected, otherwise apply the algorithm to each connected component separately
  • Pick an arbitrary node ss and color it red.
  • Color all it's neighbors blue. Color the uncolored neighbors of these nodes red, and so on till all nodes are colored
  • Check if every edge has endpoints of different colours

more formally:

  1. Run BFS on GG. Maintain an array for the color
  2. When we add a node vv to a layer ii, set color[i]\text{color[i]} to red if ii is even, blue of odd
  3. At the end of BFS, scan all the edges to check if there is any edge both of whose endpoints received the same color

Running time is linear proportional to the size of the graph: O(m+n)O(m + n) since we do a constant amount of work per node in addition to the time spent by BFS.

Proof of Correctness of a "two-colorable" algorithm

Need to prove that if the algorithm says GG is bipartite, then it is actually bipartite AND need to prove that if GG is not bipartite, can we determine why?

  • Let GG be a graph and let L0,L1,,LkL_0, L_1, \mathellipsis, L_k be the layers produced by the BFS, starting at node ss. Then exactly one of the following statements is true:
    • No edge of GG joins two nodes in the same layer: Bipartite since nodes in the even laters can be colored red and those in odd blue
    • There is an edge of GG that joins two nodes in the same layer: Not Bipartite
    • LiLj=1,LBFS| L_i - L_j | = 1, \quad \forall L \in BFS

05 - Greedy Algorithms

  • Greedy algorithms: make the best current choice without looking back. Assume the choices made prior were perfect

Example Problem: Interval Scheduling

  • At an amusement park, want to compute the largest number of rides you can be on in one day

  • Input: start and end time of each ride

  • Constraint: cannot be in two places at one time

  • Instance: nonempty set {(s(i),f(i)),1in}\{(s(i), f(i)), 1 \leq i \leq n\} of start and finish times of nn jobs

  • Solution: The largest subset of mutually compatible jobs

  • Two jobs are compatible if they do not overlap

  • Problem models the situation where you have a resource, a set of fixed jobs, and you want to schedule as many jobs as possible

  • For any input set of jobs, the algorithm must provably compute the largest set of compatible jobs (measured by interval count, not cumulative interval length)

Template for a Greedy Algorithm

  • Process jobs in some order. Add next job to the result if it is compatible with the jobs already in the result
  • key question: in what order should we process job?
    • earliest start time: increasing order of start time s(i)s(i)
    • earliest finish time: increasing order of start time f(i)f(i)
    • shortest interval: increasing order of f(i)s(i)f(i) - s(i)
    • fewest conflicts: increasing order of the number of conflicting jobs

Earliest Finish Time

  • the most optimal general solution
Initially let R be the set of all jobs, and let A be emptyWhile R is not yet emptyChoose a job iR that has the smallest finishing time Add request i to ADelete all jobs from R that are not compatible with job iReturn the set A as the set of accepted/scheduled jobs\boxed{ \begin{aligned} &\text{Initially let } R \text{ be the set of all jobs, and let } A \text{ be empty} \\ &\text{While } R \text{ is not yet empty}\\ &\quad\text{Choose a job } i \in R \text{ that has the smallest finishing time } \\ &\quad\text{Add request } i \text{ to } A \\ &\quad\text{Delete all jobs from } R \text{ that are not compatible with job } i \\ &\text{Return the set } A \text{ as the set of accepted/scheduled jobs} \\ \end{aligned} }

Proof of Optimality

  • Claim A|A| is a compatible set of jobs that is the largest possible in any set of mutually compatible jobs
  • Proof by contradiction that there's no "better" solution at each step of the algorithm
    • need to define "better", and what a step is in terms of the progress of the algorithm
    • order the output in terms of increasing finish time for a measure of progress
  • Finishing time of a job rr selected by AA must be less than or equal to the finishing time of a job rr selected by any other algorithm
  • Let OO be an optimal set of jobs. We will show that A=O|A| = |O|
  • Let i1,i2,,iki_1, i_2, \mathellipsis, i_k be the set of jobs in AA in order of finishing time
  • Let j1,j2,,jmj_1, j_2, \mathellipsis, j_m be the set of jobs in OO in order of mkm \geq k
  • Claim: for all indices rk,f(ir)f(jr)r \leq k, \quad f(i_r) \leq f(j_r)
  • Base case: is it possible for finish time of the first job of our algorithm to be later than the opposing job?
    • f(i1)>f(j1)f(i_1) > f(j_1) is not possible, only f(i1)f(j1)f(i_1) \leq f(j_1)
  • Inductive Hypothesis: this is always the case for any generic job index rr:
    • f(ir)f(jr)f(i_r) \leq f(j_r) for some rkr \leq k
  • Inductive Step: show that the same claim holds for the next job:
    • f(ir+1)f(jr+1)f(i_{r+1}) \leq f(j_{r+1}) for some r+1kr+1 \leq k
    • s(jr+1)f(ir)f(ir)s(j_{r+1}) \geq f(i_r) \geq f(i_r)
  • claim m=km = k: f(ik)f(jk)s(jk+1)f(i_{k}) \leq f(j_{k}) \leq s(j_{k+1})

A complete proof can be found here:

Proof for Optimality of Earliest Finish Time Algorithm for Interval Scheduling

Implementing EFT

  • Sort jobs in order of increasing finish time
  • Store starting time of jobs in an array SS
  • k=1k=1
  • While kSk \leq |S|
    • output job kk
    • Let the finish time of job kk be ff
    • Iterate over SS from kk onwards to find the first index ii s.t. S[i]fS[i] \geq f
    • k=ik = i
  • Must be careful to iterate over SS s.t. we never scan same index more than once
  • Running time is O(nlogn)O(n \log n) since it's dominated by the first sorting step

Scheduling to Minimize Lateness

  • Suppose a job ii has a length t(i)t(i) and a deadline d(i)d(i)
  • want to schedule all nn jobs on one resource
  • Goal is to assign a starting time s(i)s(i) to each job such that each job is delayed as little as possible
  • A job ii is delayed if f(i)>d(i)f(i) > d(i); the lateness of job is max(0,f(i)d(i))\max(0, f(i) - d(i))
  • the lateness of a schedule is max1in(max(0,f(i)d(i)))\max \limits_{1 \leq i \leq n}(\max(0, f(i) - d(i)))
    • the largest of each job's lateness values

Minimizing Lateness

  • Instance: Set {(t(i),d(i)),1in}\{ (t(i), d(i)), 1 \leq i \leq n \} of lengths of deadlines of nn jobs
  • Solution: Set {s(i),1in}\{ s(i), 1 \leq i \leq n \} such that max1in(max(0,f(i)d(i)))\max \limits_{1 \leq i \leq n}(\max(0, f(i) - d(i))) is as small as possible

Template for Greedy Algorithm

  • Key question: In what order should we schedule the jobs:

    • Shortest length: increasing order of length t(i)t(i). Ignores deadlines completely, shortest job may have a very late deadline:
    ii12
    t(i)t(i)110
    d(i)d(i)10010
    • Shortest slack time: Increasing order of d(i)t(i)d(i) - t(i). Bad for long jobs with late deadlines. Job with smallest slack may take a long time:
    ii12
    t(i)t(i)110
    d(i)d(i)210
    • Earliest Deadline: Increasing order deadline d(i)d(i). Correct? Does it make sense to tackle jobs with earliest deadlines first?

Proof of Earliest Deadline Optimality

Order the jobs in order of increase deadlines d(i)Assume for simplicity of notation that d(1)d(n)Initially, f=0Consider the jobs i=1,,ninthisorderAssign the job i to the time interval from s(i)=f to f(i)=f+tiff+tiReturn the set of scheduled intervals[s(i),f(i)], for i=1,,n\boxed{ \begin{aligned} &\text{Order the jobs in order of increase deadlines } d(i) \\ &\text{Assume for simplicity of notation that } d(1) \leq \mathellipsis d(n) \\ &\text{Initially, } f = 0 \\ &\text{Consider the jobs } i=1, \mathellipsis, n { in this order }\\ &\quad\text{Assign the job } i \text{ to the time interval from } s(i) = f \text{ to } f(i) = f + t_i\\ &\quad f \leftarrow f + t_i \\ &\text{Return the set of scheduled intervals} [s(i), f(i)], \text{ for } i = 1, \mathellipsis, n\\ \end{aligned} }

Inversions

  • A schedule has an inversion if a job ii with a deadline d(i)d(i) is scheduled before another job jj with an earlier deadline d(j)d(j) i.e. d(j)<d(i),s(i)<s(j)d(j) < d(i), s(i) < s(j)
    • if two jobs have the same deadline, they cannot cause an inversion
  • in nn jobs, the maximum amount of inversion is (n2)n \choose 2
  • Claim: if a schedule has and inversion, then there must be a pair of jobs i,ji, j such that jj is scheduled immediately after ii and d(j)<d(i)d(j) < d(i)

Proof of Local Inversion

  • If we have an inversion between l,ml, m s.t. s(l)<s(m),d(l)>d(m)s(l) < s(m), d(l) > d(m), then we can find some inverted i,ji,j scheduled between l,ml,m
  • This is because: in a list where each element is greater than the last (in terms of deadline), then the list is sorted.
  • The contrapositive of this is: if a list is unsorted, then there are two adjacent elements that are unsorted.

Properties of Schedules

  • Claim 1: The algorithm produces a schedule with no inversion and no idle time (i.e. jobs are tightly packed)
  • Claim 2: All schedules (produced from the same input) with no inversions have the same lateness
    • Case 1: All jobs have distinct deadlines. There is a unique schedule with no inversions and no idle time.
    • Case 2: Some jobs have the same deadline. Ordering of the jobs does not change the maximum lateness of these jobs.
  • Claim 3: There is an optimal schedule with no idle time.
  • Claim 4: There is an optimal schedule with no inversions and no idle time.
    • Start with an optimal schedule OO (that may have inversions) and use an exchange argument to convert OO into a schedule that satisfies Claim 4 and has lateness not larger than OO.
    • If OO has an inversion, let i,ji, j be consecutive inverted jobs in OO. After swapping i,ji, j we get a schedule OO' with one less inversion.
    • Claim: The lateness of OO' is no larger than the lateness of OO
    • It is sufficient to prove the last item, since after (n2)n \choose 2 swaps, we obtain a schedule with no inversions whose lateness is no larger than that of OO
    • In OO, assume each job rr is scheduled for the interval [s(r),f(r)][s(r), f(r)] and has lateness l(r)\mathcal l(r). For OO', let the lateness of rr be l(r)l'(r)
      • Claim: l(k)=l(k),ki,jl'(k) = l(k), \quad \forall k \neq i,j
      • Claim: l(j)=l(j)l'(j) = l(j),
      • Claim: l(j)l(j)l'(j) \leq l(j) because l(j)=f(j)dif(j)dj=l(j)l'(j) = f(j) - d_i \leq f(j) - d_j = l(j)
    • N.B. common mistakes with exchange arguments:
      • Wrong: start with algorithm's schedule A and argue it cannot be improved by swapping two jobs
      • Correct: Start with an arbitrary schedule O (which can be optimal) and argue that O can be converted into a schedule that is essentially the same as A without increasing lateness
      • Wrong: Swap two jobs that are not neighboring in OO. Pitfall is that the completion time of all intervening jobs then changes
      • Correct: Show that an inversion exists between two neighboring jobs and swap them
  • Claim 5: The greedy algorithm produces an optimal schedule, follows from 1, 2, 4

Summary

  • Greedy algorithms make local decisions
  • Three strategies:
    • Greedy algorithm stays ahead - Show that after each step in the greedy algorithm, its solution is at least as good as that produced by any other algorithm
    • Structural bound - First, discover a property that must be satisfied by every possible solution. Then show that the (greedy) algorithm produces a solution with this property
    • Exchange argument - Transform the optimal solution in steps into the solution by the greedy algorithm without worsening the quality of the optimal solution

06 - Greedy Graph Algorithms

The Shortest Path Problem

  • G(V,E)G(V,E) is a connected, directed graph. Each edge has a length l(e)0l(e) \geq 0
  • length of a path PP us the sum of the lengths of the edges in PP
  • Goal: compute the shortest path from a specified start node ss to every other node in the graph
  • Instace: A directed graph G(V,E)G(V,E), a function I:ER+I: E \rightarrow \reals^+ and a node sVs \in V
  • Solution: A set {Pu,uV}\{P_u, u \in V\} of paths, where PuP_u is the shortest path in GG from ss to uu

Generalizing BFS

  • If all edges have the same wight, or distance, BFS would work, processing nodes in order of distance.
  • What if the graph has integer edge weights, can we make the graph unweighted?
    • yes, placing dummy edges and nodes to pad out lengths > 1 at the expense of memory and running time
    • Edge weight of ww gets w1w - 1 nodes
    • Size of the graph (and therefore runtime) becomes m+n+eEl(e)m + n + \sum_{e \in E} l(e). Pseudo-polynomial time depending on input values

Dijkstra's Algorithm

  • Famous for pointing out the harm of the goto command, in doing so developed the shortest path algorithm
  • Like BFS: explore nodes in non-increasing order of distance ss. Once a node is explored, its distance is fixed
  • Unlike BFS: Layers are not uniform, edges are explored by evaluating candidates by edge wight
  • For each unexplored node, determine "best" preceding explored node. Record shortest path length only through explored nodes
  • Like BFS: Record previous node in path, and build a tree.

Formally:

  • Maintain a set SS of explored nodes
    • For each node uSu \in S, compute d(u)d(u), which (we will prove, invariant) is the length of the shortest path from ss to uu
    • For each node xSx \notin S, maintain a value d(x)d'(x), which is the length of the shortest path from ss to xx using only the nodes in SS and xx itself
  • Greedily add a node vv to SS that has the smallest value of d(v)d'(v)
S={s} and d(s)=0while SVfor every node xVSSet d(x)=min(u,x):uS(d(u)+l(u,x))Set v=argminxVS(d(x))Add v to S and set d(v)=d(v)\boxed{ \begin{aligned} &S = \{s\} \text{ and } d(s) = 0 \\ &\text{while } S \neq V \\ &\quad\text{for every node } x \in V - S \\ &\qquad\text{Set } d'(x) = \min_{(u, x): u \in S} (d(u) + l(u, x)) \\ &\quad\text{Set } v = \arg \min_{x \in V - S }(d'(x)) \\ &\quad\text{Add } v \text{ to } S \text{ and set } d(v) = d'(v)\\ \end{aligned} }

What does v=argminxVS(d(x))v = \arg \min_{x \in V - S }(d'(x)) mean?

  • Run over all unexplored nodes xVSx \in V - S
  • examine all dd' values for each node
  • Return the argument (i.e. the node) that has the smallest value of d(x)d'(x)

To compute the shorts paths: when adding a node vv to SS, store the predecessor uu that minimizes d(v)d'(v)

Proof of Correctness

  • Let PuP_u be the path computed by the algorithm for an arbitrary node uu

  • Claim: PuP_u is the shortest path from ss to uu

  • Prove by induction on the size of SS

    • Base case: S=1|S| = 1. The only node in SS is ss
    • Inductive Hypothesis: S=k|S| = k for some k1k \geq 1. The algorithm has computed PuP_u for every node uSu \in S. Strong induction.
    • Inductive step: S=k+1|S| = k+1 because we add the node vv to SS. Could the be a shorter path PP from ss to vv? We must prove this is not the case.
      • poll: PP' must contain an edge from x to y where x is explore (in S) and y is unexplored (in V - S)
      • poll: The node v in P' must be explored (in S)
  • Locate key nodes on PP'

    • Break PP' into sub-paths from ss to xx, xx to yy, yy to vv - use ll to denote the lengths of the sub-paths in PP' - d(x)l(s,x)d(x) \leq l(s, x) - d(u)+l(u,v)d(x)+l(x,y)d(u) + l(u,v) \leq d(x) + l(x, y) - 0l(y,v)0 \leq l(y,v) - d(u)+l(u,v)=d(v)l(P)=l(s,x)+l(x,y)+l(y,v)d(u) + l(u,v) = d(v) \leq l(P') = l(s,x) + l(x,y) + l(y,v)

Observations about Dijkstra's Algorithm

  • As described, it cannot handle negative edge lengths
  • Union of shortest paths output by Dijkstra's forms a tree: why?
  • Union of the shortest paths from a fixed source ss forms a tree; paths not necessarily computed by computed by Dijkstra's

Running time of Dijkstra's

S={s} and d(s)=0while SVfor every node xVSSet d(x)=min(u,x):uS(d(u)+l(u,x))Set v=argminxVS(d(x))Add v to S and set d(v)=d(v)\boxed{ \begin{aligned} &S = \{s\} \text{ and } d(s) = 0 \\ &\text{while } S \neq V \\ &\quad\text{for every node } x \in V - S \\ &\qquad\text{Set } d'(x) = \min_{(u, x): u \in S} (d(u) + l(u, x)) \\ &\quad\text{Set } v = \arg \min_{x \in V - S }(d'(x)) \\ &\quad\text{Add } v \text{ to } S \text{ and set } d(v) = d'(v)\\ \end{aligned} }
  • VV has nn nodes and mm edges, so their are n1n-1 iterations on the while loops
  • In each iteration, for each node xVSx \in V - S, compute d(x)=min(u,x):uS(d(u)+l(u,x))d'(x) = \min_{(u, x): u \in S} (d(u) + l(u, x)) which is proportional the the number of edges incident on xx
  • Running time per iteration is O(m)O(m), since the algorithm processes each edge (u,x)(u,x) in the graph exactly once (when computing d(x)d'(x))
  • Therefore, the overall running time is O(mn)O(mn)

Optimizing our Algorithm

  • Observation, if we add vv to SS, d(x)d'(x) changes only if (v,x)(v, x) is an edge in GG
  • Idea: For each node xVSx \in V - S, store the current value of d(x)d'(x). Upon adding a node vv to SS, update d()d'() only for neighbors of vv
  • How do we efficiently compute v=argminxVS(d(x))v = \arg \min_{x \in V - S }(d'(x))
    • Priority Queue!
Insert(Q,s,0)S={s} and d(s)=0while SV(v,d(v))=ExtractMin(Q)Add v to S and set d(v) = d’(v)for every node xVS s.t.(v,x) is an edge in GIf d(v)+l(v,x)<d(x)d(x)d(v)+l(v,x)ChangeKey(Q,x,d(x))\boxed{ \begin{aligned} &Insert(Q, s, 0) \\ &S = \{s\} \text{ and } d(s) = 0 \\ &\text{while } S \neq V \\ &\quad(v, d'(v)) = ExtractMin(Q) \\ &\quad\text{Add v to S and set d(v) = d'(v)}\\ &\quad\text{for every node } x \in V - S \text{ } s.t. (v,x)\text{ is an edge in } G \\ &\qquad\text{If } d(v) + l(v,x) < d'(x) \\ &\qquad\quad d'(x) d(v) + l(v,x) \\ &\qquad\quad ChangeKey(Q,x,d'(x)) \\ \end{aligned} }
  • For each node xVSx \in V - S, store the pair (x,d(x))(x, d'(x)) in a priority queue QQ with d(x)d'(x) as the key
  • Determine the next node vv to add to SS using ExtractMinExtractMin
  • After adding vv to SS, for each node xVSx \in V - S such that there is an edge from vv to xx, check if d(x)d'(x) should be updated i.e. if there is a shortest path from ss to xx via vv
  • in line 8, if xx is not in QQ, simply insert it

New Runtime

  • ExtractMinExtractMin happens n1n - 1 times
  • For every node vv, the running time of step 5 O(degv)O(deg_v), the number of outgoing neighbors of vv: vVO(degv)=O(m)\sum_{v \in V} O(deg_v) = O(m)
  • ChangeKeyChangeKey is invoked at most once for each edge, mm times, and is an O(logn)O(\log n) operation
  • So, the total runtime is O(mlogn)O(m \log n)

Network Design

  • want to connect a set of nodes using a set of edges with certain properties
  • Input is usually a graph, and the desired network (output) should use a subset of edges in the graph
  • Example: connect all nodes using a cycle of shortest total length. This problem is the NP-complete traveling salesman problem

Minimum Spanning Tree (MST)

  • Given an undirected graph G(V,E)G(V,E) with a cost c(e)>0c(e) > 0 associated with each edge eEe \in E.
  • Find a subset TT of edges such that the graph (V,T)(V,T) is connected and the cost eTc(e)\sum_{e \in T} c(e) is as small as possible
  • Instance: An undirected graph G(V,E)G(V,E) and a function c:ER+c: E \rightarrow \R^+
  • Solution: A set TET \sube E of edges such that (V,T)(V, T) is connected and the cost eTc(e)\sum_{e \in T} c(e) is as small as possible
  • Claim: if TT is a minimum-cost solution to this problem, then (V,T)(V,T) is a tree
  • A subset TofET of E is a spanning tree of GG if (V,T)(V, T) is a tree

Characterizing MSTs

  • Does the edge of smallest cost belong to an MST? Yes
    • Wrong proof: Because Kruskal's algorithm adds it.
    • Right proof: tbd
  • Which edges must belong to an MST?
    • What happens when we delete an edge from an MST?
    • MST breaks up into sub-trees
    • Which edge should we add to join them?
  • Which edges cannot belong to an MST?
    • What happens when we add an edge to an MST?
    • We obtain a cycle
    • Which edge in the cycle can we be sure does not belong to an MST

Greedy Algorithm for the MST Problem

  • Template: process edges in some order. Add an edge to TT if tree property is not violated.
    • increasing cost order: Process edges in increasing order of cost. Discard an edge if it creates a cycle – Kruskal's
    • Dijkstra-like: Start from a node ss and grow TT outward from ss: add the node that can attached most cheaply to current tree – Prim's
    • Decreasing cost order: Delete edges in order of decreasing cost as long as graph remains connected – Reverse-Delete
    • each of these works
    • Simplifying assumption: all edge costs are distinct

Graph Cuts

  • a cut in a graph G(V,E)G(V,E) is a set of edges whose removal disconnects the graph (into two or more connected components)
  • Every set SVS \sub V (SS cannot be empty or the entire set VV) has a corresponding cut: cut(SS) is the set of edges (v,w) such that vS,wVSv \in S, w \in V - S
  • cut(SS) is a "cut" because deleting the edges in cut(SS) disconnects SS from VSV - S
  • Claim: for every SV,SS \sub V, S \neq \empty, every MST contains the cheapest edge in cut(S(S)
    • will have to proof by contradiction using exchange argument

Proof of Cut Property of MSTs

  • Negation of the desired property: There is a set SVS \sub V and an MST TT such that TT does not contain the cheapest edge in cut(SS)
  • Proof strategy: If TT does not contain ee, show that there is a tree with a smaller cost than TT that contains ee.
  • Wrong proof:
    • Since TT is spanning, it must contain some edge e.g. ff in cut($S)
    • T{f}{e}T - \{f\} \cup \{e\} has smaller cost than TT but may not be a spanning tree
  • Correct proof:
    • Add ee to TT forming a cycle
    • This cycle must contain an edge ee' in cut(SS)
    • T{e}{e}T - \{e'\} \cup \{e\} has smaller cost than TT and is a spanning tree

Prim's Algorithm

  • Maintain a tree (S, T), i.e., a set of nodes and a set of edges, which we will show will always be a tree
  • Start with an arbitrary node sSs \in S
  • Step 3 is the cheapest edge in cut(S) for the currently explored set S
  • In other words, each step in Prim's algorithm computes and adds the cheapest edge in the current value of cut(SS)
    • argmin(u,v):uS,vVSc(u,v)argmin(u,v)cut(S)c(u,v)\arg \min_{(u,v):u \in S, v \in V - S} c(u,v) \equiv \arg \min_{(u,v) \in cut(S)} c(u,v)
S={s} and T=while SVCompute (u,v)=argmin(u,v):uS,vVSc(u,v)Add the node v to S and add the edge (u,v) to T\boxed{ \begin{aligned} &S = \{s\} \text{ and } T = \empty \\ &\text{while } S \neq V \\ &\quad\text{Compute } (u, v) = \arg \min_{(u,v):u \in S, v \in V - S} c(u,v) \\ &\quad\text{Add the node } v \text{ to } S \text{ and add the edge } (u,v) \text{ to } T \end{aligned} }

Optimality of Prim's

  • Claim: Prim's algorithm outputs an MST
    • Prove that every edge inserted satisfies the cut property (true by construction, in each iteration (u,v)(u, v) is necessarily the cheapest edge in cut(SS) for the current value of SS)
    • Prove that the graph constructed is a spanning tree
      • Why are there no cycles in (V,T)(V, T)
      • Why is (V,T)(V,T) a spanning tree (edges in TT connect all nodes in VV) - Because GG is connected, if there were an unconnected node vv at the end, then Prim's would not have terminated

Final version of Prim's Algorithm

Insert(Q,s,0,)while SV(v,a(v),u)=ExtractMin(Q)Add node v to S and edge (u,v) to Tfor every node xVS s.t. (v,x) is ans edge in G if c(v,x)<a(x)then a(x)=c(v,x)ChangeKey(Q,x,a(x),v)\boxed{ \begin{aligned} &Insert(Q, s, 0, \empty) \\ &\text{while } S \neq V \\ &\quad(v, a(v), u) = ExtractMin(Q) \\ &\quad\text{Add node } v \text{ to } S \text{ and edge } (u,v) \text{ to } T \\ &\quad\text{for every node } x \in V - S \text{ s.t. } (v, x) \text{ is ans edge in } G \\ &\qquad\text{ if } c(v,x) < a(x) \text{then }\\ &\qquad\quad a(x) = c(v,x) \\ &\qquad\quad ChangeKey(Q,x,a(x),v) \\ \end{aligned} }
  • QQ is a priority queue
  • Each element in QQ is a triple, the node, its attachment cost, and its predecessor in the MST
  • In step 8, if xx is not already in QQ, simply insert (x, a(x), v) into QQ
  • Total of n1n - 1 ExtractMinExtractMin and mm ChangeKeyChangeKey operations, yielding a running time of O(mlogn)O(m \log n)
  • running time of step 5 is proportional to the degree of xx which is proportional to the number of edges in the graph mm

Kruskal's Algorithm

  • Start with an empty set TT of edges
  • Process edges in EE in increasing order of cost
  • Add the next edge ee to TT only if adding ee does not create a cycle. Discard ee otherwise
  • Note: at any iteration, TT may contain several connected components and each node in VV is in some component
  • Claim: Kruskal's algorithm outputs an MST
    • For every edge ee added, demonstrate the existence of a set SVS \sub V (and VSV - S) such that ee and SS satisfy the cut property, i.e.e, the cheapest edge in cut(S)cut(S)
      • If e=(u,v)e = (u,v), let SS be the set of nodes connected to uu in the current graph TT
      • Why is ee the cheapest edge in cut(SS) - because we process them in increasing order of cost
    • Prove that the algorithm computes a spanning tree
      • (V,T)(V,T) contains no cycles by construction
      • If (V,T)(V,T) is not connected, then there exists a subset SS of nodes not connected to VSV-S. What is the contradiction?

Implementing Kruskal's Algorithm

  • start with an empty set TT of edges
  • Process edges in EE in increasing order of cost
  • Add the next edge ee to TT only if adding ee does not create a cycle
  • Sorting edges takes O(mlogn)O(m \log n) time
  • Key question: "Does adding e=(u,v)e = (u,v) to TT create a cycle?
    • Maintain set of connected components of TT
    • Find(u)Find(u): return the name of the connected component of TT that uu belongs to
    • Union(A,B)Union(A,B): merge connected components A, B

Analysing Kruskal's Algorithm

  • How many FindFind invocations does Kruskal's need? 2m2m
  • How many UnionUnion invocations? n1n-1
  • Two implementations of UnionFindUnion-Find:
    • Each FindFind take O(1)O(1), kk invocations of UnionUnion takes O(klogk)O(k \log k) time in total – O(m+nlogn)O(m + n \log n)
    • Each FindFind take O(logn)O(\log n ), each invocation of UnionUnion takes O(1)O(1)O(mlogn+n)O(m \log n + n)
  • In general m<nm < n, but in either case, the total running time is O(mlogn)O(m \log n) since we have to spend O(mlogn)O(m \log n) time sorting the edges by increasing cost, regardless of the underlying implementation of the UnionFindUnion-Find data structure

Comments on Union-Find and MST

  • useful to maintain connected components of a graph as edges are added to the graph
  • Data structure does not support edge deletion efficiently
  • Current best algorithm for MST runs in O(mα(m,n))O(m\alpha (m,n)) time (Chazelle 2000) where α(m,n)\alpha(m,n) is a function of m,nm,n and O(m)O(m) randomized time
    • Let logn=\log^* n = the number of times you take logn\log n before you reach 1
    • e.g. log(210)=4\log^*(2^{10}) = 4, log(2210)=5\log^*(2^{2^{10}}) = 5
  • Holy grail: O(m)O(m) deterministic algorithm for MST

Cycle Property

  • When can we be sure that an edge cannot be in any MST
  • Let CC be any cycle in GG and let e=(v,w)e = (v,w) be the most expensive edge in CC
  • Claim: ee does not belong to any MST of G
  • Proof: exchange argument.
    • If a supposed MST TT contains ee, show that there is a tree with smaller cost than TT that does not contain ee

Reverse-Delete Algorithm

¯\(ツ)

Observations about MST algorithms

  • Any algorithm that constructs a spanning tree by including the Cut and Cycle properties (include edges that satisfy the cut P, delete edges that satisfy the cycle property) will be an MST

07 - Applications of Minimum Spanning Trees

Minimum Bottleneck Spanning Tree (MBST)

  • MST minimizes the total cost of the spanning network
  • Consider another network design criterion
    • build a network connecting all cities in mountainous region, but ensure the highest elevation is as low as possible
    • total road length is not a criterion
  • Idea: compute an MST in which the edge with the highest cost is as cheap as possible
  • In an undirected graph G(V,E)G(V,E), let (V,T)(V,T) be a spanning tree. The bottleneck edge in TT is the edge with the largest cost in TT
  • Instance: An an undirected graph G(V,E)G(V,E) such that V,TV,T is a spanning tree
  • Solution: A Set TET \sube E of edges such that (V,T)(V,T) is a spanning tree and there is no spanning tree in GG with a cheaper bottleneck

Is every MST and MBST, or vice versa?

  • If a tree is an MBST, it might not be an MST
  • If a tree is an MST, it will be an MBST

Proof:

  • Let TT be the MST, and let TT' be a spanning tree with a cheaper bottleneck edge.
  • Let ee be the bottleneck edge in TT
  • Every edge in TT' has lower cost than ee
  • Adding ee to TT' creates a cycle consisting only of edges, where ee is the costliest edge in the cycle
  • Since ee is the costliest edge in this cycle, by the cycle property, ee cannot belong to any MST, which contradicts the fact that TT is an MST

Motivation for Clustering

  • Given a set of objects and distances between them
  • objects can be anything
  • Distance function: increase distance corresponds to dissimilarity
  • Goal: group objects into clusters, where each cluster is a group of similar objects

Formalizing the Clustering Problem

  • Let UU be the set of nn objects labels p1,p2,,pnp_1, p_2, \mathellipsis, p_n
  • For every pair pi,pjp_i, p_j, we have a distance d(pi,pj)d(p_i, p_j)
  • We require that d(pi,pi)=0d(p_i, p_i) = 0, d(pi,pj)>0d(p_i, p_j) > 0, d(pi,pj)=d(pj,pi)d(p_i, p_j) = d(p_j, p_i),
  • Given a positive integer kk, a k-clustering of UU is a partition of UU into kk non-empty subsets or clusters C1,C2,...,CkC_1, C_2, ..., C_k
  • that spacing of a clustering is the smallest distance between objects in 2 different subsets:
    • spacing(C1,C2,...,Ck)=min1i,jk,ij,pCi,qCjd(p,q)spacing(C_1, C_2, ..., C_k) = \min_{1 \leq i,j \leq k, i\neq j, p \in Ci, q \in C_j} d(p,q)
    • spacing is the minimum of distance between objects in two different subsets
minimum = inifnity
for every cluster C_i
  for every cluster C_j ≠ i
    for every point p in C_i
      for every point q in C_j
        if d(p,q) < minimum
          minimum = d(p,q)

return minimum
  • Clustering of Maximum Spacing
    • Instance: a Set UU of objects, a distance function d:U×UR+d : U \times U \rightarrow \Reals^+
    • Solution: A k-clustering of UU whose spacing is the largest over all possible k-clusterings
    • O(n2)O(n^2) on nn clusters and then O(i×j)O(i \times j) on points in disparate cluster

Algorithm for Max Spacing

  • intuition: greedily cluster objects in increasing order of distance
  • Let C\mathcal C be the set of nn clusters, with each object in UU in its own cluster
  • Process pairs of objects in increasing order of distance
    • Let (p,q)(p,q) be the next pair with pCpp \in C_p and qCpq \in C_p
    • If CpCqC_p \neq C_q add a new cluster CpCqC_p \cup C_q to C\mathcal{C}, delete Cp,CqC_p, C_q from C\mathcal{C}
  • Stop when there are kk cluster in C\mathcal{C}
  • Same as Kruskal's algorithm, but do not add the last k1k-1 edges in MST
  • Given a clustering C\mathcal{C}, what is spacing(C\mathcal{C})?
    • The length of the next edge that would be added - the cost of the (k1)(k-1)st most expensive edge in the MST. Let this cost be dd^*

Proof of optimal computing

  • Let C\mathcal{C'} be any other cluster (with kk clusters).
  • We will prove that spacing(C\mathcal{C'}) d\leq d^*
    • There must be two objects pi,pjUp_i, p_j \in U in the same cluster CrCC_r \in \mathcal{C} but in different clusters in C\mathcal{C'}: spacing(C\mathcal{C'}) d(pi,qi)\leq d(p_i, q_i)
    • Suppose piCsp_i \in C'_s and pjCtp_j \in C'_t in C\mathcal{C}
    • All edges in the path QQ connecting pipjp_i \rightarrow p_j in the MST have lengths d\leq d^*
    • In particular, there is an object pCsp \in C'_s and an object pCsp' \notin C'_s s.t. pp and pp' are adjacent in QQ
    • d(p,p)d    d(p, p') \leq d^* \implies spacing(C\mathcal{C'}) d(p,p)d\leq d(p, p') \leq d^*

Explanation

  • dd^* is the name we give to the variable denoting the spacing produced by our algorithm$
  • The above proof shows that any other clustering will have a smaller spacing

08 - Divide and Conquer Algorithms

Mergesort

  • Instance: nonempty list L=x1,x2,...,xnL = x_1, x_2, ..., x_n integers
  • Solution: A permutation y1,y2,...,yny_1, y_2, ..., y_n of LL such that yiyi+1y_i \leq y_{i+1} for all 1in1 \leq i \leq n
  • Mergesort is a divide and conquer algorithm for sorting
    • Partition LL into two lists A,BA, B of sizes n/2,n/2\lfloor n/2 \rfloor, \lceil n/2 \rceil
    • Recursively sort A,BA, B,
    • Merge the sorted lists A,BA,B into a sorted list

Merging Two Sorted Lists

  • Maintain a current pointer for each list
  • initialize each pointed to the front of the list
  • While both lists are non-empty:
    • let ai,bia_i, b_i be the elements pointed to by each pointer
    • append the smaller of the two to the output list
    • advance the current pointer in the list that the smaller element belonged to
  • append the rest of the non-empty list to the output
  • Running time of this algorithm is O(k+l)O(k+l), k=len(L)k = len(L)

Analysing Mergesort

  • Running time for LL = running time for A+A + running time for B+B + time to split the input into two lists ++ time to merge two sorted lists
    • what is this in the worst case?
    • \leq worst case running time for n/2+n/2+\lfloor n/2 \rfloor + \lceil n/2 \rceil + time to split input into two lists + time to merge two sorted lists
  • This sucks to read, need a shorthand. Let's assume nn is a power of 2
  • Let T(n)T(n) be the worst-case running time for nn elements, n1\forall n \geq 1
  • Worst-case running time:
    • T(n)2T(n/2)+cn,n>2T(n) \leq 2T(n/2) + cn, n > 2, cc is some constant time to split the list
    • T(2)cT(2) \leq c
    • Can assume T(n)=O(nlogn)T(n) = O(n \log n) for problem sets

Analysing Mergesort in the Worst Case

  • Partition LL into two lists AA and BB of size n/2,n/2\lfloor n/2 \rfloor, \lceil n/2 \rceil respectively
  • Recursively sort AA and BB
  • Merge the sorted lists A,BA, B into a single sorted list
  • Worst case running time for nn elements ≤
    • worst case running time for n/2\lfloor n/2 \rfloor +
    • worst case running time for n/2\lceil n/2 \rceil +
    • time to split the input into two lists +
    • time to merge two sorted lists
  • Assuming nn is a power of 2: T(n)2t(n/2)+cn,n>2T(n) \leq 2t(n/2) + cn, \quad n > 2
  • Three basic ways of solving this recurrence relation:
    • "Unrolling" the recurrence
    • Guess a solution and substitute into recurrence to check
    • Guess a solution in O()O( ) form and substitute into recurrence to determine the constants

Unrolling

  • Input to each sub problem on level ii has size n/2in/2^i
  • Recursion tree has logn\log n levels
  • Number of sub-problems on level ii has size 2i2^i
  • Total work done at each level is cncn
  • Running time of the algorithm is cnlogncn \log n

Substituting a Solution into the Recurrence

  • Guess that the solution is T(n)cnlognT(n) \leq cn \log n (log base 2)

  • Use induction to check if the solution satisfies the recurrence relation

  • Base case: n=2n = 2. Is T(2)=c2clog2T(2) = c \leq 2c \log 2? yes.

  • (strong) Inductive Hypothesis: must include n/2n/2

    • Assume T(m)cmlog2mT(m) \leq cm \log_2 m, for all 2m<n2 \leq m < n, therefore
    • T(n2)cn2log(n2)T({n \over 2}) \leq {cn \over 2} \log ({n \over 2})
  • Inductive Step: Prove T(n)cnlognT(n) \leq cn \log n

    T(n)2T(n2)+cn2(cn2log(n2))=cnlog(n2)+cn=cnlogncn+cn=cnlogn\begin{aligned} T(n) &\leq 2T({n \over 2}) + cn \\ &\leq 2({cn \over 2} \log ({n \over 2})) \\ &= cn \log ({n \over 2}) + cn \\ &= cn \log n - cn + cn \\ &= cn \log n \\ \end{aligned}
  • Why is T(n)kn2T(n) \leq kn^2 a "loose" bound?

  • Why doesn't an attempt to prove T(n)knT(n) \leq kn