CS 4104 Algorithms Notes



Introduction

Notes from CS 4104 with exercises and derivations from "Algorithm Design" by Kleinberg and Tardos.

Glossary

| Term | Definition |
| --- | --- |
| Matching | each man is paired with ≤ 1 woman and vice versa |
| Perfect Matching | each man is paired with exactly 1 woman and vice versa |
| Rogue Couple | a man and a woman who are not matched, but prefer each other to their current partners |
| Stable Matching | a perfect matching without any rogue couples |
| polynomial running time | there exist constants $c, d > 0$ such that $\forall n$, the running time is bounded by $cn^d$ |
| Asymptotic Upper Bound | a function $f(n)$ is $O(g(n))$ if there exist constants $c > 0, n_0 \geq 0$ s.t. $\forall n \geq n_0, \ f(n) \leq cg(n)$ |
| Asymptotic Lower Bound | a function $f(n)$ is $\Omega(g(n))$ if there exist constants $c > 0, n_0 \geq 0$ s.t. $\forall n \geq n_0, \ f(n) \geq cg(n)$ |
| Asymptotic Tight Bound | a function $f(n)$ is $\Theta(g(n))$ if $f(n)$ is $O(g(n))$ and $f(n)$ is $\Omega(g(n))$ |
| Euler Path | only possible if the number of nodes with odd degree is at most 2 |
| A $v_1 - v_k$ path | in an undirected graph $G = (V,E)$, a sequence $P$ of nodes $v_1, v_2, \ldots, v_{k-1}, v_k \in V$ s.t. every consecutive pair of nodes $v_i, v_{i+1}$, $1 \leq i < k$, is connected by an edge in $E$ |
| simple path | all its nodes are distinct |
| cycle | a path where $k > 2$, the first $k-1$ nodes are distinct, and $v_1 = v_k$ |
| connected undirected graph | for every pair of nodes $u, v \in V$, there is a path from $u$ to $v$ in $G$ |
| distance $d(u,v)$ | the minimum number of edges in any $u$-$v$ path |
| connected component of $G$ containing $s$ | the set of all nodes $u$ such that there is an $s$-$u$ path in $G$ |
| Adjacency Matrix | $n \times n$ boolean matrix, where the entry in row $i$, col $j$ is 1 iff the graph contains the edge $(i,j)$ |
| Adjacency List | array $Adj$, where $Adj[v]$ stores the list of all nodes adjacent to $v$ |
| bipartite | a graph $G = (V,E)$ is bipartite if $V$ can be partitioned into two subsets $X, Y$ such that every edge in $E$ has one endpoint in $X$ and one in $Y$ |
| Greedy algorithm | makes the best current choice without looking back, assuming the choices made prior were perfect |
| inversion | a schedule has an inversion if a job $i$ with deadline $d(i)$ is scheduled before another job $j$ with an earlier deadline $d(j)$, i.e. $d(j) < d(i)$ and $s(i) < s(j)$ |
| spanning tree | a subset $T \subseteq E$ is a spanning tree of $G$ if $(V,T)$ is a tree |

01 - Stable Matching

Problem

  • Each man ranks all the women in order of preference

  • Each woman ranks all the men in order of preference

  • Each person uses all ranks from $1, \ldots, n$ s.t. there are no ties or incomplete lists

Male preference matrix:

| | $w_1$ | $w_2$ | $w_3$ | $w_4$ |
| --- | --- | --- | --- | --- |
| $m_1$ | 1 | 2 | 3 | 4 |
| $m_2$ | 4 | 3 | 1 | 2 |
| $m_3$ | 4 | 3 | 1 | 2 |
| $m_4$ | 3 | 1 | 4 | 2 |

Female preference matrix:

| | $m_1$ | $m_2$ | $m_3$ | $m_4$ |
| --- | --- | --- | --- | --- |
| $w_1$ | 1 | 2 | 3 | 4 |
| $w_2$ | 4 | 1 | 3 | 2 |
| $w_3$ | 4 | 1 | 2 | 3 |
| $w_4$ | 3 | 1 | 2 | 4 |
  • A Matching: each man is paired with ≤ 1 woman and vice versa
  • A Perfect Matching: each man is paired with exactly 1 woman and vice versa
    • "perfect" means one-to-one mapping
  • A Rogue Couple: a man and a woman who are not matched, but prefer each other to their current partners
  • A Stable Matching: A perfect matching without any rogue couples

Solution: Gale-Shapley Algorithm

$$\boxed{ \begin{aligned} &\text{Initially, all men and all women are free} \\ &\text{Set } S \text{ of matched pairs is empty} \\ &\text{While there is at least one free man who has not proposed to every woman} \\ &\quad\text{Choose such a man } m \\ &\qquad m \text{ proposes to his highest-ranked woman } w \text{ to whom he has not yet proposed} \\ &\quad\text{If } w \text{ is free} \\ &\qquad\text{she becomes engaged to } m \text{ s.t. } S \leftarrow S \cup \{m, w\} \\ &\quad\text{Else if } w \text{ is engaged to } m' \text{ and she prefers } m \text{ to } m' \\ &\qquad\text{she becomes engaged to } m \text{ s.t. } S \leftarrow S \cup \{m, w\} \\ &\qquad m' \text{ becomes free s.t. } S \leftarrow S \setminus \{m', w\} \\ &\quad\text{Otherwise, } m \text{ remains free} \\ &\text{Return set } S \text{ of engaged pairs} \end{aligned} }$$
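The loop above translates almost line for line into Python. A sketch (the function and variable names are mine, not from the text; preferences are dicts mapping each person to a list of partners, best first):

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """Compute a stable matching; returns a dict mapping man -> woman."""
    next_choice = {m: 0 for m in men_prefs}   # index of next woman m proposes to
    engaged_to = {}                           # woman -> her current fiance
    free_men = deque(men_prefs)
    # rank[w][m] = w's ranking of m (lower = preferred), for O(1) comparisons
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    while free_men:
        m = free_men[0]
        w = men_prefs[m][next_choice[m]]      # highest-ranked not yet proposed to
        next_choice[m] += 1
        if w not in engaged_to:               # w is free: engage
            engaged_to[w] = m
            free_men.popleft()
        elif rank[w][m] < rank[w][engaged_to[w]]:
            free_men.popleft()                # w trades up; old fiance is free
            free_men.append(engaged_to[w])
            engaged_to[w] = m
        # otherwise m remains free and proposes again next iteration
    return {m: w for w, m in engaged_to.items()}
```

With the preference matrices above this produces the man-optimal matching, as the observations below describe.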

Proof of Perfection

  1. Suppose the set SS of pairs returned by Gale-Shapley algorithm is not perfect
  2. SS is a matching, therefore there must be at least one free man mm
  3. mm has proposed to all women (since the algorithm terminated)
  4. Therefore, each woman must be engaged (since she remains engaged after the first proposal to her)
  5. Therefore, all men must be engaged, contradicting ⚔️ the assumption that mm is free
  6. That matching is perfect since the program terminated QED.

Proof of Stability

  1. Suppose $S$ is not stable, i.e. there are two pairs $(m_1, w_1), (m_2, w_2) \in S$ s.t. $m_1$ prefers $w_2 > w_1$ and $w_2$ prefers $m_1 > m_2$
  2. Since $m_1$ prefers $w_2$ to $w_1$, he must have proposed to $w_2$ before $w_1$; at some stage $w_2$ must have rejected (or dumped) $m_1$, otherwise he would have been paired with $w_2$, which would in turn prevent the pairing of $(m_2, w_2)$ in a later iteration
  3. When $w_2$ rejected $m_1$, she must have been paired with some man $m_3$ whom she prefers to $m_1$
  4. Since $m_2$ is paired with $w_2$ at termination, and a woman's partner only improves, $w_2$ must prefer $m_2$ to $m_3$ or $m_2 = m_3$
  5. This contradicts ⚔️ our conclusion (from instability) that $w_2$ prefers $m_1 > m_2$ QED.

Observations

  • This algorithm computes a matching, i.e. each woman is paired with at most one man and vice versa
  • The man's status can alternate between being free and being engaged
  • The woman's status remains engaged after the first proposal
  • The ranking of a man's partner remains the same or goes down
  • The ranking of a woman's partner can never go down
  • The number of proposals made after $k$ iterations is the best indicator of progress
  • The max number of total proposals (or iterations) that can be made is $n^2$
  • Always produces the same matching in which each man is paired with his best, valid partner, but each woman is paired with her worst, valid partner

02 - Analysis of Algorithms

Polynomial Time

  • Contrast with essentially brute forcing:
  • e.g. brute force sorting:
    • Given $n$ numbers, permute them so that they appear in increasing order
    • Try all $n!$ permutations
    • For each permutation, check if it is sorted
    • $O(n \cdot n!)$
  • Desirable scaling property: when the input size doubles, the algorithm should only slow down by some constant $c$
  • An algorithm has polynomial running time if there exist constants $c, d > 0$ such that $\forall n$, the running time is bounded by $cn^d$
  • An algorithm is said to be efficient if it has a polynomial running time

Upper and Lower Bounds

  • Asymptotic Upper Bound: a function $f(n)$ is $O(g(n))$ if there exist constants $c > 0, n_0 \geq 0$ s.t. $\forall n \geq n_0, \ f(n) \leq cg(n)$
    • e.g. $100n \log_2 n$ is $O(n^2)$ for $c = 100, n_0 = 1$
  • Asymptotic Lower Bound: a function $f(n)$ is $\Omega(g(n))$ if there exist constants $c > 0, n_0 \geq 0$ s.t. $\forall n \geq n_0, \ f(n) \geq cg(n)$
    • e.g. $\frac{n}{10}\log_2 n$ is $\Omega(n)$ for $c = 1, n_0 = 1024$

In general, we attempt to find a sufficiently large n0n_0 such that our upper and lower bounds are satisfied.

  • Asymptotic Tight Bound: a function $f(n)$ is $\Theta(g(n))$ if $f(n)$ is $O(g(n))$ and $f(n)$ is $\Omega(g(n))$

Transitively:

  • if $f = O(g)$ and $g = O(h)$, then $f = O(h)$
  • if $f = \Omega(g)$ and $g = \Omega(h)$, then $f = \Omega(h)$
  • if $f = \Theta(g)$ and $g = \Theta(h)$, then $f = \Theta(h)$

Additively:

  • if $f = O(h)$ and $g = O(h)$, then $f + g = O(h)$
  • Similar statements hold for lower and tight bounds
  • if $k$ is a constant and there are $k$ functions $f_i = O(h), \ 1 \leq i \leq k$, then

$$f_1 + f_2 + \cdots + f_k = O(h)$$

  • if $f = O(g)$, then $f + g = \Theta(g)$

Examples

$p, q, r, n > 0$

| $f(n)$ | $g(n)$ | Reason |
| --- | --- | --- |
| $pn^2 + qn + r$ | $\Theta(n^2)$ | $pn^2 + qn + r \leq (p + q + r)n^2$, and we can ignore low-order terms |
| $pn^2 + qn + r$ | $O(n^3)$ | $n^2 \leq n^3$ if $n \geq 1$ |
| $\displaystyle\sum_{0 \leq i \leq d} a_i n^i$ | $\Theta(n^d)$ | if $d > 0$ is an integer constant and $a_d > 0$ |
| $n^{1.59}$ | polynomial time | $n^{1.59}$ is $O(n^2)$ |
| $\log_a n$ | $O(\log_b n)$ | $\log_a x = \frac{\log_b x}{\log_b a} \ \forall a, b > 1$, therefore the base is irrelevant |

  • $\forall$ constant $x > 0$, $\log n = O(n^x)$, e.g. $\log n = O(n^{0.00001})$
  • $\forall$ constants $r > 1, d > 0$, $n^d = O(r^n)$, e.g. $n^3 = O(1.1^n)$

03 - Review of Graphs and Priority Queues

Priority Queues: Motivation

Sorting

  • Instance: Nonempty list $[x_1, x_2, \ldots, x_n]$ of integers
  • Solution: A permutation $[y_1, y_2, \ldots, y_n]$ of $x_1, x_2, \ldots, x_n$ such that $y_i \leq y_{i+1}$ for all $1 \leq i < n$

Possible algorithm:

  • insert each number into some data structure $D$
  • Repeatedly find the smallest number in $D$, output it, and remove it
  • To get $O(n \log n)$ running time, each "find minimum" step and each removal step must take $O(\log n)$

Priority Queues

  • Store a set $S$ of elements where each element $v$ has a priority value $key(v)$
  • smaller key values = higher priorities
  • Supported operations:
    • find element with smallest key
    • remove smallest element
    • insert an element
    • delete an element
    • update the key of an element
  • element deletion and key update require knowledge of the position of the element in the priority queue
  • combines benefits of both lists and sorted arrays
  • implemented as a heap: a balanced binary tree
  • heap order: for every element $v$ at a node $i$, the element $w$ at $i$'s parent satisfies $key(w) \leq key(v)$
  • Each node's key is at least as large as its parent's
  • storing nodes of the heap in an array:
    • Node at index $i$ has children at indices $2i$ and $2i+1$, with a parent at index $\lfloor \frac{i}{2} \rfloor$
    • index 1 is the root
    • if $2i > n$, where $n$ is the current number of elements in the heap, the node at index $i$ is a leaf

Inserting an Element: Heapify-up

  • insert a new element at index $n+1$
  • Fix the heap order using Heapify-up(H, n+1)

$$\boxed{ \begin{aligned} &\text{Heapify-up(H, i)} \\ &\text{If } i > 1 \text{ then} \\ &\quad\text{let } j = \text{parent}(i) = \lfloor \frac{i}{2} \rfloor \\ &\quad\text{If key[H[i]] < key[H[j]] then} \\ &\qquad\text{swap entries H[i], H[j]} \\ &\qquad\text{Heapify-up(H, j)} \end{aligned} }$$

Proof of Running Time Complexity for Heapify-up(i)

  • each invocation decreases the second argument by a factor of at least 2
  • after $k$ invocations, the argument is at most $\frac{i}{2^k}$
  • Therefore $\frac{i}{2^k} \geq 1$, which implies that $k \leq \log_2 i$; therefore Heapify-up is $O(\log i)$

Deleting an Element: Heapify-down

  • Suppose $H$ has $n+1$ elements
  • Delete element $H[i]$ by moving the element at $H[n+1]$ to $H[i]$
  • If the element at $H[i]$ is too small, fix the heap order using Heapify-up(H, i)
  • If the element at $H[i]$ is too large, fix the heap order using Heapify-down(H, i)

$$\boxed{ \begin{aligned} &\text{Heapify-down(H, i)} \\ &\text{let } n = \text{length}(H) \\ &\text{If } 2i > n \text{ then} \\ &\quad\text{Terminate with } H \text{ unchanged} \\ &\text{Else If } 2i < n \text{ then} \\ &\quad\text{Let Left, Right} = 2i, 2i+1 \\ &\quad\text{Let } j \text{ be the index among Left, Right that minimizes key[H[j]]} \\ &\text{Else if } 2i = n \text{ then} \\ &\quad\text{Let } j = 2i \\ &\text{If key[H[j]] < key[H[i]] then} \\ &\quad\text{swap the entries H[i] and H[j]} \\ &\quad\text{Heapify-down(H, j)} \end{aligned} }$$

Proof of Running Time Complexity for Heapify-down(i)

  • each invocation increases its second argument by a factor of at least 2
  • after $k$ invocations, the argument is at least $i \cdot 2^k$; since the argument is at most $n$, $i \cdot 2^k \leq n$, which implies that $k \leq \log_2 \frac{n}{i}$
  • Therefore the running time is $O(\log_2 \frac{n}{i})$
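Both fix-up routines can be sketched together in a minimal 1-indexed array heap (a sketch for illustration, not a reference implementation; written iteratively, which does not change the $O(\log n)$ bounds):

```python
class MinHeap:
    """1-indexed binary min-heap stored in an array; H[0] is a dummy slot."""

    def __init__(self):
        self.H = [None]

    def __len__(self):
        return len(self.H) - 1

    def insert(self, key):
        self.H.append(key)              # place new element at index n+1
        self._heapify_up(len(self))

    def find_min(self):
        return self.H[1]                # index 1 is the root

    def delete_min(self):
        root = self.H[1]
        last = self.H.pop()             # move last element to the root
        if len(self) > 0:
            self.H[1] = last
            self._heapify_down(1)
        return root

    def _heapify_up(self, i):
        # while smaller than the parent at floor(i/2), swap upward
        while i > 1 and self.H[i] < self.H[i // 2]:
            self.H[i], self.H[i // 2] = self.H[i // 2], self.H[i]
            i //= 2

    def _heapify_down(self, i):
        n = len(self)
        while 2 * i <= n:               # children live at 2i and 2i+1
            j = 2 * i
            if j + 1 <= n and self.H[j + 1] < self.H[j]:
                j += 1                  # j = child with the smaller key
            if self.H[j] < self.H[i]:
                self.H[i], self.H[j] = self.H[j], self.H[i]
                i = j
            else:
                break
```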

Sorting

  • Instance: Nonempty list $[x_1, x_2, \ldots, x_n]$ of integers
  • Solution: A permutation $[y_1, y_2, \ldots, y_n]$ of $x_1, x_2, \ldots, x_n$ such that $y_i \leq y_{i+1}$ for all $1 \leq i < n$

Final algorithm:

  • Insert each number in a priority queue $H$
  • Repeatedly find the smallest number in $H$, output it, and delete it from $H$

Thus, each insertion and deletion takes $O(\log n)$, for a total running time of $O(n \log n)$
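The final algorithm is a few lines with Python's built-in `heapq` priority queue:

```python
import heapq

def pq_sort(xs):
    """Sort by inserting into a priority queue, then repeatedly
    extracting the minimum: n * O(log n) + n * O(log n) = O(n log n)."""
    h = []
    for x in xs:                  # n inserts, O(log n) each
        heapq.heappush(h, x)
    return [heapq.heappop(h) for _ in range(len(h))]   # n delete-mins
```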


Graphs: Motivation

  • Contact tracing hahaaaaa

Taxonomy of a Graph

  • comprised of vertices and (directed / undirected) edges; they can form faces.

  • Euler Path: Only possible if the number of nodes with odd degree is at most 2

  • Formally, an undirected graph $G = (V, E)$ where $V, E$ are sets of vertices and edges with $E \subseteq V \times V$

    • Elements of $E$ are unordered pairs
    • Edge $(u, v)$ is incident on $u, v$, meaning $u, v$ are neighbors of one another
    • At most one edge between any pair of nodes
    • $G$ contains no self-loops, i.e. no edges of the form $(u,u)$
  • Formally, a directed graph $G = (V, E)$ where $V, E$ are sets of vertices and edges with $E \subseteq V \times V$

    • Elements of $E$ are ordered pairs
    • $e = (u, v)$ where $u$ is the tail of the edge $e$, $v$ is the head, and we can say that $e$ is directed from $u$ to $v$
    • A pair of nodes may be connected by two directed edges $(u,v)$ and $(v,u)$
    • $G$ contains no self-loops
  • A $v_1 - v_k$ path in an undirected graph $G = (V,E)$ is a sequence $P$ of nodes $v_1, v_2, \ldots, v_{k-1}, v_k \in V$ s.t. every consecutive pair of nodes $v_i, v_{i+1}$, $1 \leq i < k$, is connected by an edge in $E$

  • a path is simple if all its nodes are distinct

  • a cycle is a path where $k > 2$, the first $k-1$ nodes are distinct, and $v_1 = v_k$

  • An undirected graph $G$ is connected if, for every pair of nodes $u, v \in V$, there is a path from $u$ to $v$ in $G$

  • The distance $d(u,v)$ between nodes $u, v$ is the minimum number of edges in any $u$-$v$ path

  • The connected component of $G$ containing $s$ is the set of all nodes $u$ such that there is an $s$-$u$ path in $G$

Computing Connected Components

  • Rather than computing the connected component of $G$ that contains $s$ and checking if $t$ is in that component, we can "explore" $G$ starting from $s$, maintaining a set $R$ of visited nodes

$$\boxed{ \begin{aligned} &R \text{ will consist of nodes to which } s \text{ has a path} \\ &\text{Initially } R = \{s\} \\ &\text{While there is an edge } (u,v) \text{ where } u \in R, v \notin R \\ &\quad\text{Add } v \text{ to } R \end{aligned} }$$
  • BFS: explore $G$ starting at $s$ and going "outward" in all directions, adding nodes one "layer" at a time
  • Layer $L_0$ only contains $s$
  • Layer $L_1$ contains all the neighbors of $s$, etc.
  • Layer $L_{j+1}$ contains all nodes that:
    • do not belong to an earlier layer
    • are connected by an edge to a node in layer $L_j$
    • The shortest path from $s$ to each such node contains $j+1$ edges
  • Claim: For each $j \geq 1$, layer $L_j$ consists of all nodes exactly at distance $j$ from $s$
  • A non-tree edge is an edge of $G$ that does not belong to the BFS tree $T$
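The layer-by-layer exploration can be sketched with a queue (a sketch; `adj` maps each node to its neighbor list):

```python
from collections import deque

def bfs_layers(adj, s):
    """Return (dist, parent): dist[u] = j iff u is in layer L_j, and the
    edges (parent[v], v) form the BFS tree T."""
    dist = {s: 0}
    parent = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:            # v not in any earlier layer
                dist[v] = dist[u] + 1    # v joins layer L_{dist[u]+1}
                parent[v] = u            # (u, v) is a tree edge
                q.append(v)
    return dist, parent
```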

Proof

  • There is a path from $s$ to $t$ if and only if $t$ is a member of some layer
  • Let $v$ be a node in layer $L_{j+1}$ and $u$ be the "first" node in $L_j$ such that $(u,v)$ is an edge in $G$. Consider the graph $T$ formed by all such edges, directed from $u$ to $v$
    • Notice that $T$ is a tree because it is connected and the number of edges in $T$ is the number of nodes in all the layers minus one
    • $T$ is called the BFS Tree

Inductive Proof of BFS Distance Property

  • Claim: for every $k \geq 0$, for every node $u \in L_k$, $d(s, u) = k$
  • Basis: $k = 0$, $d(s,s) = 0$
  • Inductive Hypothesis: Assume the claim is true for every node $v \in L_k$, i.e. $d(s, v) = k$
  • Inductive Step: Prove for every node $x \in L_{k+1}$, $d(s, x) = k + 1$
    • $d(s,x) = d(s, y) + 1$, where $y$ is a node in $L_k$ with an edge to $x$
    • Therefore $d(s,x) = k+1$
  • DFS: explore $G$ as if it were a maze: start from $s$, traverse the first edge out (to node $v$), traverse the first edge out of $v$, . . . , reach a dead-end, backtrack, repeat

$$\boxed{ \begin{aligned} &\text{DFS}(u) \\ &\text{Mark } u \text{ as explored and add it to the reachable set } R \\ &\text{For each edge } (u,v) \text{ incident to } u \\ &\quad\text{If } v \text{ is not marked, invoke DFS}(v) \end{aligned} }$$
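The boxed DFS translates directly into a recursive sketch (`adj` maps each node to its neighbor list):

```python
def dfs(adj, u, R=None):
    """Recursive DFS from u; returns the reachable set R."""
    if R is None:
        R = set()
    R.add(u)                      # mark u as explored
    for v in adj[u]:              # each edge (u, v) incident to u
        if v not in R:            # v not yet marked
            dfs(adj, v, R)
    return R
```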

Properties of each

  • BFS and DFS visit the same set of nodes, but in a different order

Representing Graphs

  • A graph $G = (V,E)$ has two input parameters: $|V| = n$, $|E| = m$, with $n-1 \leq m \leq \binom{n}{2}$ for a connected graph
    • Size of the graph is defined to be $m + n$
    • Strive for algorithms whose running time is linear in graph size, i.e. $O(m + n)$
  • We assume that $V = \{1, 2, \ldots, n\}$
  • Use an Adjacency Matrix: $n \times n$ boolean matrix, where the entry in row $i$, col $j$ is 1 iff the graph contains the edge $(i, j)$
    • the space used is $\Theta(n^2)$, which is optimal in the worst case
    • can check if there is an edge between nodes $i$, $j$ in $O(1)$ time
    • iterate over all the edges incident on node $i$ in $\Theta(n)$ time
  • Use an Adjacency List: array $Adj$, where $Adj[v]$ stores the list of all nodes adjacent to $v$
    • an edge $e = (u,v)$ appears twice: in $Adj[u]$ and $Adj[v]$
    • $n_v$ is the number of neighbors of node $v$
    • space used is $O(\sum_{v \in G} n_v) = O(m+n)$
    • check if there is an edge between nodes $u, v$ in $O(n_u)$ time
    • iterate over all the edges incident on node $u$ in $\Theta(n_u)$ time
    • inserting an edge takes $O(1)$ time
    • deleting an edge takes $O(n)$ time
| Operation / Space | Adj. Matrix | Adj. List |
| --- | --- | --- |
| Is $(i, j)$ an edge? | $O(1)$ time | $O(n_i)$ time |
| Iterate over edges incident on node $i$ | $O(n)$ time | $O(n_i)$ time |
| Space used | $O(n^2)$ | $O(m + n)$ |
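A small sketch that builds both representations of an undirected graph on nodes $1, \ldots, n$ (function name is mine, for illustration):

```python
def build_adjacency(n, edges):
    """Build the adjacency matrix and adjacency list of an undirected
    graph on nodes 1..n (row/column 0 unused, matching 1-indexing)."""
    matrix = [[0] * (n + 1) for _ in range(n + 1)]
    adj = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        matrix[u][v] = matrix[v][u] = 1   # symmetric: edges are unordered
        adj[u].append(v)                  # each edge appears twice,
        adj[v].append(u)                  # once in Adj[u] and once in Adj[v]
    return matrix, adj
```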

04 - Linear-Time Graph Algorithms

Bipartite Graphs

  • A graph $G = (V, E)$ is bipartite if $V$ can be partitioned into two subsets $X, Y$ such that every edge in $E$ has one endpoint in $X$ and one in $Y$

    • $(X \times X) \cap E = \emptyset$ and $(Y \times Y) \cap E = \emptyset$
    • Color the nodes in $X$ red and the nodes in $Y$ blue. No edge in $E$ connects nodes of the same color
  • no cycle with an odd number of nodes is bipartite; similarly, all cycles of even length are bipartite

  • Therefore, if a graph is bipartite, then it cannot contain a cycle of odd length

Algorithm for Testing Bipartite

  • Assume $G$ is connected; otherwise apply the algorithm to each connected component separately
  • Pick an arbitrary node $s$ and color it red
  • Color all its neighbors blue. Color the uncolored neighbors of these nodes red, and so on, until all nodes are colored
  • Check if every edge has endpoints of different colors

more formally:

  1. Run BFS on $G$, maintaining an array for the colors
  2. When we add a node $v$ to a layer $i$, set $\text{color}[v]$ to red if $i$ is even, blue if $i$ is odd
  3. At the end of BFS, scan all the edges to check if there is any edge both of whose endpoints received the same color

Running time is linear in the size of the graph, $O(m + n)$, since we do a constant amount of work per node in addition to the time spent by BFS.
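The three steps can be sketched as follows (a sketch; assumes $G$ is connected, so every node gets colored):

```python
from collections import deque

def is_bipartite(adj, s):
    """BFS 2-coloring from s. Returns (ok, color), where ok is False
    iff some edge has both endpoints the same color."""
    color = {s: 0}                       # 0 = red (even layer), 1 = blue (odd)
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in color:
                color[v] = 1 - color[u]  # v is one layer deeper than u
                q.append(v)
    # scan all the edges at the end, as in step 3 above
    ok = all(color[u] != color[v] for u in adj for v in adj[u])
    return ok, color
```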

Proof of Correctness of a "two-colorable" algorithm

Need to prove that if the algorithm says $G$ is bipartite, then it actually is bipartite, AND that if $G$ is not bipartite, we can determine why.

  • Let $G$ be a connected graph and let $L_0, L_1, \ldots, L_k$ be the layers produced by BFS starting at node $s$. Then exactly one of the following statements is true:
    • No edge of $G$ joins two nodes in the same layer: $G$ is bipartite, since nodes in the even layers can be colored red and those in odd layers blue
    • There is an edge of $G$ that joins two nodes in the same layer: $G$ is not bipartite
    • (Recall that every edge of $G$ joins nodes in the same layer or in adjacent layers, i.e. its endpoints lie in layers $L_i, L_j$ with $|i - j| \leq 1$)

05 - Greedy Algorithms

  • Greedy algorithms: make the best current choice without looking back. Assume the choices made prior were perfect

Example Problem: Interval Scheduling

  • At an amusement park, want to compute the largest number of rides you can be on in one day

  • Input: start and end time of each ride

  • Constraint: cannot be in two places at one time

  • Instance: nonempty set $\{(s(i), f(i)), 1 \leq i \leq n\}$ of start and finish times of $n$ jobs

  • Solution: The largest subset of mutually compatible jobs

  • Two jobs are compatible if they do not overlap

  • Problem models the situation where you have a resource, a set of fixed jobs, and you want to schedule as many jobs as possible

  • For any input set of jobs, the algorithm must provably compute the largest set of compatible jobs (measured by interval count, not cumulative interval length)

Template for a Greedy Algorithm

  • Process jobs in some order. Add next job to the result if it is compatible with the jobs already in the result
  • key question: in what order should we process the jobs?
    • earliest start time: increasing order of start time $s(i)$
    • earliest finish time: increasing order of finish time $f(i)$
    • shortest interval: increasing order of $f(i) - s(i)$
    • fewest conflicts: increasing order of the number of conflicting jobs

Earliest Finish Time

  • processing in order of earliest finish time yields an optimal solution

$$\boxed{ \begin{aligned} &\text{Initially let } R \text{ be the set of all jobs, and let } A \text{ be empty} \\ &\text{While } R \text{ is not yet empty} \\ &\quad\text{Choose a job } i \in R \text{ that has the smallest finishing time} \\ &\quad\text{Add request } i \text{ to } A \\ &\quad\text{Delete all jobs from } R \text{ that are not compatible with job } i \\ &\text{Return the set } A \text{ as the set of accepted/scheduled jobs} \end{aligned} }$$

Proof of Optimality

  • Claim: $A$ is a compatible set of jobs that is the largest possible among all sets of mutually compatible jobs
  • Proof by contradiction that there is no "better" solution at each step of the algorithm
    • need to define "better", and what a step is in terms of the progress of the algorithm
    • order the output in terms of increasing finish time for a measure of progress
  • The finishing time of the $r$-th job selected by $A$ must be less than or equal to the finishing time of the $r$-th job selected by any other algorithm
  • Let $O$ be an optimal set of jobs. We will show that $|A| = |O|$
  • Let $i_1, i_2, \ldots, i_k$ be the set of jobs in $A$ in order of finishing time
  • Let $j_1, j_2, \ldots, j_m$ be the set of jobs in $O$ in order of finishing time, with $m \geq k$
  • Claim: for all indices $r \leq k, \quad f(i_r) \leq f(j_r)$
  • Base case: is it possible for the finish time of the first job of our algorithm to be later than the opposing job's?
    • $f(i_1) > f(j_1)$ is not possible, only $f(i_1) \leq f(j_1)$, since the algorithm picks the job with the smallest finish time
  • Inductive Hypothesis: assume this is the case for a generic job index $r$:
    • $f(i_r) \leq f(j_r)$ for some $r < k$
  • Inductive Step: show that the same claim holds for the next job:
    • $f(i_{r+1}) \leq f(j_{r+1})$
    • $s(j_{r+1}) \geq f(j_r) \geq f(i_r)$, so $j_{r+1}$ was still a candidate when the algorithm chose $i_{r+1}$, and the algorithm chose the candidate with the smallest finish time
  • Claim $m = k$: otherwise $f(i_k) \leq f(j_k) \leq s(j_{k+1})$, so $j_{k+1}$ would still be compatible with $A$ and the algorithm would not have stopped at $i_k$

A complete proof can be found here:

Proof for Optimality of Earliest Finish Time Algorithm for Interval Scheduling

Implementing EFT

  • Sort jobs in order of increasing finish time
  • Store starting times of jobs in an array $S$
  • $k = 1$
  • While $k \leq |S|$
    • output job $k$
    • Let the finish time of job $k$ be $f$
    • Iterate over $S$ from $k$ onwards to find the first index $i$ s.t. $S[i] \geq f$
    • $k = i$
  • Must be careful to iterate over $S$ s.t. we never scan the same index more than once
  • Running time is $O(n \log n)$, since it is dominated by the initial sorting step
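The implementation above can be sketched compactly by tracking only the finish time of the last accepted job (a sketch; jobs sharing an endpoint are treated as compatible):

```python
def interval_schedule(jobs):
    """Earliest-finish-time greedy; jobs is a list of (start, finish).
    Returns a largest set of mutually compatible jobs."""
    A = []
    last_finish = float("-inf")
    for s, f in sorted(jobs, key=lambda job: job[1]):  # O(n log n) sort by f(i)
        if s >= last_finish:       # compatible with everything accepted so far
            A.append((s, f))
            last_finish = f        # jobs overlapping f are skipped ("deleted")
    return A
```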

Scheduling to Minimize Lateness

  • Suppose a job $i$ has a length $t(i)$ and a deadline $d(i)$
  • want to schedule all $n$ jobs on one resource
  • Goal is to assign a starting time $s(i)$ to each job such that each job is delayed as little as possible
  • A job $i$ is delayed if $f(i) > d(i)$; the lateness of job $i$ is $\max(0, f(i) - d(i))$
  • the lateness of a schedule is $\max \limits_{1 \leq i \leq n}(\max(0, f(i) - d(i)))$
    • the largest of each job's lateness values

Minimizing Lateness

  • Instance: Set $\{ (t(i), d(i)), 1 \leq i \leq n \}$ of lengths and deadlines of $n$ jobs
  • Solution: Set $\{ s(i), 1 \leq i \leq n \}$ such that $\max \limits_{1 \leq i \leq n}(\max(0, f(i) - d(i)))$ is as small as possible

Template for Greedy Algorithm

  • Key question: in what order should we schedule the jobs?

    • Shortest length: increasing order of length $t(i)$. Ignores deadlines completely; the shortest job may have a very late deadline:

      | $i$ | 1 | 2 |
      | --- | --- | --- |
      | $t(i)$ | 1 | 10 |
      | $d(i)$ | 100 | 10 |

    • Shortest slack time: increasing order of $d(i) - t(i)$. Bad for long jobs with late deadlines; the job with the smallest slack may take a long time:

      | $i$ | 1 | 2 |
      | --- | --- | --- |
      | $t(i)$ | 1 | 10 |
      | $d(i)$ | 2 | 10 |

    • Earliest deadline: increasing order of deadline $d(i)$. Correct? Does it make sense to tackle jobs with the earliest deadlines first?

Proof of Earliest Deadline Optimality

$$\boxed{ \begin{aligned} &\text{Order the jobs in order of increasing deadlines } d(i) \\ &\text{Assume for simplicity of notation that } d(1) \leq \mathellipsis \leq d(n) \\ &\text{Initially, } f = 0 \\ &\text{Consider the jobs } i = 1, \mathellipsis, n \text{ in this order} \\ &\quad\text{Assign the job } i \text{ to the time interval from } s(i) = f \text{ to } f(i) = f + t_i \\ &\quad f \leftarrow f + t_i \\ &\text{Return the set of scheduled intervals } [s(i), f(i)], \text{ for } i = 1, \mathellipsis, n \end{aligned} }$$
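The boxed earliest-deadline-first schedule in Python (a sketch; jobs are `(length, deadline)` pairs):

```python
def schedule_edd(jobs):
    """Earliest-deadline-first scheduling with no idle time.
    Returns the intervals [(s(i), f(i)), ...] and the maximum lateness."""
    f = 0
    intervals = []
    max_lateness = 0                      # lateness is max(0, f(i) - d(i))
    for t, d in sorted(jobs, key=lambda job: job[1]):  # by deadline d(i)
        intervals.append((f, f + t))      # s(i) = f, f(i) = f + t(i)
        max_lateness = max(max_lateness, f + t - d)
        f += t                            # next job starts immediately
    return intervals, max_lateness
```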

Inversions

  • A schedule has an inversion if a job $i$ with a deadline $d(i)$ is scheduled before another job $j$ with an earlier deadline $d(j)$, i.e. $d(j) < d(i)$ and $s(i) < s(j)$
    • if two jobs have the same deadline, they cannot cause an inversion
  • in $n$ jobs, the maximum number of inversions is $\binom{n}{2}$
  • Claim: if a schedule has an inversion, then there must be a pair of jobs $i, j$ such that $j$ is scheduled immediately after $i$ and $d(j) < d(i)$

Proof of Local Inversion

  • If we have an inversion between $l, m$ s.t. $s(l) < s(m)$ and $d(l) > d(m)$, then we can find some inverted pair of adjacent jobs $i, j$ scheduled between $l$ and $m$
  • This is because: in a list where every adjacent pair of elements is in order (by deadline), the whole list is sorted.
  • The contrapositive of this is: if a list is unsorted, then there are two adjacent elements that are out of order.

Properties of Schedules

  • Claim 1: The algorithm produces a schedule with no inversion and no idle time (i.e. jobs are tightly packed)
  • Claim 2: All schedules (produced from the same input) with no inversions have the same lateness
    • Case 1: All jobs have distinct deadlines. There is a unique schedule with no inversions and no idle time.
    • Case 2: Some jobs have the same deadline. Ordering of the jobs does not change the maximum lateness of these jobs.
  • Claim 3: There is an optimal schedule with no idle time.
  • Claim 4: There is an optimal schedule with no inversions and no idle time.
    • Start with an optimal schedule $O$ (that may have inversions) and use an exchange argument to convert $O$ into a schedule that satisfies Claim 4 and has lateness no larger than $O$'s.
    • If $O$ has an inversion, let $i, j$ be consecutive inverted jobs in $O$. After swapping $i, j$ we get a schedule $O'$ with one less inversion.
    • Claim: The lateness of $O'$ is no larger than the lateness of $O$
    • It is sufficient to prove the last item, since after at most $\binom{n}{2}$ swaps, we obtain a schedule with no inversions whose lateness is no larger than that of $O$
    • In $O$, assume each job $r$ is scheduled for the interval $[s(r), f(r)]$ and has lateness $l(r)$. For $O'$, let the lateness of $r$ be $l'(r)$
      • Claim: $l'(k) = l(k), \quad \forall k \neq i, j$
      • Claim: $l'(j) \leq l(j)$, since $j$ is scheduled earlier in $O'$
      • Claim: $l'(i) \leq l(j)$ because $l'(i) = f(j) - d_i \leq f(j) - d_j = l(j)$; hence the maximum lateness does not increase
    • N.B. common mistakes with exchange arguments:
      • Wrong: start with the algorithm's schedule $A$ and argue it cannot be improved by swapping two jobs
      • Correct: start with an arbitrary schedule $O$ (which can be optimal) and argue that $O$ can be converted into a schedule that is essentially the same as $A$ without increasing lateness
      • Wrong: swap two jobs that are not neighboring in $O$. The pitfall is that the completion times of all intervening jobs then change
      • Correct: show that an inversion exists between two neighboring jobs and swap them
  • Claim 5: The greedy algorithm produces an optimal schedule, follows from 1, 2, 4

Summary

  • Greedy algorithms make local decisions
  • Three strategies:
    • Greedy algorithm stays ahead - Show that after each step in the greedy algorithm, its solution is at least as good as that produced by any other algorithm
    • Structural bound - First, discover a property that must be satisfied by every possible solution. Then show that the (greedy) algorithm produces a solution with this property
    • Exchange argument - Transform the optimal solution in steps into the solution by the greedy algorithm without worsening the quality of the optimal solution

06 - Greedy Graph Algorithms

The Shortest Path Problem

  • G(V,E)G(V,E) is a connected, directed graph. Each edge has a length l(e)0l(e) \geq 0
  • length of a path PP is the sum of the lengths of the edges in PP
  • Goal: compute the shortest path from a specified start node ss to every other node in the graph
  • Instance: A directed graph G(V,E)G(V,E), a function l:ER+l: E \rightarrow \reals^+ and a node sVs \in V
  • Solution: A set {Pu,uV}\{P_u, u \in V\} of paths, where PuP_u is the shortest path in GG from ss to uu

Generalizing BFS

  • If all edges have the same weight, or distance, BFS works, processing nodes in order of distance.
  • What if the graph has integer edge weights, can we make the graph unweighted?
    • yes, placing dummy edges and nodes to pad out lengths > 1 at the expense of memory and running time
    • Edge weight of ww gets w1w - 1 nodes
    • Size of the graph (and therefore runtime) becomes m+n+eEl(e)m + n + \sum_{e \in E} l(e). Pseudo-polynomial time depending on input values
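The padding idea can be sketched in a few lines of Python; the adjacency-dict representation and function name here are illustrative, not from the notes:

```python
from collections import deque

def bfs_on_expanded_graph(adj, weights, s):
    """Shortest paths from s when positive integer edge weights are
    simulated by chains of w - 1 dummy nodes, via plain BFS.

    adj: dict node -> list of neighbors (directed edges)
    weights: dict (u, v) -> positive integer length
    """
    # Build the expanded, unweighted graph: an edge of weight w
    # becomes a path through w - 1 fresh dummy nodes.
    expanded = {}
    fresh = 0
    for u in adj:
        expanded.setdefault(u, [])
        for v in adj[u]:
            w = weights[(u, v)]
            prev = u
            for _ in range(w - 1):
                dummy = ("dummy", fresh)
                fresh += 1
                expanded.setdefault(prev, []).append(dummy)
                expanded.setdefault(dummy, [])
                prev = dummy
            expanded.setdefault(prev, []).append(v)
            expanded.setdefault(v, [])
    # Standard BFS over the expanded graph.
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in expanded.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    # Report distances only for the original nodes.
    return {v: d for v, d in dist.items() if v in adj}
```

The size of `expanded` is exactly the m + n + Σ l(e) bound above, which is why this is only pseudo-polynomial.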

Dijkstra's Algorithm

  • Dijkstra, famous for pointing out the harm of the goto command, also developed this shortest path algorithm
  • Like BFS: explore nodes in non-decreasing order of distance from ss. Once a node is explored, its distance is fixed
  • Unlike BFS: Layers are not uniform; edges are explored by evaluating candidates by edge weight
  • For each unexplored node, determine "best" preceding explored node. Record shortest path length only through explored nodes
  • Like BFS: Record previous node in path, and build a tree.

Formally:

  • Maintain a set SS of explored nodes
    • For each node uSu \in S, compute d(u)d(u), which (we will prove, invariant) is the length of the shortest path from ss to uu
    • For each node xSx \notin S, maintain a value d(x)d'(x), which is the length of the shortest path from ss to xx using only the nodes in SS and xx itself
  • Greedily add a node vv to SS that has the smallest value of d(v)d'(v)
S={s} and d(s)=0while SVfor every node xVSSet d(x)=min(u,x):uS(d(u)+l(u,x))Set v=argminxVS(d(x))Add v to S and set d(v)=d(v)\boxed{ \begin{aligned} &S = \{s\} \text{ and } d(s) = 0 \\ &\text{while } S \neq V \\ &\quad\text{for every node } x \in V - S \\ &\qquad\text{Set } d'(x) = \min_{(u, x): u \in S} (d(u) + l(u, x)) \\ &\quad\text{Set } v = \arg \min_{x \in V - S }(d'(x)) \\ &\quad\text{Add } v \text{ to } S \text{ and set } d(v) = d'(v)\\ \end{aligned} }

What does v=argminxVS(d(x))v = \arg \min_{x \in V - S }(d'(x)) mean?

  • Run over all unexplored nodes xVSx \in V - S
  • examine all dd' values for each node
  • Return the argument (i.e. the node) that has the smallest value of d(x)d'(x)

To compute the shortest paths: when adding a node vv to SS, store the predecessor uu that minimizes d(v)d'(v)
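The boxed pseudocode above translates fairly directly into Python. This is a sketch assuming the graph is given as a node set plus a dict of edge lengths (names are illustrative):

```python
def dijkstra_naive(nodes, length, s):
    """Dijkstra's algorithm as in the boxed pseudocode: repeatedly
    recompute d'(x) over every unexplored x and add the minimizer.

    nodes: iterable of nodes; length: dict (u, v) -> l(u, v) >= 0.
    Returns (d, pred): shortest-path lengths and predecessors.
    """
    nodes = set(nodes)
    d = {s: 0}            # d(u) for explored nodes u in S
    pred = {s: None}
    S = {s}
    while S != nodes:
        best_x, best_d, best_u = None, float("inf"), None
        for x in nodes - S:
            # d'(x) = min over explored u with an edge (u, x)
            for u in S:
                if (u, x) in length and d[u] + length[(u, x)] < best_d:
                    best_x, best_d, best_u = x, d[u] + length[(u, x)], u
        if best_x is None:        # remaining nodes unreachable from s
            break
        S.add(best_x)             # add v to S and set d(v) = d'(v)
        d[best_x] = best_d
        pred[best_x] = best_u     # predecessor that minimized d'(v)
    return d, pred
```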

Proof of Correctness

  • Let PuP_u be the path computed by the algorithm for an arbitrary node uu

  • Claim: PuP_u is the shortest path from ss to uu

  • Prove by induction on the size of SS

    • Base case: S=1|S| = 1. The only node in SS is ss
    • Inductive Hypothesis: S=k|S| = k for some k1k \geq 1. The algorithm has computed PuP_u for every node uSu \in S. Strong induction.
    • Inductive step: S=k+1|S| = k+1 because we add the node vv to SS. Could there be a shorter path PP' from ss to vv? We must prove this is not the case.
      • poll: PP' must contain an edge from x to y where x is explored (in S) and y is unexplored (in V - S)
      • poll: The node v in P' must be explored (in S)
  • Locate key nodes on PP'

    • Break PP' into sub-paths from ss to xx, xx to yy, and yy to vv; use ll to denote the lengths of the sub-paths in PP'
      • d(x)l(s,x)d(x) \leq l(s, x)
      • d(u)+l(u,v)d(x)+l(x,y)d(u) + l(u,v) \leq d(x) + l(x, y)
      • 0l(y,v)0 \leq l(y,v)
      • d(u)+l(u,v)=d(v)l(P)=l(s,x)+l(x,y)+l(y,v)d(u) + l(u,v) = d(v) \leq l(P') = l(s,x) + l(x,y) + l(y,v)

Observations about Dijkstra's Algorithm

  • As described, it cannot handle negative edge lengths
  • Union of shortest paths output by Dijkstra's forms a tree: why?
  • Union of the shortest paths from a fixed source ss forms a tree; such paths are not necessarily the ones computed by Dijkstra's

Running time of Dijkstra's

S={s} and d(s)=0while SVfor every node xVSSet d(x)=min(u,x):uS(d(u)+l(u,x))Set v=argminxVS(d(x))Add v to S and set d(v)=d(v)\boxed{ \begin{aligned} &S = \{s\} \text{ and } d(s) = 0 \\ &\text{while } S \neq V \\ &\quad\text{for every node } x \in V - S \\ &\qquad\text{Set } d'(x) = \min_{(u, x): u \in S} (d(u) + l(u, x)) \\ &\quad\text{Set } v = \arg \min_{x \in V - S }(d'(x)) \\ &\quad\text{Add } v \text{ to } S \text{ and set } d(v) = d'(v)\\ \end{aligned} }
  • VV has nn nodes and mm edges, so there are n1n-1 iterations of the while loop
  • In each iteration, for each node xVSx \in V - S, compute d(x)=min(u,x):uS(d(u)+l(u,x))d'(x) = \min_{(u, x): u \in S} (d(u) + l(u, x)), which is proportional to the number of edges incident on xx
  • Running time per iteration is O(m)O(m), since the algorithm processes each edge (u,x)(u,x) in the graph exactly once (when computing d(x)d'(x))
  • Therefore, the overall running time is O(mn)O(mn)

Optimizing our Algorithm

  • Observation, if we add vv to SS, d(x)d'(x) changes only if (v,x)(v, x) is an edge in GG
  • Idea: For each node xVSx \in V - S, store the current value of d(x)d'(x). Upon adding a node vv to SS, update d()d'() only for neighbors of vv
  • How do we efficiently compute v=argminxVS(d(x))v = \arg \min_{x \in V - S }(d'(x))
    • Priority Queue!
\boxed{ \begin{aligned} &Insert(Q, s, 0) \\ &S = \{s\} \text{ and } d(s) = 0 \\ &\text{while } S \neq V \\ &\quad(v, d'(v)) = ExtractMin(Q) \\ &\quad\text{Add } v \text{ to } S \text{ and set } d(v) = d'(v)\\ &\quad\text{for every node } x \in V - S \text{ s.t. } (v,x)\text{ is an edge in } G \\ &\qquad\text{If } d(v) + l(v,x) < d'(x) \\ &\qquad\quad \text{Set } d'(x) = d(v) + l(v,x) \\ &\qquad\quad ChangeKey(Q,x,d'(x)) \\ \end{aligned} }
  • For each node xVSx \in V - S, store the pair (x,d(x))(x, d'(x)) in a priority queue QQ with d(x)d'(x) as the key
  • Determine the next node vv to add to SS using ExtractMinExtractMin
  • After adding vv to SS, for each node xVSx \in V - S such that there is an edge from vv to xx, check if d(x)d'(x) should be updated, i.e., if there is a shorter path from ss to xx via vv
  • in line 8, if xx is not in QQ, simply insert it

New Runtime

  • ExtractMinExtractMin happens n1n - 1 times
  • For every node vv, the running time of step 5 is O(degv)O(deg_v), where degvdeg_v is the number of outgoing neighbors of vv: vVO(degv)=O(m)\sum_{v \in V} O(deg_v) = O(m)
  • ChangeKeyChangeKey is invoked at most once for each edge, mm times, and is an O(logn)O(\log n) operation
  • So, the total runtime is O(mlogn)O(m \log n)
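A sketch of the optimized algorithm using Python's heapq. Since heapq has no ChangeKey/decrease-key operation, this version uses the standard lazy-deletion workaround (push a fresh entry and skip stale ones at extraction), which preserves the O(m log n) bound up to constants:

```python
import heapq

def dijkstra(adj, s):
    """Heap-based Dijkstra matching the optimized pseudocode.

    adj: dict u -> list of (v, l(u, v)) pairs, lengths >= 0.
    Returns (d, pred): shortest-path lengths and predecessors.
    """
    d = {}                          # final d(v) for explored nodes
    pred = {s: None}
    best = {s: 0}                   # current d'(x) values
    pq = [(0, s)]                   # Insert(Q, s, 0)
    while pq:
        dv, v = heapq.heappop(pq)   # ExtractMin
        if v in d:                  # stale entry: v already explored
            continue
        d[v] = dv                   # add v to S and set d(v) = d'(v)
        for x, l in adj.get(v, []):
            if x not in d and dv + l < best.get(x, float("inf")):
                best[x] = dv + l    # Set d'(x) = d(v) + l(v, x)
                pred[x] = v
                heapq.heappush(pq, (best[x], x))   # stands in for ChangeKey
    return d, pred
```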

Network Design

  • want to connect a set of nodes using a set of edges with certain properties
  • Input is usually a graph, and the desired network (output) should use a subset of edges in the graph
  • Example: connect all nodes using a cycle of shortest total length. This problem is the NP-complete traveling salesman problem

Minimum Spanning Tree (MST)

  • Given an undirected graph G(V,E)G(V,E) with a cost c(e)>0c(e) > 0 associated with each edge eEe \in E.
  • Find a subset TT of edges such that the graph (V,T)(V,T) is connected and the cost eTc(e)\sum_{e \in T} c(e) is as small as possible
  • Instance: An undirected graph G(V,E)G(V,E) and a function c:ER+c: E \rightarrow \R^+
  • Solution: A set TET \sube E of edges such that (V,T)(V, T) is connected and the cost eTc(e)\sum_{e \in T} c(e) is as small as possible
  • Claim: if TT is a minimum-cost solution to this problem, then (V,T)(V,T) is a tree
  • A subset TT of EE is a spanning tree of GG if (V,T)(V, T) is a tree

Characterizing MSTs

  • Does the edge of smallest cost belong to an MST? Yes
    • Wrong proof: Because Kruskal's algorithm adds it.
    • Right proof: if e=(u,v)e = (u,v) is the cheapest edge overall, it is the cheapest edge in cut({u})(\{u\}), so by the cut property (proved below) it belongs to every MST
  • Which edges must belong to an MST?
    • What happens when we delete an edge from an MST?
    • MST breaks up into sub-trees
    • Which edge should we add to join them?
  • Which edges cannot belong to an MST?
    • What happens when we add an edge to an MST?
    • We obtain a cycle
    • Which edge in the cycle can we be sure does not belong to an MST?

Greedy Algorithm for the MST Problem

  • Template: process edges in some order. Add an edge to TT if tree property is not violated.
    • increasing cost order: Process edges in increasing order of cost. Discard an edge if it creates a cycle – Kruskal's
    • Dijkstra-like: Start from a node ss and grow TT outward from ss: add the node that can attached most cheaply to current tree – Prim's
    • Decreasing cost order: Delete edges in order of decreasing cost as long as graph remains connected – Reverse-Delete
    • each of these works
    • Simplifying assumption: all edge costs are distinct

Graph Cuts

  • a cut in a graph G(V,E)G(V,E) is a set of edges whose removal disconnects the graph (into two or more connected components)
  • Every set SVS \sub V (SS cannot be empty or the entire set VV) has a corresponding cut: cut(SS) is the set of edges (v,w) such that vS,wVSv \in S, w \in V - S
  • cut(SS) is a "cut" because deleting the edges in cut(SS) disconnects SS from VSV - S
  • Claim: for every SV,SS \sub V, S \neq \empty, every MST contains the cheapest edge in cut(SS)
    • proved by contradiction using an exchange argument

Proof of Cut Property of MSTs

  • Negation of the desired property: There is a set SVS \sub V and an MST TT such that TT does not contain the cheapest edge in cut(SS)
  • Proof strategy: If TT does not contain ee, show that there is a tree with a smaller cost than TT that contains ee.
  • Wrong proof:
    • Since TT is spanning, it must contain some edge, say ff, in cut(SS)
    • T{f}{e}T - \{f\} \cup \{e\} has smaller cost than TT but may not be a spanning tree
  • Correct proof:
    • Add ee to TT forming a cycle
    • This cycle must contain an edge ee' in cut(SS)
    • T{e}{e}T - \{e'\} \cup \{e\} has smaller cost than TT and is a spanning tree

Prim's Algorithm

  • Maintain a tree (S, T), i.e., a set of nodes and a set of edges, which we will show will always be a tree
  • Start with an arbitrary node sSs \in S
  • Step 3 is the cheapest edge in cut(S) for the currently explored set S
  • In other words, each step in Prim's algorithm computes and adds the cheapest edge in the current value of cut(SS)
    • argmin(u,v):uS,vVSc(u,v)argmin(u,v)cut(S)c(u,v)\arg \min_{(u,v):u \in S, v \in V - S} c(u,v) \equiv \arg \min_{(u,v) \in cut(S)} c(u,v)
S={s} and T=while SVCompute (u,v)=argmin(u,v):uS,vVSc(u,v)Add the node v to S and add the edge (u,v) to T\boxed{ \begin{aligned} &S = \{s\} \text{ and } T = \empty \\ &\text{while } S \neq V \\ &\quad\text{Compute } (u, v) = \arg \min_{(u,v):u \in S, v \in V - S} c(u,v) \\ &\quad\text{Add the node } v \text{ to } S \text{ and add the edge } (u,v) \text{ to } T \end{aligned} }

Optimality of Prim's

  • Claim: Prim's algorithm outputs an MST
    • Prove that every edge inserted satisfies the cut property (true by construction, in each iteration (u,v)(u, v) is necessarily the cheapest edge in cut(SS) for the current value of SS)
    • Prove that the graph constructed is a spanning tree
      • Why are there no cycles in (V,T)(V, T)
      • Why is (V,T)(V,T) a spanning tree (edges in TT connect all nodes in VV) - Because GG is connected, if there were an unconnected node vv at the end, then Prim's would not have terminated

Final version of Prim's Algorithm

\boxed{ \begin{aligned} &Insert(Q, s, 0, \empty) \\ &S = \empty \text{ and } T = \empty \\ &\text{while } S \neq V \\ &\quad(v, a(v), u) = ExtractMin(Q) \\ &\quad\text{Add node } v \text{ to } S \text{ and edge } (u,v) \text{ to } T \\ &\quad\text{for every node } x \in V - S \text{ s.t. } (v, x) \text{ is an edge in } G \\ &\qquad\text{if } c(v,x) < a(x) \text{ then}\\ &\qquad\quad a(x) = c(v,x) \\ &\qquad\quad ChangeKey(Q,x,a(x),v) \\ \end{aligned} }
  • QQ is a priority queue
  • Each element in QQ is a triple, the node, its attachment cost, and its predecessor in the MST
  • In step 8, if xx is not already in QQ, simply insert (x, a(x), v) into QQ
  • Total of n1n - 1 ExtractMinExtractMin and mm ChangeKeyChangeKey operations, yielding a running time of O(mlogn)O(m \log n)
  • the running time of step 5 is proportional to the degree of vv; summed over all iterations, this is proportional to the total number of edges mm
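A sketch of this version of Prim's in Python, again substituting lazy deletion for ChangeKey since heapq lacks decrease-key (the adjacency format is an assumption):

```python
import heapq

def prim(adj, s):
    """Prim's algorithm with a priority queue of attachment costs.

    adj: dict u -> list of (v, cost) pairs; undirected, so each edge
    appears in both endpoints' lists. Returns the list T of MST edges
    as (u, v, cost) triples.
    """
    explored = set()                 # the set S of explored nodes
    T = []
    pq = [(0, s, None)]              # (attachment cost a(v), node v, predecessor u)
    while pq:
        a, v, u = heapq.heappop(pq)  # ExtractMin
        if v in explored:            # stale entry
            continue
        explored.add(v)
        if u is not None:
            T.append((u, v, a))      # cheapest edge in cut(S)
        for x, c in adj.get(v, []):
            if x not in explored:    # stands in for ChangeKey(Q, x, c, v)
                heapq.heappush(pq, (c, x, v))
    return T
```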

Kruskal's Algorithm

  • Start with an empty set TT of edges
  • Process edges in EE in increasing order of cost
  • Add the next edge ee to TT only if adding ee does not create a cycle. Discard ee otherwise
  • Note: at any iteration, TT may contain several connected components and each node in VV is in some component
  • Claim: Kruskal's algorithm outputs an MST
    • For every edge ee added, demonstrate the existence of a set SVS \sub V (and VSV - S) such that ee and SS satisfy the cut property, i.e., ee is the cheapest edge in cut(S)cut(S)
      • If e=(u,v)e = (u,v), let SS be the set of nodes connected to uu in the current graph TT
      • Why is ee the cheapest edge in cut(SS) - because we process them in increasing order of cost
    • Prove that the algorithm computes a spanning tree
      • (V,T)(V,T) contains no cycles by construction
      • If (V,T)(V,T) is not connected, then there exists a subset SS of nodes not connected to VSV-S. What is the contradiction?

Implementing Kruskal's Algorithm

  • start with an empty set TT of edges
  • Process edges in EE in increasing order of cost
  • Add the next edge ee to TT only if adding ee does not create a cycle
  • Sorting edges takes O(mlogn)O(m \log n) time
  • Key question: "Does adding e=(u,v)e = (u,v) to TT create a cycle?"
    • Maintain set of connected components of TT
    • Find(u)Find(u): return the name of the connected component of TT that uu belongs to
    • Union(A,B)Union(A,B): merge connected components A, B

Analysing Kruskal's Algorithm

  • How many FindFind invocations does Kruskal's need? 2m2m
  • How many UnionUnion invocations? n1n-1
  • Two implementations of UnionFindUnion-Find:
    • Each FindFind takes O(1)O(1); kk invocations of UnionUnion take O(klogk)O(k \log k) time in total – O(m+nlogn)O(m + n \log n)
    • Each FindFind takes O(logn)O(\log n ); each invocation of UnionUnion takes O(1)O(1)O(mlogn+n)O(m \log n + n)
  • Since the graph is connected, mn1m \geq n - 1; in either case, the total running time is O(mlogn)O(m \log n) since we have to spend O(mlogn)O(m \log n) time sorting the edges by increasing cost, regardless of the underlying implementation of the UnionFindUnion-Find data structure
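A sketch of Kruskal's with a Union-Find using union by size plus path compression — a common implementation whose per-operation costs differ slightly from the two variants analyzed above, but which still gives an overall bound dominated by the O(m log n) sort:

```python
def kruskal(n, edges):
    """Kruskal's algorithm with union-find.

    n: nodes are 0..n-1; edges: list of (cost, u, v) triples.
    Returns the list of MST edges in the order they were added.
    """
    parent = list(range(n))
    size = [1] * n

    def find(u):                     # Find(u): component representative
        while parent[u] != u:
            parent[u] = parent[parent[u]]    # path compression
            u = parent[u]
        return u

    def union(a, b):                 # Union(A, B): merge roots by size
        if size[a] < size[b]:
            a, b = b, a
        parent[b] = a
        size[a] += size[b]

    T = []
    for cost, u, v in sorted(edges):         # increasing order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                         # adding (u, v) creates no cycle
            union(ru, rv)
            T.append((cost, u, v))
    return T
```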

Comments on Union-Find and MST

  • useful to maintain connected components of a graph as edges are added to the graph
  • Data structure does not support edge deletion efficiently
  • Current best algorithm for MST runs in O(mα(m,n))O(m\alpha (m,n)) time (Chazelle 2000), where α(m,n)\alpha(m,n) is the extremely slowly growing inverse Ackermann function; there is also a randomized algorithm that runs in expected O(m)O(m) time
    • Let logn=\log^* n = the number of times you take logn\log n before you reach 1
    • e.g. log(210)=4\log^*(2^{10}) = 4, log(2210)=5\log^*(2^{2^{10}}) = 5
  • Holy grail: O(m)O(m) deterministic algorithm for MST

Cycle Property

  • When can we be sure that an edge cannot be in any MST
  • Let CC be any cycle in GG and let e=(v,w)e = (v,w) be the most expensive edge in CC
  • Claim: ee does not belong to any MST of G
  • Proof: exchange argument.
    • If a supposed MST TT contains ee, show that there is a tree with smaller cost than TT that does not contain ee

Reverse-Delete Algorithm

¯\\_(ツ)_/¯

Observations about MST algorithms

  • Any algorithm that constructs a spanning tree by repeatedly applying the Cut and Cycle properties (include edges that satisfy the cut property, delete edges that satisfy the cycle property) produces an MST

07 - Applications of Minimum Spanning Trees

Minimum Bottleneck Spanning Tree (MBST)

  • MST minimizes the total cost of the spanning network
  • Consider another network design criterion
    • build a network connecting all cities in mountainous region, but ensure the highest elevation is as low as possible
    • total road length is not a criterion
  • Idea: compute an MST in which the edge with the highest cost is as cheap as possible
  • In an undirected graph G(V,E)G(V,E), let (V,T)(V,T) be a spanning tree. The bottleneck edge in TT is the edge with the largest cost in TT
  • Instance: An undirected graph G(V,E)G(V,E) and a function c:ER+c: E \rightarrow \R^+
  • Solution: A set TET \sube E of edges such that (V,T)(V,T) is a spanning tree and there is no spanning tree in GG with a cheaper bottleneck edge

Is every MST and MBST, or vice versa?

  • If a tree is an MBST, it might not be an MST
  • If a tree is an MST, it will be an MBST

Proof:

  • Let TT be the MST, and let TT' be a spanning tree with a cheaper bottleneck edge.
  • Let ee be the bottleneck edge in TT
  • Every edge in TT' has lower cost than ee
  • Adding ee to TT' creates a cycle, and ee is the costliest edge in that cycle
  • Since ee is the costliest edge in this cycle, by the cycle property, ee cannot belong to any MST, which contradicts the fact that TT is an MST

Motivation for Clustering

  • Given a set of objects and distances between them
  • objects can be anything
  • Distance function: increasing distance corresponds to increasing dissimilarity
  • Goal: group objects into clusters, where each cluster is a group of similar objects

Formalizing the Clustering Problem

  • Let UU be the set of nn objects labeled p1,p2,,pnp_1, p_2, \mathellipsis, p_n
  • For every pair pi,pjp_i, p_j, we have a distance d(pi,pj)d(p_i, p_j)
  • We require that d(pi,pi)=0d(p_i, p_i) = 0, d(pi,pj)>0d(p_i, p_j) > 0 for iji \neq j, and d(pi,pj)=d(pj,pi)d(p_i, p_j) = d(p_j, p_i)
  • Given a positive integer kk, a k-clustering of UU is a partition of UU into kk non-empty subsets or clusters C1,C2,...,CkC_1, C_2, ..., C_k
  • The spacing of a clustering is the smallest distance between objects in 2 different subsets:
    • spacing(C1,C2,...,Ck)=min1i,jk,ij,pCi,qCjd(p,q)spacing(C_1, C_2, ..., C_k) = \min_{1 \leq i,j \leq k, i\neq j, p \in Ci, q \in C_j} d(p,q)
    • spacing is the minimum of distance between objects in two different subsets
minimum = infinity
for every cluster C_i
  for every cluster C_j ≠ C_i
    for every point p in C_i
      for every point q in C_j
        if d(p,q) < minimum
          minimum = d(p,q)

return minimum
  • Clustering of Maximum Spacing
    • Instance: a Set UU of objects, a distance function d:U×UR+d : U \times U \rightarrow \Reals^+
    • Solution: A k-clustering of UU whose spacing is the largest over all possible k-clusterings
    • brute force: examine all O(n2)O(n^2) pairs of points, keeping the minimum distance over pairs that lie in different clusters

Algorithm for Max Spacing

  • intuition: greedily cluster objects in increasing order of distance
  • Let C\mathcal C be the set of nn clusters, with each object in UU in its own cluster
  • Process pairs of objects in increasing order of distance
    • Let (p,q)(p,q) be the next pair with pCpp \in C_p and qCqq \in C_q
    • If CpCqC_p \neq C_q add a new cluster CpCqC_p \cup C_q to C\mathcal{C}, delete Cp,CqC_p, C_q from C\mathcal{C}
  • Stop when there are kk clusters in C\mathcal{C}
  • Same as Kruskal's algorithm, but do not add the last k1k-1 edges in MST
  • Given a clustering C\mathcal{C}, what is spacing(C\mathcal{C})?
    • The length of the next edge that would be added - the cost of the (k1)(k-1)st most expensive edge in the MST. Let this cost be dd^*
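The algorithm above is single-linkage clustering: Kruskal-style merging, stopped once kk clusters remain. A Python sketch (the input format, an explicit list of pairwise distances, is an assumption; it assumes k ≥ 2 and that all pairs are listed):

```python
def max_spacing_clusters(n, dist_pairs, k):
    """Greedy max-spacing k-clustering via Kruskal-style merging.

    n: objects are 0..n-1; dist_pairs: list of (d(p, q), p, q) for all
    pairs; k: desired number of clusters (2 <= k <= n).
    Returns (clusters, spacing).
    """
    parent = list(range(n))

    def find(u):                      # union-find with path compression
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    pairs = sorted(dist_pairs)        # increasing order of distance
    num_clusters = n
    i = 0
    while num_clusters > k:           # stop when k clusters remain
        d, p, q = pairs[i]
        i += 1
        rp, rq = find(p), find(q)
        if rp != rq:                  # merge C_p and C_q
            parent[rp] = rq
            num_clusters -= 1
    # spacing d* = smallest distance between objects in different clusters
    spacing = min(d for d, p, q in pairs if find(p) != find(q))
    clusters = {}
    for x in range(n):
        clusters.setdefault(find(x), []).append(x)
    return list(clusters.values()), spacing
```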

Proof of Optimality

  • Let C\mathcal{C'} be any other clustering (with kk clusters).
  • We will prove that spacing(C\mathcal{C'}) d\leq d^*
    • There must be two objects pi,pjUp_i, p_j \in U in the same cluster CrCC_r \in \mathcal{C} but in different clusters in C\mathcal{C'}: spacing(C\mathcal{C'}) d(pi,pj)\leq d(p_i, p_j)
    • Suppose piCsp_i \in C'_s and pjCtp_j \in C'_t in C\mathcal{C'}
    • All edges in the path QQ connecting pipjp_i \rightarrow p_j in the MST have lengths d\leq d^*
    • In particular, there is an object pCsp \in C'_s and an object pCsp' \notin C'_s s.t. pp and pp' are adjacent in QQ
    • d(p,p)d    d(p, p') \leq d^* \implies spacing(C\mathcal{C'}) d(p,p)d\leq d(p, p') \leq d^*

Explanation

  • dd^* is the name we give to the variable denoting the spacing produced by our algorithm
  • The above proof shows that any other clustering has spacing at most dd^*

08 - Divide and Conquer Algorithms

Mergesort

  • Instance: nonempty list L=x1,x2,...,xnL = x_1, x_2, ..., x_n of integers
  • Solution: A permutation y1,y2,...,yny_1, y_2, ..., y_n of LL such that yiyi+1y_i \leq y_{i+1} for all 1in1 \leq i \leq n
  • Mergesort is a divide and conquer algorithm for sorting
    • Partition LL into two lists A,BA, B of sizes n/2,n/2\lfloor n/2 \rfloor, \lceil n/2 \rceil
    • Recursively sort A,BA, B,
    • Merge the sorted lists A,BA,B into a sorted list

Merging Two Sorted Lists

  • Maintain a current pointer for each list
  • initialize each pointer to the front of its list
  • While both lists are non-empty:
    • let ai,bia_i, b_i be the elements pointed to by each pointer
    • append the smaller of the two to the output list
    • advance the current pointer in the list that the smaller element belonged to
  • append the rest of the non-empty list to the output
  • Running time of this algorithm is O(k+l)O(k+l), where kk and ll are the lengths of the two lists
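The merging procedure as a Python function — a straightforward transcription of the steps above:

```python
def merge(a, b):
    """Merge two sorted lists in O(k + l) time, where k = len(a) and
    l = len(b), using one current pointer per list."""
    i, j = 0, 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:          # append the smaller of the two
            out.append(a[i])
            i += 1                # advance the pointer of that list
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])             # append the rest of the non-empty list
    out.extend(b[j:])
    return out
```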

Analysing Mergesort

  • Running time for LL = running time for A+A + running time for B+B + time to split the input into two lists ++ time to merge two sorted lists
    • what is this in the worst case?
    • \leq worst case running time for n/2+n/2+\lfloor n/2 \rfloor + \lceil n/2 \rceil + time to split input into two lists + time to merge two sorted lists
  • This sucks to read, need a shorthand. Let's assume nn is a power of 2
  • Let T(n)T(n) be the worst-case running time for nn elements, n1\forall n \geq 1
  • Worst-case running time:
    • T(n)2T(n/2)+cn,n>2T(n) \leq 2T(n/2) + cn, n > 2, cc is some constant time to split the list
    • T(2)cT(2) \leq c
    • Can assume T(n)=O(nlogn)T(n) = O(n \log n) for problem sets

Analysing Mergesort in the Worst Case

  • Partition LL into two lists AA and BB of size n/2,n/2\lfloor n/2 \rfloor, \lceil n/2 \rceil respectively
  • Recursively sort AA and BB
  • Merge the sorted lists A,BA, B into a single sorted list
  • Worst case running time for nn elements ≤
    • worst case running time for n/2\lfloor n/2 \rfloor +
    • worst case running time for n/2\lceil n/2 \rceil +
    • time to split the input into two lists +
    • time to merge two sorted lists
  • Assuming nn is a power of 2: T(n)2T(n/2)+cn,n>2T(n) \leq 2T(n/2) + cn, \quad n > 2
  • Three basic ways of solving this recurrence relation:
    • "Unrolling" the recurrence
    • Guess a solution and substitute into recurrence to check
    • Guess a solution in O()O( ) form and substitute into recurrence to determine the constants

Unrolling

  • Input to each sub problem on level ii has size n/2in/2^i
  • Recursion tree has logn\log n levels
  • Number of sub-problems on level ii is 2i2^i
  • Total work done at each level is cncn
  • Running time of the algorithm is cnlogncn \log n

Substituting a Solution into the Recurrence

  • Guess that the solution is T(n)cnlognT(n) \leq cn \log n (log base 2)

  • Use induction to check if the solution satisfies the recurrence relation

  • Base case: n=2n = 2. Is T(2)=c2clog2T(2) = c \leq 2c \log 2? yes.

  • (strong) Inductive Hypothesis: must include n/2n/2

    • Assume T(m)cmlog2mT(m) \leq cm \log_2 m, for all 2m<n2 \leq m < n, therefore
    • T(n2)cn2log(n2)T({n \over 2}) \leq {cn \over 2} \log ({n \over 2})
  • Inductive Step: Prove T(n)cnlognT(n) \leq cn \log n

    T(n)2T(n2)+cn2(cn2log(n2))=cnlog(n2)+cn=cnlogncn+cn=cnlogn\begin{aligned} T(n) &\leq 2T({n \over 2}) + cn \\ &\leq 2({cn \over 2} \log ({n \over 2})) \\ &= cn \log ({n \over 2}) + cn \\ &= cn \log n - cn + cn \\ &= cn \log n \\ \end{aligned}
  • Why is T(n)kn2T(n) \leq kn^2 a "loose" bound?

    • Substituting the guess into the recurrence gives

    \begin{aligned} T(n) &\leq 2T({n \over 2}) + cn \\ &\leq 2k({n \over 2})^2 + cn \\ &= {kn^2 \over 2} + cn \\ &\leq kn^2 \text{ for } k \geq c \\ \end{aligned}

    • The induction goes through, but only half of the kn2kn^2 budget is needed, a sign that the quadratic guess overshoots the true growth rate

  • Why doesn't an attempt to prove T(n)knT(n) \leq kn, for some k>0k > 0 work?

    • (strong) Inductive Hypothesis: Assume T(m)kmT(m) \leq km, for all 2m<n2 \leq m < n, therefore T(n2)kn2T({n \over 2}) \leq {kn \over 2}
    • Inductive Step: attempt to prove T(n)knT(n) \leq kn:

    \begin{aligned} T(n) &\leq 2T({n \over 2}) + cn \\ &\leq 2({kn \over 2}) + cn \\ &= kn + cn = (k+c)n \\ &\not\leq kn \\ \end{aligned}

    • The induction breaks down: no constant kk can absorb the extra cncn term, so a linear bound cannot be established this way

Partial Substitution

  • Guess that the solution is knlog2nkn \log_2 n

  • Substitute guess into the recurrence relation to check what value of kk will satisfy the recurrence relation

  • kck \geq c will work

Proof for all Values of nn

  • We assumed that nn is a power of 2
  • How do we generalize the proof?
  • Basic axiom: T(n)T(n+1)T(n) \leq T(n+1), for all nn: worst case running time increases as input size increases
  • Let mm be the smallest power of 2 larger than nn
  • T(n)T(m)=O(mlogm)=O(nlogn)T(n) \leq T(m) = O(m \log m) = O(n \log n) because m2nm \leq 2n

Other Recurrence Relations

  • Divide into qq sub-problems of size n/2n/2 and merge in O(n)O(n) time

  • two distinct cases: q=1,q>2q = 1, q > 2

  • Divide into two subproblems of size n/2n/2 and merge in O(n2)O(n^2) time

  • Consider: T(n)=qT(n/2)+cn,q=1T(n) = q T(n/2) + cn, q = 1 (finding the median in a set of data)

    • Each invocation reduces the problem by a factor of 2 -> there are logn\log n levels in the recursion tree
    • At level ii of the tree, the problem size is n/2in/2^i and the work done is cn/2icn/2^i
    • Therefore the total work done is

    \displaystyle\sum^{i = \log n}_{i=0} \frac{cn}{2^i} \leq 2cn = O(n)

    since

    S=1+12+14+18+...2S=2+1+12+14+18+...2SS=2\begin{aligned} S &= 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + ... \\ 2S &= 2 + 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + ... \\ 2S-S &= 2 \end{aligned}
  • What about q>2q > 2: T(n)=qT(n/2)+cnT(n) = qT(n/2) + cn?

  • logn\log n levels in the recursion tree

  • At level ii of the tree, there are qiq^i sub problems, each of size n/2in/2^i

  • Total work done at level ii is qicn/2iq^icn/2^i, therefore the total work is

    \begin{aligned} T(n) &\leq \displaystyle\sum^{i = \log_2 n}_{i=0} q^i \frac{cn}{2^i} = cn \displaystyle\sum^{i = \log_2 n}_{i=0} \Big(\frac{q}{2}\Big)^i \\ &\leq O \Big( cn \Big(\frac{q}{2}\Big)^{\log_2 n} \Big) = O \Big(cn \text{ } n^{\log_2 (q/2)} \Big) \\ &= O(cn \text{ } n^{\log_2 q - 1}) = O(n^{\log_2 q}) \end{aligned}

Counting Inversions

  • Motivation:

    • collaborative filtering - match one user's preferences to those of other users
    • Meta-search engines - merge results of multiple search engines into a better search results
  • Two rankings of things are very similar if they have few inversions

    • Two rankings a,ba,b have an inversion if there exists i,ji,j such that i<ji < j but ai>aja_i > a_j
    • the number of inversions ss is a measure of the difference between the rankings
  • Kendall's rank correlation of two rankings of nn items with ss inversions is 14sn(n1)1 - \frac{4s}{n(n-1)}

  • Instance: A list L=x1,x2,...,xnL = x_1, x_2, ..., x_n of distinct integers between 1,n1, n

  • Solution: The number of pairs (i,j),1i<jn(i,j), 1 \leq i < j \leq n s.t. xi>xjx_i > x_j

  • How many inversions can there be in a list of nn numbers?

    • n(n1)2=Ω(n2)\frac{n(n-1)}{2} = \Omega(n^2) since every inversion involves a distinct pair of numbers
  • Sorting removes all inversion in O(nlogn)O(n \log n) time. Can we modify mergesort to count inversions?

  • Candidate algorithm:

    • Partition LL into A,BA,B of size n/2n/2 each
    • Recursively count the number of inversions in AA and sort AA
    • Recursively count the number of inversions in BB and sort BB
    • Count the number of inversions involving one element in AA and one element in BB

"Conquer" step

  • Given lists A=a1,a2,...,amA = a_1, a_2, ..., a_m and B=b1,b2,...,bmB = b_1, b_2, ..., b_m, compute the number of pairs aia_i and bjb_j s.t. ai>bja_i > b_j
    • This step is much easier if A,BA,B are sorted
  • Merge-And-Count Procedure:
    • Maintain a current pointer for each list
    • Maintain a variable count initialized to 0
    • Initialize each pointer to the front of each list
    • While both lists are non-empty
      • Let ai,bja_i, b_j be the elements pointed to by the current pointers
      • Append the smaller of the two to the output list
      • Do something clever in O(1)O(1) time: if bj<aib_j < a_i then count += m - i + 1, since bjb_j is inverted with aia_i and every element of AA after it
      • Advance the current pointer of the list containing the smaller element
    • Append the rest of the non-empty list to the output
    • Return count and the merged list
    • O(m)O(m)

Sort and Count

\boxed{ \begin{aligned} &\text{If the list has one element} \\ &\quad\text{There are no inversions and the list is sorted} \\ &\text{Divide the list into two halves} \\ &\quad A \leftarrow \text{first } \lfloor n/2 \rfloor \text{ elements} \\ &\quad B \leftarrow \text{remaining } \lceil n/2 \rceil \text{ elements} \\ &(r_A, A) = \text{Sort-and-Count}(A) \\ &(r_B, B) = \text{Sort-and-Count}(B) \\ &(r, L) = \text{Merge-and-Count}(A, B) \\ &\text{Return } r_A + r_B + r \text{ and the sorted list } L \\ \end{aligned} }
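The completed Sort-and-Count procedure in Python, with Merge-and-Count inlined into the merge loop:

```python
def sort_and_count(L):
    """Counts inversions while mergesorting: returns (count, sorted L)."""
    n = len(L)
    if n <= 1:                    # one element: sorted, no inversions
        return 0, L[:]
    A, B = L[:n // 2], L[n // 2:]
    rA, A = sort_and_count(A)     # inversions within the first half
    rB, B = sort_and_count(B)     # inversions within the second half
    # Merge-and-Count: count pairs with a_i > b_j across the two halves
    r, i, j, out = 0, 0, 0, []
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            out.append(A[i])
            i += 1
        else:                     # B[j] < A[i]: inverted with A[i:] entirely
            r += len(A) - i
            out.append(B[j])
            j += 1
    out.extend(A[i:])
    out.extend(B[j:])
    return rA + rB + r, out
```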

Proof of Correctness

  • Proof by induction: Strategy
    • a) every inversion in the data is counted exactly once
    • b) no non-inversions are counted
  • Base case: n=1n = 1
  • Inductive Hypothesis: Algorithm counts number of inversion correctly for all sets of n1n-1 or fewer numbers
  • Inductive Step: Consider an arbitrary inversion, i.e., any pair k,lk,l such that k<lk < l but xk>xlx_k > x_l. When is this inversion counted by the algorithm
    • k,ln/2:xk,xlAk,l \leq \lfloor n/2 \rfloor: x_k, x_l \in A, counted in rAr_A by the inductive hypothesis
    • k,l>n/2:xk,xlBk, l > \lfloor n/2 \rfloor: x_k, x_l \in B, counted in rBr_B by the inductive hypothesis
    • kn/2<l:xkA,xlBk \leq \lfloor n/2 \rfloor < l: x_k \in A, x_l \in B. Is this inversion counted by Merge-And-Count? Yes: when xlx_l is output to the final list is when we count this inversion
    • Why is no non-inversion counted? When xlx_l is output, it is smaller than all remaining elements in AA since AA is sorted

09 - Dynamic Programming

Motivation

  • Goal: design efficient polynomial time algorithms
  • Greedy:
    • pro: natural approach to algorithm design
    • con: many greedy approaches; only some may work
    • con: many problems for which no greedy approach is known
  • Divide and conquer
    • pro: simple to develop algorithm skeleton
    • con: conquer step can be very hard to implement efficiently
    • con: usually reduces time for a problem known to be solvable in polynomial time
  • Dynamic Programming
    • more powerful than greedy and divide and conquer strategies
    • implicitly explore the space of all possible solutions
    • solve multiple sub-problems and build up correct solutions to larger and larger subproblems
    • careful analysis needed to ensure number of sub-problems solved is polynomial in the size of the input

Example: Greedy Scheduling, but using a DP approach

  • Input: start and finish times of nn rides: Set {(s(i),f(i)),1in}\{(s(i), f(i)), 1 \leq i \leq n\}
  • Constraint: cannot be in two places at one time
  • Goal: maximum number of rides you can go on: largest subset of mutually compatible jobs
  • EFT: sort jobs in order of increasing finish time, and add compatible jobs to the set
  • Use DP to solve a Weighted Interval Scheduling problem:
    • Instance: nonempty set {(si,fi),1in}\{(s_i, f_i), 1 \leq i \leq n\} and a weight vi0v_i \geq 0 associated with each job
    • Solution: A set SS of mutually compatible jobs such that the value iSvi\sum_{i \in S} v_i is maximized
    • Greedy algorithm can produce arbitrarily bad results for this problem depending on weights

Detour: Binomial Identity

      1
    1   1
  1   2   1
1   3   3   1
  • sum of a row = 2n2^n
  • Value of a cell kk in row nn = (nk)n \choose k
  • each element is the sum of the two elements above it
  • (nr)=(n1r1)+(n1r){n \choose r} = {n-1 \choose r-1} + {n-1 \choose r}

Approach

  • Sort jobs in increasing order of finish time and relabel: f1f2...fnf_1 \leq f_2 \leq ... \leq f_n
  • Job ii comes before job jj if i<ji < j
  • p(j)p(j) is the largest index i<ji < j such that job ii is compatible with job jj
    • p(j)=0p(j) = 0 if there is no such job ii
    • All jobs that come before p(j)p(j) are also compatible with job jj

Sub-problems

  • Let OO be the optimal solution: it contains a subset of the input jobs. Two cases to consider:
    • Case 1: Job nn is not in OO.
      • OO must be the optimal solution for jobs {1,2,...,n1}\{1, 2, ..., n - 1\}
    • Case 2: Job nn is in OO
      • OO cannot use incompatible jobs {p(n)+1,p(n)+2,...,n1}\{p(n) + 1, p(n) + 2, ..., n - 1\}
      • Remaining jobs in OO must be the optimal solution for jobs {1,2,...,p(n)}\{1, 2, ..., p(n)\}
    • One of the two cases must be true: OO is the best of the two choices
    • Suggests finding optimal solution for sub-problems consisting of jobs {1,2,...,j1,j}\{1, 2, ..., j - 1, j\}, for all values of jj

Recursion

  • Let OjO_j be the optimal solution for jobs {1,2,...,j}\{1, 2, ..., j\} (a set of jobs) and OPT(j)OPT(j) be the value of this solution: OPT(0)=0OPT(0) = 0
  • We are seeking OnO_n with a value of OPT(n)OPT(n), building up from optimal solutions to smaller sub-problems
  • To compute OPT(j)OPT(j):
    • Case 1: jOjj \notin O_j: OPT(j)=OPT(j1)OPT(j) = OPT(j-1)
    • Case 2: jOjj \in O_j: OPT(j)=vj+OPT(p(j))OPT(j) = v_j + OPT(p(j)) // p(j)<jp(j) < j
    • OPT(j)=max(vj+OPT(p(j)),OPT(j1))OPT(j) = \max (v_j + OPT(p(j)), OPT(j - 1))
  • When does job jj belong to OjO_j?
    • Iff OPT(j)=vj+OPT(p(j))OPT(j) = v_j + OPT(p(j))
Compute OPT(j)If j=0Return 0ElseReturn max(vj+OPT(p(j)),OPT(j1))\boxed{ \begin{aligned} &\text{Compute } OPT(j) \\ &\text{If } j = 0 \\ &\quad\text{Return } 0 \\ &\text{Else} \\ &\quad\text{Return } \max (v_j + OPT(p(j)), OPT(j-1)) \\ \end{aligned} }

Running Time of Recursive Algorithm

  • It can be exponential in nn when p(j)=j2,j2p(j) = j - 2 , \quad\forall j \geq 2: recursive calls are for j1,j2j-1, j-2
  • However, the same sub-problems get used over and over again, we can cache them

Memoisation

  • Store OPT(j)OPT(j) values in a cache using m_compute_opt(j) and reuse them rather than compute them:
M-Compute-OPT(j)If j=0Return 0Else if M[j] is not emptyReturn M[j]ElseM[j]max(vj+M-Compute-OPT(p(j)),M-Compute-OPT(j1))Return M[j]\boxed{ \begin{aligned} &\texttt{M-Compute-OPT}(j) \\ &\text{If } j = 0 \\ &\quad\text{Return } 0 \\ &\text{Else if } M[j] \text{ is not empty} \\ &\quad \text{Return } M[j] \\ &\text{Else} \\ &\quad M[j] \leftarrow \max (v_j + \texttt{M-Compute-OPT}(p(j)), \texttt{M-Compute-OPT}(j-1)) \\ &\quad\text{Return } M[j] \\ \end{aligned} }
  • Claim: running time of algorithm is O(n)O(n) after sorting
  • Time spent in a single call to M-Compute-Opt is O(1)O(1) apart from time spent in recursive calls
  • Total time spent is the order of the number of recursive calls to M-Compute-Opt
  • How many such recursive calls are there in total?
  • Use number of filled entries in MM as a measure of progress
  • Each time M-Compute-Opt issues two recursive calls, it fills in a new entry in MM
  • Therefore, total number of recursive calls is O(n)O(n)

Computing OO in addition to OPT(n)OPT(n)

  • We could explicitly store OjO_j in addition to OPT(j)OPT(j), but we can instead recover OjO_j from the values in MM
  • Recall: request jj belong to OjO_j iff vj+OPT(p(j))OPT(j1)v_j + OPT(p(j)) \geq OPT(j -1)
  • Can recover OjO_j from values of optimal solutions in O(j)O(j) time
Find-Solution(j)If j=0Return nothingElseIf vj+M[p(j)]M[j1] thenReturn j,Find-Solution(p(j))ElseReturn Find-Solution(j1)\boxed{ \begin{aligned} &\texttt{Find-Solution}(j) \\ &\text{If } j = 0 \\ &\quad\text{Return nothing} \\ &\text{Else} \\ &\quad\text{If } v_j + M[p(j)] \geq M[j - 1] \text{ then} \\ &\qquad\text{Return } j, \texttt{Find-Solution}(p(j)) \\ &\quad\text{Else} \\ &\qquad\text{Return } \texttt{Find-Solution}(j-1) \\ \end{aligned} }

From Recursion to Iteration

  • Unwind the recursion and convert it into iteration
  • Can compute values of MM iteratively in O(n)O(n) time
  • Find solution works as before
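The iterative table-filling plus Find-Solution can be sketched in Python (a sketch of my own; jobs are (start, finish, value) triples, and p(j)p(j) is computed with the standard-library `bisect_right` on the sorted finish times):

```python
from bisect import bisect_right

def weighted_interval_scheduling(jobs):
    """jobs: list of (start, finish, value). Returns (OPT value, chosen jobs)."""
    jobs = sorted(jobs, key=lambda job: job[1])          # sort by finish time
    finishes = [f for _, f, _ in jobs]
    # p[j] = largest index i < j (1-based) with finish(i) <= start(j)
    p = [0] + [bisect_right(finishes, s) for s, _, _ in jobs]
    n = len(jobs)
    M = [0] * (n + 1)
    for j in range(1, n + 1):                            # iterative version of the recurrence
        v = jobs[j - 1][2]
        M[j] = max(v + M[p[j]], M[j - 1])
    # Find-Solution: trace back through M
    chosen, j = [], n
    while j > 0:
        v = jobs[j - 1][2]
        if v + M[p[j]] >= M[j - 1]:
            chosen.append(jobs[j - 1]); j = p[j]         # job j is in O_j
        else:
            j -= 1
    return M[n], chosen[::-1]
```

Both loops are O(n)O(n) after the O(nlogn)O(n \log n) sort, matching the analysis above.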

Basic Outline of Dynamic Programming

  • need a collection of sub-problems that satisfy a few properties
  • There are a polynomial number of sub problems
  • The solution to the problem can by computed easily from the solutions to the sub problems
  • There is a natural ordering from "smallest" to "largest"
  • There is an easy to compute recurrence that allows us to compute the solution to the sub-problems from solutions to some smaller sub problems

Least Squares Problem

  • Given a set of scientific or statistical data plotted on two axes, find the best line that passes through these points
  • Instance: Set P={(x1,y1),(x2,y2),...,(xn,yn)}P = \{(x_1, y_1), (x_2, y_2), ... , (x_n, y_n)\} of nn points
  • Solution: Line L:y=ax+bL: y = ax + b that minimizes Error(L,P)=i=1n(yiaxib)2Error(L, P) = \displaystyle\sum_{i=1}^n(y_i - ax_i - b)^2
  • How many unknown parameters must we find values for?
    • a,ba,b via Lagrangian
    • a=nxiyi(xi)(yi)nxi2(xi)2a = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{n \sum x_i^2 - (\sum x_i)^2} and b=yiaxinb = \frac{\sum y_i - a \sum x_i}{n}

Segmented Least Squares

  • Want to fit multiple lines through PP

  • Each line must fit contiguous set of x-coordinates

  • Lines must minimize total error

  • Divide the points into segments; each segment contains a consecutive set of points ordered by increasing x-coordinate

  • Instance: Set P={pi=(xi,yi),1in}P = \{p_i = (x_i, y_i), 1 \leq i \leq n\} of nn points, x1<x2<...<xnx_1 < x_2 < ... < x_n and a parameter C>0C > 0

  • Solution

    • an integer kk
    • a partition of PP into kk segments {P1,P2,...,Pk}\{P_1, P_2, ..., P_k\}
    • for each segment PjP_j, the best fit line Lj:y=ajx+bj,1jkL_j: y = a_jx + b_j, 1 \leq j \leq k that minimises the total error j=1kError(Lj,Pj)+Ck\displaystyle\sum_{j=1}^k Error(L_j, P_j) + Ck
  • if CC is very large then one line works, if CC is small, then we fit many lines

  • How many unknown parameters here? 2k+12k+1, since we need to find the slope and intercept of kk lines as well as the optimal value of kk itself

Formulating the Recursion

  • Let eje_j denote the minimum error of a (single) line that fits {p1,p2,...,pi}\{p_1, p_2, ..., p_i\}

  • Let OPT(i)OPT(i) be the optimal total error for the points {p1,p2,...,pi}\{p_1, p_2, ..., p_i\}

  • We want to compute OPT(n)OPT(n)

  • Observation: Where does the last segment in the optimal solution end? pnp_n, and this segment starts at some point pip_i. So the rest of the solution must be the optimal fit for the points p1,...,pi1p_1, ..., p_{i-1}: OPT(i1)OPT(i-1)

  • OPT(n)OPT(n) is the error for the line that goes through points i,ni, n plus the penalty for adding a line CC plus the best solution for the last i1i-1 points

  • If the last segment in the optimal solution is {pi,pi+1,...,pn}\{p_i, p_{i+1}, ..., p_n\}, then

    OPT(n)=ei,n+C+OPT(i1)OPT(n) = e_i,n + C + OPT(i-1)

  • Suppose we want to solve the sub-problem on the first jj points: {p1,p2,...,pj}\{p_1, p_2, ..., p_j\}: we want to compute OPT(j)OPT(j)

  • If the last segment in the optimal partition is {pi,pi+1,...,pj}\{p_i, p_{i+1}, ..., p_j\} then

    OPT(j)=ei,j+C+OPT(i1)OPT(j) = e_i,j + C + OPT(i-1)

  • But ii can take only jj distinct values: 1,2,3,...,j1, 2, 3, ..., j therefore:

    OPT(j)=min1ij(ei,j+C+OPT(i1))OPT(j) = \min \limits_{1 \leq i \leq j} (e_i,j + C + OPT(i-1))

  • Segment {pi,pi+1,...,pj}\{p_i, p_{i+1}, ..., p_j\} is part of the optimal solution for this sub-problem iff the value of OPT(j)OPT(j) is obtained using index ii

DP Algorithm

Segmented-Least-Squares(n)Array M[0,...,n]Set M[0]=0For all pairs ijCompute the least squares error ei,j for the segment pi,...,pjFor j=1,2,...,nUse the recurrence from (6.7) to compute M[j]Return M[n]\boxed{ \begin{aligned} &\texttt{Segmented-Least-Squares}(n) \\ &\text{Array } M[0, ..., n] \\ &\text{Set } M[0] = 0 \\ &\text{For all pairs } i \leq j \\ &\quad\text{Compute the least squares error } e_{i,j} \text{ for the segment } p_i, ..., p_j\\ &\text{For } j = 1,2, ..., n \\ &\quad\text{Use the recurrence from (6.7) to compute } M[j] \\ &\text{Return } M[n] \end{aligned} }
  • Computing OPT(j)OPT(j) takes O(j)O(j)
  • j=1nO(j)=O(n2)\sum_{j=1}^n O(j) = O(n^2)
1ijO(ji)=j(j1)/2j2/2j2\begin{aligned} \displaystyle\sum_{1 \leq i \leq j} O(j - i) &= j(j-1)/2 \\ &\leq \sum j^2/2 \\ &\leq \sum j^2 \end{aligned}
  • Let T(n)T(n) be the running time of this algorithm (jij - i since we compute segments)

    T(n)=ijn1ijO(ji)=O(n3)T(n) = \displaystyle\sum_{i \leq j \leq n}\sum_{1 \leq i \leq j} O(j - i) = O(n^3)


RNA Molecules

  • basic biological molecule. Single stranded
  • RNA molecules fold into complex "secondary structures"
  • Secondary structure governs the behavior of an RNA molecule
  • Various rules govern secondary structure formation
  • Pairs of bases match up; each base matches with ≤ 1 other base
  • Adenine : Uracil
  • Cytosine : Guanine
  • There are no kinks in the folded molecule
  • Structures are "knot-free"
  • High level problem: given an RNA molecule, predict its secondary structure

Formulating the Problem

  • an RNA Molecule is a string B=b1b2...bnB = b_1 b_2 ... b_n; each bi{A,C,G,U}b_i \in \{A, C, G, U\}
  • A secondary structure on BB is a set of pairs S={(i,j)}S = \{(i,j)\} where 1i,jn1 \leq i,j \leq n and
    • No kinks: if (i,j)S(i,j) \in S then i<j4i < j - 4
    • Watson-Crick: The elements in each pair of SS consist of either {A,U}\{A, U\} or {C,G}\{C, G\}
    • SS is a matching: no index appears in more than one pair
    • No knots: If (i,j)(i, j) and (k,l)(k, l) are two pairs in SS, then we cannot have i<k<j<li < k < j < l
  • The Energy of a secondary structure \propto the number of base pairs in it.
  • Problem: Compute the largest secondary structure, i.e. with the largest number of base pairs

Dynamic Programming Approach

  • OPT(j)OPT(j) is the maximum number of base pairs in a secondary structure for b1b2...bjb_1b_2...b_j, OP(j)=0,j5OP(j) = 0, j \leq 5
  • In the optimal secondary structure on b1b2...bjb_1b_2...b_j
    • If jj is not a member of any pair, use OPT(j1)OPT(j-1)
    • If jj pairs with some t<j4t < j - 4, then the knot condition yields tow independent sub-problems: OPT(t1)OPT(t-1), ???
      • Note that we need that sub-problems to be indexed by both start and by end
  • OPT(i,j)OPT(i,j) is the maximum number of base pairs in a secondary structure for bibi+1...bjb_ib_{i+1}...b_j, OPT(i,j)=0,ij4OPT(i,j) = 0, i \geq j-4
  • In the optimal secondary structure on bibi+2...bjb_ib_{i+2}...b_j
    • If jj is not a member of any pair, compute OPT(i,j1)OPT(i, j-1)
    • If jj pairs with somme t<j4t < j -4, compute OPT(i,t1)OPT(i, t-1) and OPT(t+1,j1)OPT(t+1, j-1)

OPT(i,j)=max(OPT(i,j1),maxt(1+OPT(i,t1)+OPT(t+1,j1)))OPT(i,j) = \max \Big( OPT(i, j-1), \max \limits_t (1 + OPT(i, t-1) + OPT(t+1, j-1)) \Big)

  • In the "inner" maximization, tt runs over all indices between i,j5i,j-5 that are allowed to pair with jj
  • There are O(n2)O(n^2) sub-problems: we have a matrix of i,ji,j with the lower right triangle filled out: j=5ni=1j=n3\sum_{j=5}^n \sum_{i=1}^j = n^3 since the inner summation is less than j(j1)/2j(j-1)/2 which we then sum another nn times
  • How do we order them from "smallest" to "largest"? increasing order of jij-i
  • Note that computing OPT(i,j)OPT(i,j) takes O(ji)O(j-i) time: it involves the sub-problems OPT(l,m)OPT(l,m) where ml<jim - l < j - i
Initialize OPT(i,j) = 0 whenever ij4For k=5,6,...,n1For i=1,2,...,nkSet j=i+kCompute OPT(i,j) using the recurrence above Return OPT(1,n)\boxed{ \begin{aligned} &\text{Initialize OPT(i,j) = 0 whenever } i \geq j -4\\ &\text{For } k = 5,6,...,n-1 \\ &\quad\text{For } i = 1,2,...,n-k \\ &\qquad\text{Set } j = i + k \\ &\qquad\text{Compute } OPT(i,j) \text{ using the recurrence above }\\ &\text{Return } OPT(1,n) \end{aligned} }
  • Total running time for the algorithm is O(n3)O(n^3): O(n2)O(n^2) sub-problems, each taking O(n)O(n) time
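The interval-length ordering of the sub-problems can be sketched in Python (my own sketch; `OPT[i][j]` follows the recurrence above, with the no-kink condition t<j4t < j - 4 enforced in the inner loop):

```python
def rna_secondary_structure(b):
    """Max number of base pairs in a knot-free secondary structure of RNA string b."""
    pairs = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}   # Watson-Crick pairs
    n = len(b)
    # OPT[i][j] = max pairs for substring b[i..j]; stays 0 when i >= j - 4
    OPT = [[0] * n for _ in range(n)]
    for k in range(5, n):                 # interval length j - i, smallest to largest
        for i in range(n - k):
            j = i + k
            best = OPT[i][j - 1]          # case: j unpaired
            for t in range(i, j - 4):     # case: j pairs with t, no kinks (t < j - 4)
                if (b[t], b[j]) in pairs:
                    left = OPT[i][t - 1] if t > i else 0
                    best = max(best, 1 + left + OPT[t + 1][j - 1])
            OPT[i][j] = best
    return OPT[0][n - 1] if n else 0
```

The three nested loops make the O(n3)O(n^3) bound explicit.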

Dynamic Programming for Shortest Path

Motivation

  • Computational finance
    • Each node is a financial agent
    • The cost cuvc_{uv} of an edge (u,v)(u,v) is the cost of a transaction in which we buy from agent uu and sell to agent vv
    • Negative cost corresponds to profit
  • Internet routing protocols
    • Dijkstra's algorithm needs knowledge of the entire network
    • Routers only know which other routers they are connected to
    • Algorithm for shortest paths with negative edges is decentralized
    • Not covered in class, but described in Chapter 6.9

Problem Statement

  • Input: a directed graph G=(V,E)G = (V,E) with a cost function c:ERc: E \rightarrow \reals i.e., cuvc_{uv} is the cost of the edge (u,v)E(u,v) \in E
  • A negative cycle is a directed cycle whose edges have a total cost that is negative
  • Two related problems:
    • If GG has no negative cycles, find the shortest s-t path: a path from source ss to destination tt with minimum total cost
    • Does GG have a negative cycle?

Approaches for Shortest Path Algorithm

  1. Dijkstra's algorithm
  • Computes incorrect answers with negative edge costs because it is greedy
  2. Add some large constant to each edge cost
  • Computes incorrect answers because the minimum cost path changes

DP Approach

  • Assume GG has no negative cycles
  • Claim: there is a shortest path from ss to tt that is simple (does not repeat a node) and therefore has at most n1n-1 edges
  • How do we define sub-problems?
    • Proof: If a node uu repeats on the sts \rightarrow t path, the portion between the two appearances of uu is a cycle with nonnegative cost (since GG has no negative cycles), so removing it cannot increase the cost
    • Shortest sts \rightarrow t path has n1\leq n -1 edges: how can we reach tt using ii edges, for different values of ii?
    • We do not know which nodes will be in shortest sts \rightarrow t path: how we can reach tt from each node in VV
  • Sub-problems defined by varying the number of edges in the shortest path and by varying the starting node in the shortest path

DP Recursion

  • OPT(i,v)OPT(i,v): minimum cost of a vtv \rightarrow t path that uses at most ii edges
  • tt is not explicitly mentioned in the sub-problems
  • Goal is to compute OPT(n1,s)OPT(n - 1, s)
  • Let PP be the optimal path whose cost is OPT(i,v)OPT(i,v)
    • If PP actually uses i1i -1 edges, then OPT(i,v)=OPT(iv)OPT(i,v) = OPT(i - v)
    • If first node on PP is ww, then OPT(i,v)=cvw+OPT(i1,wOPT(i,v) = c_{vw} + OPT(i -1, w
OPT(0,v)={0;v=t;vt OPT(0, v) = \begin{cases} 0; v = t \\ \infty; v \neq t \end{cases}

OPT(i,v)=min(OPT(iv),minwV(cvw+OPT(i1,w)))OPT(i,v) = \min \Big(OPT(i-v), \min \limits_{w \in V} (c_{vw} + OPT(i -1, w)) \Big)

Alternate DP Formulation

  • OPT(i,v)OPT(i,v): minimum cost of a vtv \rightarrow t path that uses exactly ii edges
  • a bit harder, but it's in the slides

Bellman-Ford Algorithm

Shortest-Path(G,s,t)n=VArray M[0,...,n1,V]Define M[0,t]=0,M[0,v]= for all other vVFor i=1,2,...,n1For each vV (in any order)Compute M[i,v] using the recurrenceReturn M[n1,s]\boxed{ \begin{aligned} &\texttt{Shortest-Path}(G,s,t)\\ &n = |V| \\ &\text{Array } M[0, ... , n-1, V] \\ &\text{Define } M[0,t]=0, M[0,v] = \infty \text{ for all other } v \in V \\ &\text{For } i = 1,2, ..., n -1 \\ &\quad\text{For each } v \in V \text{ (in any order)} \\ &\qquad\text{Compute } M[i,v] \text{ using the recurrence} \\ &\text{Return } M[n - 1, s] \end{aligned} }
  • space used is O(n2)O(n^2), running time is O(n3)O(n^3)
  • If shortest path uses kk edges, we can recover it in O(kn)O(kn) time by tracing back through smaller sub-problems

An Improved Bound on the Running Time

  • Suppose GG has nn nodes and m(n2)m \ll {n \choose 2} edges. Can we demonstrate a better upper bound on the running time

M[i,v]=min(M[i1,v],minwNv(cwv+M[i1,w]))M[i,v] = \min \Big( M[i-1, v], \min \limits_{w \in N_v} (c_{wv} + M[i-1, w]) \Big)

  • ww only needs to range over outgoing neighbors NvN_v of vv
  • if nv=Nvn_v = |N_v| is the number of outgoing neighbors of vv, then in each round, we spend time equal to

vV=m\displaystyle\sum_{v \in V} = m

  • So the total running time can be improved to O(mn)O(mn)

Improving the Memory Requirements

M[i,v]=min(M[i1,v],minwNv(cwv+M[i1,w]))M[i,v] = \min \Big( M[i-1, v], \min \limits_{w \in N_v} (c_{wv} + M[i-1, w]) \Big)

  • Algorithm still uses O(n2)O(n^2) space to store the 2D array MM
  • However, M[i,v]M[i,v] depends only on M[i1,]M[i-1, *] and no other indices
  • Modified algorithm:
    • Maintain two arrays MM and MM' indexed over VV
    • At the beginning of each iteration, copy MM into MM'
    • To update MM, we use

M[v]=min(M[v],minwNv(cwv+M[w]))M[v] = \min \Big( M'[v], \min \limits_{w \in N_v} (c_{wv} + M'[w]) \Big)

  • claim: at the beginning of iteration ii, MM stores values of OPT(i1,v)OPT(i - 1, v) for all nodes vVv \in V
  • space used is O(n)O(n)

Computing the Shortest Path: Algorithm

M[v]=min(M[v],minwNv(cwv+M[w]))M[v] = \min \Big( M'[v], \min \limits_{w \in N_v} (c_{wv} + M'[w]) \Big)

  • How can we recover the shortest path that has cost M[v]M[v]?
  • For each node vv, compute and update f(v)f(v), the first node after vv in the current shortest path from vv to tt
  • Updating f(v)f(v): If xx is the node that attains the minimum in minwNv(Cvw+M[w])\min \limits_{w \in N_v} (C_{vw} + M'[w]) and M[v]>cvx+M[x]M'[v] > c_{vx} + M'[x], set
    • M[v]=cvx+M[x]M[v] = c_{vx} + M'[x]
    • f(v)=xf(v) = x
  • At the end, follow the f(v)f(v) pointers from sts \rightarrow t (and hope for the best lol??)

Computing the Shortest Path: Correctness

  • Pointer graph P(V,F)P(V, F): each edge in FF is (v,f(v))(v, f(v))
    • Can PP have cycles?
    • is there a path from sts \rightarrow t in PP?
    • Can there be multiple paths sts \rightarrow t in PP?
    • Which of these is the shortest path?

Computing the Shortest Path: Cycles in PP

M[v]=min(M[v],minwNv(cvw+M[w]))M[v] = \min \Big(M'[v], \min \limits_{w \in N_v} (c_{vw} + M'[w])\Big)

  • Claim: If PP has a cycle CC, then CC has negative cost.
    • Suppose we set f(v)=wf(v) = w; at this instant M[v]=cvw+M[w]M[v] = c_{vw} + M'[w]
    • Comparing M[w]M[w] and M[w]M'[w], we know that M[w]M[w]M[w] \leq M'[w]
    • Between this assignment and the reassignment of f(v)f(v) to some other node, M[w]M[w] may itself decrease. Hence, in general, M[v]cvw+M[w]M[v] \geq c_{vw} + M[w]
    • Let v1,v2,...,vkv_1, v_2, ..., v_k be the nodes in CC and assume that (vk,v1)(v_k, v_1) is the last edge to have been added.
    • What is the situation just before this addition?
      • M[vi]M[vi+1]cvivi+1M[v_i] - M[v_{i+1}] \geq c_{v_iv_{i+1}} for all 1i<k11 \leq i < k -1, f(vi)=vi+1f(v_i) = v_{i+1}
      • M[vk]M[v1]>cvkv1M[v_k] - M[v_1] > c_{v_kv_1}
      • Adding all these inequalities: 0>i=1k1cvivi+1+cvkv1=0 > \sum_{i=1}^{k -1} c_{v_iv_{i+1}} + c_{v_kv_1} = cost of CC
    • Corollary: if GG has no negative cycles then PP does not either

Computing the Shortest Path: Paths in PP

  • Let PP be the pointer graph upon termination of the algorithm
  • Consider the path PvP_v in PP obtained by following the pointers from vv to f(v)=v1f(v) = v_1 to f(v1)=v2f(v_1) = v_2 and so on
  • Claim: PvP_v terminates at tt
  • Claim PvP_v is the shortest path in GG from vv to tt

Bellman-Ford Algorithm: One Array

M[v]=min(M[v],minwNv(cvw+M[w]))M[v] = \min \Big( M[v], \min \limits_{w \in N_v} (c_{vw} + M[w]) \Big)

  • In general, after ii iterations, the path whose length is M[v]M[v] may have many more than ii edges
  • Early termination: if MM does not change after processing all the nodes, we have computed all shortest paths to tt

10 - Network Flow

Motivation

  • Fundamental problems in combinatorial optimization
  • duality between flow and cuts

Flow Networks

  • Use directed graphs to model transport networks:
    • edges carry traffic and have capacities
    • nodes act as switches
    • source nodes generate traffic, sink nodes absorb traffic
  • A flow network is a directed graph G=(V,E)G = (V,E)
    • each edge eEe \in E has a capacity c(e)>0c(e) > 0
    • There is a single source node sVs \in V
    • There is a single sink node tVt \in V
    • Nodes other than s,ts,t are internal

Defining Flow

  • In a flow network G=(V,E)G = (V,E), an sts-t flow is a function f:ER+f: E \rightarrow \Reals^+ such that:
    • (Capacity conditions) for each eE,0f(e)c(e)e \in E, 0 \leq f(e) \leq c(e)
    • (Conservation conditions) for each internal node vv, e into vf(e)=e out of vf(e)\displaystyle\sum_{e \text{ into } v} f(e) = \displaystyle\sum_{e \text{ out of } v} f(e)
  • The value of a flow is v(f)=e out of sf(e)v(f) = \sum_{e \text{ out of } s} f(e)

Maximum-Flow Problem

  • Instance: a flow network GG
  • Solution: The flow with the largest value in GG, where the maximum is taken over all possible flows on GG
  • Output should assign a flow value to each edge in the graph
  • The flow on each edge should satisfy the capacity condition
  • The flow into and out of each internal node should satisfy the conservation conditions
  • the value of the output flow, i.e., the total flow out of the source node in the output flow, must be the largest possible over all possible flows in GG
  • Assumptions:
    • No edges enter ss, no edges leave tt
    • There is at least one edge incident on each node
    • all edge capacities are integers

Developing the Algorithm

  • No known DP algorithm
  • Greedy approach instead
    • Start with zero flow along all edges
    • find an sts-t path and push as much flow along it as possible
    • idea is to increase flow: push flow along edges with leftover capacity and undo flow along edges already carrying flow

Residual graph

  • Given a flow network G=(V,E)G = (V,E) and a flow ff on GG, the residual graph GfG_f of GG w.r.t. ff is a directed graph such that:
    • (Nodes) GfG_f has the same nodes as GG
    • (Forward edges) for each edge e=(u,v)Ee = (u,v) \in E s.t. f(e)<c(e)f(e) < c(e), GfG_f contains the edge (u,v)(u,v) with a residual capacity c(e)f(e)c(e) - f(e)
    • (Backward edges) for each edge e=(u,v)Ee = (u,v) \in E s.t. f(e)>0f(e) > 0, GfG_f contains the edge e=(v,u)e' = (v,u) with a residual capacity f(e)f(e)

Augmenting Paths in a Residual graph

  • Let PP be a simple sts-t path in GfG_f
  • b=bottleneck(P,f)b = bottleneck(P,f) is the minimum residual capacity of any edge in PP
augment(f,P)Let b=bottleneck(P,f)For (u,v)PIf e=(u,v) is a forward edge in Pincrease f(e) in G by bElse e=(u,v) is a backward edge in Pdecrease f(e) in G by bReturn f\boxed{ \begin{aligned} &\texttt{augment}(f, P) \\ &\text{Let } b = bottleneck(P, f) \\ &\text{For } (u,v) \in P \\ &\quad\text{If } e = (u,v) \text{ is a forward edge in } P \\ &\qquad\text{increase } f(e) \text{ in } G \text{ by } b\\ &\quad\text{Else } e = (u,v) \text{ is a backward edge in } P \\ &\qquad\text{decrease } f(e) \text{ in } G \text{ by } b\\ &\text{Return } f \\ \end{aligned} }
  • ee is forward in GfG_f iff flow increases along eGe \in G
  • e=(v,u)e = (v,u) is backward in GfG_f iff flow decreases along e=(v,u)Ge = (v,u) \in G

Correctness of augment(f,P)\texttt{augment}(f,P)

  • A simple sts-t path in the residual graph is an augmenting path
  • Let ff' be the flow returned by augment(f,P)\texttt{augment}(f,P)
  • Claim: ff' is a flow. Verify capacity and conservation conditions
    • Only need to check edges and internal nodes in PP
    • Capacity condition on e=(u,v)Gfe = (u,v) \in G_f: note that b=bottleneck(P,f)b = bottleneck(P,f) \leq residual capacity of (u,v)(u,v)
    • ee is a forward edge: 0f(e)f(e)=f(e)+bf(e)+(c(e)f(e))=c(e)0 \leq f(e) \leq f'(e) = f(e) + b \leq f(e) + (c(e) - f(e)) = c(e)

  • ee is a backward edge: c(e)f(e)f(e)=f(e)bf(e)f(e)=0c(e) \geq f(e) \geq f'(e) = f(e) - b \geq f(e) - f(e) = 0

  • Conservation condition on internal node vPv \in P – four cases to work out:

Ford-Fulkerson Algorithm

Max-Flow(G)Initially f(e)=0eGwhile there is an s-t path in the residual graph GfLet P be a simple s-t path in Gff=augment(f,P)ffGfGfReturn f\boxed{ \begin{aligned} &\texttt{Max-Flow}(G) \\ &\text{Initially } f(e) = 0 \quad \forall e \in G \\ &\text{while there is an s-t path in the residual graph } G_f \\ &\quad\text{Let } P \text{ be a simple s-t path in } G_f \\ &\quad f' = \texttt{augment}(f, P) \\ &\quad f \leftarrow f' \\ &\quad G_f \leftarrow G_{f'} \\ &\text{Return } f \\ \end{aligned} }
  • Does it terminate? What is it doing?

  • In the example from lecture, the max flow path is sdbcts-d-b-c-t with a bottleneck of b=10b=10; the flow along the bdb-d edge gets augmented

  • Repeat

Analysis of the Ford-Fulkerson Algorithm

  • Running time:
    • Does it terminate? If so, how many iterations, given that each iteration takes O(m)O(m) time?
  • Claim: at each stage, flow values and residual capacities are integers. Prove by induction
  • Claim: Flow value strictly increases when we apply augment(f,P)\texttt{augment}(f,P)
    • v(f)=v(f)+bottleneck(P,f)>v(f)v(f') = v(f) + bottleneck(P,f) > v(f)
    • recall every edge that is incident on ss is a forward edge
    • the flow along these edges will increase by bb
  • Claim: Maximum value of any flow C=e out of sc(e)\leq C = \sum_{e \text{ out of } s} c(e)
  • Claim: Algorithm terminates in at most CC iterations
  • Claim: Algorithm runs in O(mC)O(mC) time
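Ford-Fulkerson can be sketched in Python (my own sketch with integer capacities; augmenting paths are found by DFS in the residual graph, so the O(mC)O(mC) bound above applies):

```python
from collections import defaultdict

def max_flow(capacity, s, t):
    """Ford-Fulkerson with DFS augmenting paths.
    capacity: dict {(u, v): integer capacity of edge u -> v}."""
    # residual[u][v] = remaining capacity on edge u -> v; backward edges start at 0
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), c in capacity.items():
        residual[u][v] += c
    flow = 0
    while True:
        # find an s-t path in the residual graph by DFS
        parent = {s: None}
        stack = [s]
        while stack and t not in parent:
            u = stack.pop()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            return flow                   # no augmenting path: flow is maximum
        # recover the path, find its bottleneck, and augment
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= b
            residual[v][u] += b           # backward edge lets later paths undo this flow
        flow += b
```

With integer capacities each augmentation increases the flow by at least 1, so the loop terminates within CC iterations.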

Correctness of the Ford-Fulkerson Algorithm

  • How large can the flow be?
  • Can we characterize the magnitude of the flow in terms of the structure of the graph? For example, for every flow f,v(f)C=e out of sc(e)f, v(f) \leq C = \sum_{e \text{ out of } s} c(e)
  • Is there a better bound?
  • Proof strategy:
    • Define sts-t cut, its capacity, and flow in and out of the cut
    • For any sts-t cut, prove any flow \leq its capacity
    • Define a specific sts-t cut at the end of the Ford-Fulkerson algorithm
    • Prove that the flow across this cut equals its capacity

sts-t Cuts and Capacities

  • An sts-t cut is a partition of VV into sets AA and BB such that sAs \in A and tBt \in B
  • Capacity of the cut(A,B)cut(A, B) is

c(A,B)=e out of Ac(e)c(A,B) = \displaystyle\sum_{e \text{ out of } A} c(e)

  • Intuition: For every flow f,v(f)c(A,B)f, v(f) \leq c(A,B)

Fun Facts about Cuts

  • Let ff be any sts-t flow and (A,B)(A,B) any sts-t cut
  • Claim: f(v)=fout(A)fin(A)f(v) = f_{out}(A) - f_{in}(A)
    • v(f)=fout(s)v(f) = f_{out}(s) and fin(s)=0v(f)=fout(s)fin(s)f_{in}(s) = 0 \rArr v(f) = f_{out}(s) - f_{in}(s)
    • For every other node vAv \in A, 0=fout(v)fin(v)0 = f_{out}(v) - f_{in}(v)
    • Summing up all these equations, v(f)=vA(fout(v)fin(v))v(f) = \sum_{v \in A}(f_{out}(v) - f_{in}(v))
      • An edge ee that has both ends in AA or both ends out of AA does not contribute
      • An edge ee that has its tail in AA contributes f(e)f(e)
      • An edge ee that has its head in AA contributes f(e)-f(e)
    • vA(fout(v)fin(v)=e out of Af(e)e into Af(e)=fout(A)fin(A)\sum_{v \in A} (f_{out}(v) - f_{in}(v) = \sum_{e \text { out of } A} f(e) - \sum_{e \text{ into A} } f(e) = f_{out}(A) - f_{in}(A)
  • Corollary: v(f)=fin(B)fout(B)v(f) = f_{in}(B) - f_{out}(B)

Important Fact about Cuts

  • v(f)c(A,B)v(f) \leq c(A,B)
v(f)=fout(A)fin(A)fout=e out ofAf(e)e out ofAc(e)=c(A,B)\begin{aligned} v(f) &= f_{out}(A) - f_{in}(A) \\ &\leq f_{out} = \displaystyle\sum_{e \text{ out of} A} f(e) \\ &\leq \displaystyle\sum_{e \text{ out of} A} c(e) = c(A,B) \end{aligned}

Max-Flows and Min-Cuts

  • Let ff be any sts-t flow and (A,B)(A,B) any sts-t cut. We proved v(f)c(A,B)v(f) \leq c(A,B)
  • Very strong statement: The value of every flow is \leq capacity of any cut
  • Corollary: The maximum flow is at most the smallest capacity of a cut
  • Question: Is the reverse true? Is the smallest capacity of a cut at most the maximum flow?
  • Answer: Yes, the Ford-Fulkerson algorithm computes this cut!

Flows and Cuts

  • Let f\overline f denote the flow computed by the Ford-Faulkerson algorithm
  • Enough to show st\exist s-t cut (A,B)(A^*,B^*) such that v(f)=c(A,B)v(\overline f) = c(A^*, B^*)
  • When the algorithm terminates, the residual graph has no sts-t path
  • Claim: If ff is an sts-t flow such that GfG_f has no sts-t path, then there is an sts-t cut (A,B(A^*, B^* such that v(f)=c(A,B)v(f) = c(A^*, B^*)
    • Claim applies to any flow ff such that GfG_f has no sts-t path, and not just to the flow f\overline f computed by the Ford-Faulkerson algorithm

Proof of Claim Relating Flows to Cuts

  • Claim: ff is an sts-t flow and GfG_f has no sts-t path st\rArr \exist s-t cut (A,B),v(f)=c(A,B)(A^*,B^*), v(f) = c(A^*,B^*)
  • A=A^* = set of nodes reachable from ss in Gf,B=VAG_f, B^* = V - A^*
  • Claim: (A,B)(A^*,B^*) is an sts-t cut in GG
  • Claim: If e=(u,v)e = (u,v) such that uA,vBu \in A^*, v \in B^*, then f(e)=c(e)f(e) = c(e)
  • Claim: If e=(u,v)e' = (u', v') such that uB,vAu' \in B^*, v' \in A^*, then f(e)=0f(e') = 0
  • Claim v(f)=c(A,B)v(f) = c(A^*, B^*)
v(f)=fout(A)fin(A)=e out of Af(e)e into Af(e)=e out of Ac(e)e into A0=c(A,B)\begin{aligned} v(f) &= f_{out}(A^*) - f_{in}(A^*) \\ &= \displaystyle\sum_{e \text{ out of } A^*} f(e) - \displaystyle\sum_{e \text{ into } A^*} f(e) \\ &= \displaystyle\sum_{e \text{ out of } A^*} c(e) - \displaystyle\sum_{e \text{ into } A^*} 0 \\ &= c(A^*, B^*) \\ \end{aligned}

Max-Flow Min-Cut Theorem

  • The flow f\overline f computed by the Ford-Faulkerson algorithm is a maximum flow
  • Given a flow of maximum value, we can compute a minimum sts-t cut in O(m)O(m) time
  • In every flow network, there is a flow ff and a cut (A,B)(A,B) such that v(f)=c(A,B)v(f) = c(A,B)
  • Max-Flow Min-Cut Theorem: in every flow network, the maximum value of an sts-t flow is equal to the minimum capacity of an sts-t cut
  • Corollary: If all capacities in a flow network are integers, then there is a maximum flow ff where f(e)f(e), the value of the flow on edge ee, is an integer for every edge eGe \in G

Real-Valued Capacities

  • If capacities are real-valued, the Ford-Fulkerson algorithm may not terminate!
  • But Max-Flow Min-Cut theorem is still true, why?

Improving the Ford-Fulkerson Algorithm

  • Bad case for Ford-Fulkerson is when the bottleneck edge in the augmenting path has a low capacity
  • Idea: decrease number of iterations by picking sts-t path with bottleneck edge of largest capacity. Computing this path can slow down each iteration considerably.
  • Modified idea: Maintain a scaling parameter Δ\Delta and choose only augmenting paths with bottleneck capacity at least Δ\Delta
  • Gf(Δ)G_f(\Delta): residual network restricted to edges with residual capacities Δ\geq \Delta

11 - Network Flow Applications

Maximum Flow and Minimum Cut Problems

Matching in Bipartite Graphs

  • Bipartite Graph a graph G(V,E)G(V,E) where
    • V=XY,X,YV = X \cup Y, X,Y are disjoint and
    • EX×YE \subseteq X \times Y
  • Bipartite graphs model situations in which objects are matched with or assigned to other objects: e.g. marriages, residents/hospitals, jobs/machines
  • A Matching in a bipartite graph GG is a set MEM \subseteq E of edges such that each node of VV is incident on at most one edge of MM
  • A set of edges MM is a perfect matching if every node in VV is incident on exactly one edge in MM
  • Instance: A bipartite graph GG
  • Solution: The matching of largest size in GG

Algorithm for Bipartite Graph Matching

  • Convert GG to a flow network GG': direct edges from XX to YY, add nodes ss and tt, connect ss to each node in XX and each node in YY to tt, set all edge capacities to 1
  • Compute the maximum flow in GG'
  • Convert the maximum flow in GG' into a matching in GG
  • Claim: the value of the maximum flow in GG' equals the size of the maximum matching in GG
  • In general, there is a matching of size kk in GG if and only if there is an (integer-valued) flow of value kk in GG'

Strategy for Proving Correctness

  • Preclude the possibility that GG' has a flow of value kk but we cannot construct a matching in GG with kk edges
  • Matching => flow: if there is a matching with kk edges in GG, there is an sts-t flow of value kk in GG'
  • How do we construct this flow?
    • Consider every edge (u,v)(u,v) in the matching: uX,vYu \in X, v \in Y
    • Send one unit of flow along the path suvts \rightarrow u \rightarrow v \rightarrow t
  • Why have we constructed a flow?
    • Capacity constraint: No edges receive a flow >1> 1 because we started with a matching
    • Conservation constraint: Every node other than ss and tt has one incoming unit and one outgoing unit of flow because we started with a matching
  • What is the value of the flow? kk, since exactly that many edges out of ss carry flow

Now let us consider the reverse side of the reduction

  • Flow => matching: if there is a flow ff' in GG' with value kk, there is a matching MM in GG with kk edges
    • There is an integer-valued flow ff' of value k    k \implies flow along any edge is 0 or 1
    • Let MM be the set of edges not incident on ss or tt with flow equal to 1
    • Claim: MM contains kk edges
    • Claim: Each node in XX (respectively, YY) is the tail (respectively, head) of at most one edge in MM
  • Conclusion: size of the maximum matching in GG is equal to the value of the maximum flow in GG': the edges in this matching are those that carry flow from XX to YY in GG'

In general, we must prove that the reduction is bi-directional, or isomorphic
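Because all capacities in GG' are 1, the augmenting-path search can be specialised so that GG' never needs to be built explicitly. A minimal sketch of this standard augmenting-path matching algorithm (not literally Ford-Fulkerson on GG', though it does the same work; `adj` maps each node of XX to its neighbors in YY):

```python
def max_bipartite_matching(adj):
    """adj: dict mapping each node in X to a list of its neighbors in Y."""
    match = {}  # y -> the x it is currently matched to

    def try_augment(x, seen):
        for y in adj[x]:
            if y in seen:
                continue
            seen.add(y)
            # y is free, or its current partner can be rematched elsewhere
            if y not in match or try_augment(match[y], seen):
                match[y] = x
                return True
        return False

    size = 0
    for x in adj:                      # one augmentation attempt per x
        if try_augment(x, set()):
            size += 1
    return size, {x: y for y, x in match.items()}
```

Each call to `try_augment` is one augmenting-path search; with nn nodes on each side and mm edges this gives the same O(mn)O(mn) bound as running Ford-Fulkerson on GG'.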

Running time of Bipartite Graph Matching Algorithm

  • Suppose GG has mm edges, and nn nodes in XX and nn nodes in YY
  • The value CC of the maximum flow satisfies CnC \leq n
  • The Ford-Fulkerson algorithm therefore runs in O(mC)=O(mn)O(mC) = O(mn) time
  • How do we determine if a bipartite graph GG has a perfect matching? Find the maximum matching and check if it is perfect
  • Suppose GG has no perfect matching. Can we exhibit a short "certificate" of that fact? What can such certificates look like?
  • GG has no perfect matching iff there is a cut in GG' with capacity less than nn. Therefore, the cut is a certificate
  • We would like the certificate in terms of GG
    • For example, two nodes in YY with one incident edge each, both incident on the same node in XX
    • Generally, a subset AYA \subseteq Y with neighbors Γ(A)X\Gamma (A) \subseteq X, such that Γ(A)<A|\Gamma(A)| < |A|
  • Hall's Theorem Let G(XY,E)G(X \cup Y, E) be a bipartite graph such that X=Y|X| = |Y|. Then GG either has a perfect matching or there is a subset AYA \subseteq Y such that A>Γ(A)|A| > | \Gamma (A) |. We can compute a perfect matching or such a subset in O(mn)O(mn) time.
    • Γ(x5)={y4,y5}\Gamma (x_5) = \{y_4, y_5\}, the neighbors of x5x_5

Edge-Disjoint Paths

  • A set of paths in a directed graph GG is edge disjoint if each edge in GG appears in at most one path
  • Instance: Directed graph G(V,E)G(V,E) with two distinguished nodes s,ts,t
  • Solution: The maximum number of edge-disjoint paths between ss and tt

Mapping to the Max-Flow Problem

  • Convert GG into a flow network with edge capacities of 1
  • Claim: There are kk edge-disjoint sts-t paths in a directed graph GG if and only if there is an sts-t flow in GG with value k\geq k
  • Paths => flow: if there are kk edge-disjoint paths from ss to tt, send one unit of flow along each to yield a flow with value kk
  • Flow => paths: Suppose there is an integer-valued flow of value at least kk. Are there kk edge-disjoint paths? If so, what are they?
    • There is an integral flow. Therefore flow on each edge is 0, or 1
    • Claim: if ff is a 0-1 valued flow of value v(f)=kv(f) = k, then the set of edges with flow f(e)=1f(e) = 1 contains a set of kk edge-disjoint paths
  • But how do we show that this set of edges with flow value of 1 actually form paths?
    • Claim: if ff is a 010-1 valued flow of value v(f)=kv(f) = k, then the set of edges with flow f(e)=1f(e) = 1 contains a set of kk edge-disjoint paths
  • Prove by induction on the number of edges in ff that carry flow. Let this number be κ(f)\kappa(f).
    • Base case: κ(f)=0\kappa(f) = 0, so v(f)=0v(f) = 0 and there is nothing to prove
    • Inductive Hypothesis: For every flow ff' in GG with
      • value v(f)<kv(f') < k carrying flow on κ(f)<κ(f)\kappa (f') < \kappa (f) edges or
      • value v(f)=kv(f') = k carrying flow on κ(f)<κ(f)\kappa (f') < \kappa (f) edges, the set of edges with f(e)=1f'(e) = 1 contains a set of v(f)v(f') edge-disjoint sts-t paths
    • Inductive Step: Construct a set of kk sts-t paths from ff. Work out by hand
  • Note: formulating the inductive hypothesis can be tricky
  • Strategy is to prove the inductive step first
  • During this proof, we observe two types of "smaller" flows:
    • When you succeed in finding an sts-t path, we get a new flow ff' that is smaller, i.e. v(f)<kv(f') < k carrying flow on fewer edges, i.e. κ(f)<κ(f)\kappa (f') < \kappa (f)
    • When we run into a cycle, we get a new flow ff' with v(f)=kv(f') = k but carrying flow on fewer edges, i.e. κ(f)<κ(f)\kappa (f') < \kappa (f)
  • We combine both situations in the inductive hypothesis
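The inductive argument above is effectively an algorithm: walk forward from ss along unused flow edges; revisiting a node means we closed a cycle (discard it, since the flow value is unchanged), and reaching tt yields one path. A sketch, assuming `flow_edges` is the set of edges with f(e)=1f(e) = 1 in a valid 0-1 flow (conservation guarantees the walk never gets stuck):

```python
from collections import defaultdict

def decompose_unit_flow(flow_edges, s, t):
    # out[u]: unused unit-flow edges leaving u
    out = defaultdict(list)
    for u, v in flow_edges:
        out[u].append(v)
    paths = []
    while out[s]:                      # one s-t path per unit leaving s
        path, pos, u = [s], {s: 0}, s  # pos: node -> index on current walk
        while u != t:
            v = out[u].pop()           # follow any unused outgoing unit
            if v in pos:
                # closed a cycle: discard it and resume the walk from v
                path = path[:pos[v] + 1]
                pos = {w: i for i, w in enumerate(path)}
            else:
                path.append(v)
                pos[v] = len(path) - 1
            u = v
        paths.append(path)
    return paths
```

Each unit of flow out of ss produces one sts-t path, and cycle edges are simply consumed and thrown away, mirroring the two "smaller flow" cases in the induction.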

Certificate for Edge-Disjoint Paths Algorithm

  • A set FEF \subseteq E of edges separates s,ts,t if the graph (V,EF)(V, E-F) contains no sts-t paths
  • Menger's Theorem: In every directed graph with nodes s,ts,t, the maximum number of edge-disjoint sts-t paths is equal to the minimum number of edges whose removal disconnects ss from tt.

Running Time of the Edge-Disjoint Paths Algorithm

  • Given a flow of value kk, how quickly can we determine the kk edge-disjoint paths? O(mn)O(mn)
  • Corollary: The Ford-Fulkerson algorithm can be used to find a maximum set of edge-disjoint sts-t paths in a directed graph GG in O(mn)O(mn) time.

Edge-Disjoint Paths in Undirected Graphs

  • Can extend the theorem to undirected graphs
  • Replace each edge with two directed edges of capacity 1 and apply the algorithm for directed graphs
  • Problem: Both counterparts of an undirected edge (u,v)(u,v) may be used by different edge-disjoint paths on the directed graph
    • resolved by zeroing out flow for edges where flow goes in both directions
  • Can obtain an integral flow where only one of the directed counterparts of (u,v)(u,v) has non-zero flow
  • We can find the maximum number of edge disjoint paths in O(mn)O(mn) time
  • We can prove a version of Menger's theorem for undirected graphs: in every undirected graph with nodes s,ts,t, the maximum number of edge-disjoint sts-t paths is equal to the minimum number of edges whose removal separates ss from tt

Image Segmentation

  • Fundamental problem in computer vision: segmenting an image into coherent regions
  • A basic segmentation problem is that of partitioning an image into a foreground and a background: label each pixel in the image as belonging to one of those two
  • Let VV be the set of pixels in an image
  • Let EE be the set of pairs of neighboring pixels
  • VV and EE yield an undirected graph G(V,E)G(V,E)
  • Each pixel ii has a likelihood ai>0a_i > 0 that it belongs to the foreground and a likelihood bib_i that it belongs to the background
  • These likelihoods are specified in the input to the problem
  • We want the foreground/background boundary to be smooth: for each pair (i,j)(i,j) of pixels, there is a separation penalty pij0p_{ij} \geq 0 for placing one of them in the foreground and the other in the background

Formulating the Problem

  • Instance: Pixel graphs G(V,E)G(V,E), likelihood functions a,b:VR+a,b: V \rightarrow \Reals^+, penalty function p:ER+p: E \rightarrow \Reals^+
  • Solution: Optimum labelling: partition of the pixels into two sets A,BA,B that maximizes
q(A,B)=iAai+jBbj(i,j)E:A{i,j}=1pij\begin{aligned} q(A,B) &= \displaystyle\sum_{i \in A} a_i + \displaystyle\sum_{j \in B} b_j - \displaystyle\sum_{ (i,j) \in E : |A \cap \{i,j\}|=1 } p_{ij} \end{aligned}
  • There is a similarity between cuts and labellings
  • Differences:
    • Maximizing an objective function rather than minimizing
    • There is no source or sink in the segmentation problem
    • We have values on the nodes
    • The graph is undirected

Maximization to Minimization

  • Let Q=i(ai+bi)Q = \sum_i (a_i + b_i)
  • Notice that iAai+jBbj=QiAbijBaj\sum_{i \in A} a_i + \sum_{j \in B} b_j = Q - \sum_{i \in A} b_i - \sum_{j \in B} a_j
  • Therefore, maximizing
q(A,B)=iAai+jBbj(i,j)E:A{i,j}=1pij=QiAbijBaj(i,j)E:A{i,j}=1pij\begin{aligned} q(A,B) &= \displaystyle\sum_{i \in A} a_i + \displaystyle\sum_{j \in B} b_j - \displaystyle\sum_{ (i,j) \in E : |A \cap \{i,j\}|=1 } p_{ij} \\ &= Q - \sum_{i \in A} b_i - \sum_{j \in B} a_j - \displaystyle\sum_{ (i,j) \in E : |A \cap \{i,j\}|=1 } p_{ij} \end{aligned}

is identical to minimizing

q(A,B)=iAbi+jBaj+(i,j)E:A{i,j}=1pij\begin{aligned} q'(A,B) &= \sum_{i \in A} b_i + \sum_{j \in B} a_j + \displaystyle\sum_{ (i,j) \in E : |A \cap \{i,j\}|=1 } p_{ij} \end{aligned}

Solving the Other Issues

  • Solve the other issues like we did earlier:
    • Add a new "super-source" ss to represent foreground
    • Add a new "super-sink" tt to represent the background
    • Connect s,ts,t to every pixel and assign capacity aia_i to edge (s,i)(s, i) and capacity bib_i to edge (i,t)(i,t)
    • Direct edges away from ss and into tt
    • Replace each edge (i,j)(i,j) in EE with two directed edges of capacity pijp_{ij}
  • Now we want the minimum capacity sts-t cut

Cuts in the Flow Network

  • Let GG' be this flow network and (A,B)(A,B) an sts-t cut
  • What does the capacity of the cut represent?
  • Edges crossing the cut can be one of three types:
    • (s,w),wB(s,w), w \in B contributes awa_w
    • (u,t),uA(u,t), u \in A contributes bub_u
    • (u,w),uA,wB(u,w), u \in A, w \in B contributes puwp_{uw}
c(A,B)=iAbi+jBaj+(i,j)E:A{i,j}=1pij=q(A,B)\begin{aligned} c(A,B) &= \displaystyle\sum_{i \in A} b_i + \displaystyle\sum_{j \in B} a_j + \displaystyle\sum_{ (i,j) \in E : |A \cap \{i,j\}|=1 } p_{ij} = q'(A,B) \end{aligned}
  • So, the capacity of an sts-t cut c(A,B)c(A,B) exactly measures the quantity q(A,B)q'(A,B)
  • To maximise q(A,B)q(A,B), we compute the sts-t cut (A,B)(A,B) of minimum capacity
  • Deleting s,ts,t from the cut yields the desired segmentation of the image
  • Thus, Image Segmentation is reduced to a Minimum Cut problem
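The construction above is mechanical enough to write out directly. A sketch (`build_segmentation_network` is an illustrative helper; it only produces the edge list, assuming the labels `'s'` and `'t'` are distinct from all pixel ids, and any max-flow/min-cut routine can then be run on it, with the ss-side of the cut minus ss giving the foreground):

```python
def build_segmentation_network(pixels, neighbors, a, b, p):
    """pixels: iterable of pixel ids; neighbors: list of (i, j) pairs;
    a[i], b[i]: fg/bg likelihoods; p[(i, j)]: separation penalty."""
    edges = []
    for i in pixels:
        edges.append(('s', i, a[i]))  # capacity a_i on edge (s, i)
        edges.append((i, 't', b[i]))  # capacity b_i on edge (i, t)
    for i, j in neighbors:
        # each undirected neighbor pair becomes two directed edges
        edges.append((i, j, p[(i, j)]))
        edges.append((j, i, p[(i, j)]))
    return edges
```

A cut (A,B)(A,B) in this network then pays awa_w for each background pixel, bub_u for each foreground pixel, and puwp_{uw} per separated neighbor pair, exactly as tallied above.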

12 - NP and Computational Intractability

  • N=1,P=0N = 1, P = 0

Algorithm Design

  • Patterns:
    • Greedy O(nlogn)O(n \log n) interval scheduling
    • Divide and Conquer O(nlogn)O(n \log n) counting inversions
    • Dynamic Programming O(n3)O(n^3) RNA folding
    • Duality: O(n2m)O(n^2 m) maximum flow and minimum cuts
    • Reductions
    • Local search
    • Randomization
  • Anti-Patterns
    • NP-completeness: O(nk)O(n^k) algorithm unlikely
    • PSPACE-completeness: O(nk)O(n^k) certification algorithm unlikely
    • Undecidability: no algorithm possible

Computational Tractability

  • When is an algorithm an efficient solution to a problem? When the running time is polynomial in the size of the input
  • A problem is computationally tractable if it has a polynomial time algorithm solution
| Polynomial time | Probably not |
| --- | --- |
| Shortest path | Longest path |
| Matching | 3D matching |
| Min cut | Max cut |
| 2-SAT | 3-SAT |
| Planar four-color | Planar 3-color |
| Bipartite vertex cover | Vertex cover |
| Primality testing | Factoring |

Problem Classification

  • Classify problems based on whether they admit efficient solutions or not
  • Some extremely hard problems cannot be solved efficiently (e.g. chess on an n×nn \times n board)
  • However, classification is unclear for a large number of problems
  • We can prove that these problems are fundamentally equivalent and are manifestations of the same problem!

Polynomial-Time Reduction

  • Goal is to express statements of the type: "Problem X is at least as hard as problem Y"
  • Use the notion of reductions
  • Y is polynomial-time reducible to XX (YPX)(Y \leq_P X) if any arbitrary instance (input) of YY can be solved using a polynomial number of standard operations, plus one call to a black box that solves problem XX
    • Maximum Bipartite Matching P\leq_P Maximum sts-t Flow
      • Maximum sts-t Flow P\nleq_P Maximum Bipartite Matching, because you can't reduce an arbitrary network flow instance to a bipartite matching
      • You can get 4 from 3 by adding 1, but you can't get 3 from 4 by adding 1
    • Image Segmentation P\leq_P Minimum sts-t Cut
    • P=\leq_P = polynomial time reducible
  • YPXY \leq_P X implies that "XX is at least as hard as YY"
  • Such reductions are Karp reductions. Cook reductions allow a polynomial number of calls to the black box that solves XX

Usefulness of Reductions

  • Claim: If YPXY \leq_P X and XX can be solved in polynomial time, then YY can be solved in polynomial time
  • Contrapositive: If YPXY \leq_P X and YY cannot be solved in polynomial time, then XX cannot be solved in polynomial time
  • Informally: If YY is hard, and we can show that YY reduces to XX, then the hardness "spreads" to XX

Reduction Strategies

  • Simple equivalence

  • Special case to general case

  • Encoding with gadgets

Optimisation versus Decision Problems

  • So far, we have developed algorithms that solve optimisation problems
    • Compute the largest flow
    • Find the closest pair of points
    • Find the schedule with the least completion time
  • Now, we turn our attention to decision versions of problems, e.g. is there a flow with value at least kk, for a given value of kk
  • Decision problem: answer to every input is yes or no

Primes

  • Instance: A natural number nn
  • Question: Is nn prime?

Independent Set and Vertex Cover

  • Given an undirected graph G=(V,E)G = (V,E), a subset SVS \subseteq V is an independent set if no two vertices in SS are connected by an edge
    • Instance: Undirected graph GG and an integer kk
    • Question: Does GG contain an independent set of size k\geq k?
  • Given an undirected graph G=(V,E)G = (V,E), a subset SVS \subseteq V is a vertex cover if every edge in EE is incident on at least one vertex in SS
    • Instance: Undirected graph GG and an integer kk
    • Question: Does GG contain a vertex cover of size k\leq k?
  • Goal: Demonstrate a simple equivalence between these two problems
  • Claim: Independent Set P\leq_P Vertex Cover and Vertex Cover P\leq_P Independent Set
  • N.B. the complement of an independent set must be a vertex cover

Strategy for Proving Independent Set P\leq_P Vertex Cover

  • Start with an arbitrary instance of IS: an undirected graph G=(V,E)G=(V,E) and an integer kk
  • From G=(V,E)G=(V,E) and kk, create an instance of VC: an undirected graph G=(V,E)G'=(V', E') and an integer kk'
  • GG' is related to GG in some way
  • kk' can depend upon kk and size of GG
  • Prove that G=(V,E)G=(V,E) has an IS of size k\geq k iff GG' has a vertex cover of size k\leq k'
  • Transformation and proof must be correct for all possible graphs GG and all possible values of kk
    • This proof is an iff because we're using a blackbox algorithm for VC to solve IS
      • If there is an independent set size k\geq k, we must be sure that there is a vertex cover of size k\leq k', so that we know that the black box will find this vertex cover
      • If the black box finds a vertex cover of size k\leq k', we must be sure we can construct an independent set of size k\geq k from this vertex cover
  • Arbitrary instance of IS: G,kG, k
  • Let V=n|V| = n
  • Create an instance of VC: G,k=nkG, k' = n - k
  • Claim: GG has an IS of size k\geq k iff GG has a VC of size nk\leq n - k
  • Proof: SS is an IS in GG iff VSV-S is a VC in G
  • The same idea proves that VC P\leq_P IS
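The complement relationship at the heart of the proof is easy to sanity-check exhaustively on a small graph. A sketch (the 4-node graph in the test is just an arbitrary example):

```python
from itertools import combinations

def is_independent_set(E, S):
    # no edge has both endpoints inside S
    return all(not (u in S and v in S) for u, v in E)

def is_vertex_cover(E, C):
    # every edge has at least one endpoint inside C
    return all(u in C or v in C for u, v in E)

def complement_claim_holds(V, E):
    # S is an independent set  iff  V - S is a vertex cover, for every S
    return all(
        is_independent_set(E, set(S)) == is_vertex_cover(E, set(V) - set(S))
        for r in range(len(V) + 1)
        for S in combinations(V, r))
```

Checking all 2n2^n subsets is of course exponential; the point is only to confirm the equivalence the reduction relies on, not to solve either problem.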

Vertex Cover and Set Cover

  • IS is a "packing" problem: pack as many vertices as possible, subject to constraints (the edges)
  • VC is a "covering" problem: cover all edges in the graph with as few vertices as possible

Microbe Cover

  • Instance: A set UU of nn compounds, a collection M1,M2,...,MlM_1, M_2, ..., M_l of microbes, where each microbe can make a subset of compounds in UU, and an integer kk
  • Question: Is there a subset of k\leq k microbes that can together make all the compounds in UU?

Vertex Cover P\leq_P Microbe Cover

  • Input to VC: an undirected graph G=(V,E)G=(V,E) and an integer kk
  • Let V=l|V| = l
  • Create an instance {U,{M1,M2,...,Ml}}\{U, \{ M_1, M_2, ..., M_l\} \} of Microbe cover where:
    • U=EU = E, i.e. each element of UU is an edge of GG
    • For each node iVi \in V, create a microbe MiM_i whose compounds are the set of edges incident on ii
  • Claim: UU can be covered with k\leq k microbes iff GG has a VC cover with k\leq k nodes
  • Proof strategy:
    • If GG has a VC of size k\leq k, then UU can be covered with k\leq k microbes
    • If UU can be covered with k\leq k microbes, then GG has a vertex cover of size k\leq k
  • This is a different form of the Set Cover problem:
    • Instance: A set UU of nn elements, a collection S1,S2,...,SmS_1, S_2, ..., S_m of subsets of UU, and an integer kk
    • Question: Is there a collection of k\leq k sets in the collections whose union is UU?

Boolean Satisfiability

  • Abstract problems formulated in Boolean notation
  • Given a set X={x1,x2,...,xn}X = \{ x_1, x_2, ..., x_n\} of nn Boolean variables
  • Each variable can take the value 0 or 1
  • Term: a variable xix_i or its negation xiˉ\bar{x_i}
  • Clause of length ll: an "or" of ll distinct terms t1t2...tlt_1 \lor t_2 \lor ... \lor t_l
  • Truth assignment for XX: is a function v:X{0,1}v: X \rightarrow \{ 0,1 \}
  • An assignment vv satisfies a clause CC if it causes at least one term in CC to evaluate to 1 (since CC is an or of terms)
  • An assignment satisfies a collection of clauses C1,C2,...,CkC_1, C_2, ..., C_k if it causes all clauses to evaluate to 1 i.e. C1C2...Ck=1C_1 \land C_2 \land ... C_k = 1
    • vv is a satisfying assignment with respect to C1,C2,...,CkC_1, C_2, ..., C_k
    • set of clauses C1,C2,...,CkC_1, C_2, ..., C_k is satisfiable

Examples

  • X={x1,x2,x3,x4}X = \{ x_1, x_2, x_3, x_4\}
  • Terms: x1,x1ˉ,x2,x2ˉ,x3,x3ˉ,x4,x4ˉ,x_1, \bar{x_1}, x_2, \bar{x_2}, x_3, \bar{x_3}, x_4, \bar{x_4},
  • Clauses:
    • x1x2ˉx3ˉx_1 \lor \bar{x_2} \lor \bar{x_3}
    • x2x3ˉx4x_2 \lor \bar{x_3} \lor x_4
    • x3x4ˉx_3 \lor \bar{x_4}
  • Assignment: x1=1,x2=0,x3=1,x4=0x_1 = 1, x_2 = 0, x_3 = 1, x_4 = 0
    • Not satisfying, because clause 2 evaluates to 0
  • Assignment: x1=1,x2=1,x3=1,x4=0x_1 = 1, x_2 = 1, x_3 = 1, x_4 = 0
    • is satisfying

SAT and 3-SAT

  • Instance: A set of clauses C1,C2,...,CkC_1, C_2, ..., C_k, each of length 3, over a set X={x1,x2,...,xn}X = \{ x_1, x_2, ..., x_n\} of nn variables
  • Question: Is there a satisfying truth assignment for XX with respect to C1,C2,...,CkC_1, C_2, ..., C_k?
  • SAT and 3-SAT are fundamental combinatorial search problems
  • We have to make nn independent decisions (the assignments for each variable) while satisfying the set of constraints
  • Satisfying each constraint in isolation is easy, but we have to make our decisions so that all constraints are satisfied simultaneously

Example

Clauses:

  • C1=x100C_1 = x_1 \lor 0 \lor 0
  • C2=x200C_2 = x_2 \lor 0 \lor 0
  • C3=x1ˉx2ˉ0C_3 = \bar{x_1} \lor \bar{x_2} \lor 0
  • Is C1C2C_1 \land C_2 satisfiable? Yes by x1=1,x2=1x_1 =1, x_2 = 1
  • Is C1C3C_1 \land C_3 satisfiable? Yes by x1=1,x2=0x_1 =1, x_2 = 0
  • Is C2C3C_2 \land C_3 satisfiable? Yes by x1=0,x2=1x_1 =0, x_2 = 1
  • Is C1C2C3C_1 \land C_2 \land C_3 satisfiable? No
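These answers are easy to verify by brute force over all assignments. A sketch, encoding each clause as a list of (variable index, required value) literals, with C3C_3 read as x1ˉx2ˉ\bar{x_1} \lor \bar{x_2} (the reading the pairwise answers above require):

```python
from itertools import product

def satisfiable(clauses, n):
    # try all 2^n assignments; a clause is a list of (var, wanted) literals
    return any(
        all(any(a[i] == w for i, w in clause) for clause in clauses)
        for a in product([0, 1], repeat=n))

C1 = [(0, 1)]          # x1
C2 = [(1, 1)]          # x2
C3 = [(0, 0), (1, 0)]  # not-x1 or not-x2
```

Each pair of clauses is satisfiable, but the conjunction of all three is not, which is what makes the example interesting.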

3-SAT P\leq_P Independent set

  • C1=x1x2ˉx3ˉC_1 = x_1 \lor \bar{x_2} \lor \bar{x_3}
  • C2=x1ˉx2x4C_2 = \bar{x_1} \lor x_2 \lor x_4
  • C3=x1ˉx3x4ˉC_3 = \bar{x_1} \lor x_3 \lor \bar{x_4}
  • Two ways to think about 3-SAT:
    • Make an independent 0/1 decision on each variable and succeed if we achieve one of the three ways in which to satisfy each clause
    • Choose (at least) one term from each clause. Find a truth assignment that causes each chosen term to evaluate to 1. Ensure that no two selected terms conflict, e.g. selecting x2ˉ\bar{x_2} in C1C_1 and x2x_2 in C2C_2
  • We are given an instance of 3-SAT with kk clauses of length three over nn variables

  • Construct an instance of Independent Set: graph G=(V,E)G=(V,E) with 3k3k nodes
    • For each clause Ci,1ikC_i, 1 \leq i \leq k, add a triangle of three nodes vi1,vi2,vi3v_{i1}, v_{i2}, v_{i3} and three edges to GG
    • Label each node vij,1j3v_{ij}, 1 \leq j \leq 3 with the jjth term in CiC_i
    • Add an edge between each pair of nodes whose labels correspond to terms that conflict
    • N.B. the size of the largest IS is at most kk, since at most one node per triangle can be chosen
    • x4x_4 can be assigned either 0 or 1 to satisfy the clauses.
  • Claim: 3-SAT instance is satisfiable iff GG has an independent set of size kk
  • Satisfiable assignment \rightarrow Independent Set of size kk: Each triangle in GG has at least one node whose label evaluates to 1. The set SS of nodes consisting of one such node from each triangle forms an independent set of size kk. Why?
  • Independent Set of size kk \rightarrow Satisfiable assignment: the size of this set is kk. How do we construct a satisfying truth assignment from the nodes in the independent set?
  • For each variable xix_i, at most one of xix_i and xiˉ\bar{x_i} is the label of a node in SS
  • If xix_i is the label of a node in SS, set xi=1x_i = 1; else set xi=0x_i = 0
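The gadget construction can be written out directly. A sketch (literals are (variable, sign) pairs; the brute-force independent-set check is only for illustration on tiny gadget graphs):

```python
from itertools import combinations

def sat_to_is(clauses):
    # one node (i, j) per literal occurrence: clause i, position j;
    # edges inside each triangle and between conflicting literals
    nodes = [(i, j) for i, cl in enumerate(clauses) for j in range(len(cl))]

    def conflict(u, v):
        (xu, su), (xv, sv) = clauses[u[0]][u[1]], clauses[v[0]][v[1]]
        return xu == xv and su != sv  # same variable, opposite signs

    edges = {(u, v) for u, v in combinations(nodes, 2)
             if u[0] == v[0] or conflict(u, v)}
    return nodes, edges

def has_independent_set(nodes, edges, k):
    # brute force: only sensible for tiny gadget graphs
    return any(
        all((u, v) not in edges and (v, u) not in edges
            for u, v in combinations(S, 2))
        for S in combinations(nodes, k))
```

On the three clauses above (which are satisfiable, e.g. by setting every variable to 0), the gadget graph indeed has an independent set of size 3, one node per triangle.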

Transitivity of Reductions

  • Claim: If ZPYZ \leq_P Y and YPXY \leq_P X, then ZPXZ \leq_P X
    • Note: does ZPYZ \leq_P Y imply YPZY \leq_P Z? False
  • We have shown that 3-SAT P\leq_P Independent Set P\leq_P Vertex Cover P\leq_P Set Cover

Finding vs. Certifying

  • Is it easy to check if a given set of vertices in an undirected graph forms an independent set of size at least kk?
  • Is it easy to check if a particular truth assignment satisfies a set of clauses?
  • We draw a contrast between finding a solution and checking a solution (in polynomial time)
  • Since we have not been able to develop efficient algorithms to solve many decision problems, let us turn our attention to whether we can check if a proposed solution is correct

Problems and Algorithms

Primes

  • Instance: A natural number nn
  • Question: Is nn prime?
  • Decision problem XX: for every input ss answer X(s)X(s) is yes or no
  • An algorithm AA for a decision problem receives an input ss and returns A(s){yes,no}A(s) \in \{ yes, no \}
  • An algorithm AA solves the problem XX if for every input ss
    • if X(s)=yesX(s) = yes then A(s)=yesA(s) = yes and
    • if X(s)=noX(s) = no then A(s)=noA(s) = no
  • AA has a polynomial running time if there is a polynomial function p()p(\bullet) such that for every input ss, AA terminates on ss in at most O(p(s))O(p(|s|)) steps
    • There is an algorithm such that p(s)=s12p(|s|) = |s|^{12} for the Primes problem, improved to s6|s|^6
  • P\mathcal P: is a set of problems XX for which there is a polynomial time algorithm
  • A decision problem XX is in P\mathcal P iff there is an algorithm AA with polynomial running time that solves XX

Efficient Certification

  • A "checking" algorithm for a decision problem XX has a different structure from an algorithm that solves XX
  • A checking algorithm needs the input ss as well as a separate "certificate" tt that contains evidence that X(s)=yesX(s) = yes
  • An algorithm BB is an efficient certifier for a problem XX if
    • BB is a polynomial time algorithm that takes two inputs s,ts,t and
    • for all inputs ss:
      • X(s)=yesX(s) = yes iff there is a certificate tt such that B(s,t)=yesB(s,t) = yes, and
      • the size of tt is polynomial in the size of ss
  • Certifier's job is to take a candidate certificate tt that sXs \in X and check in polynomial time whether tt is a correct certificate
  • Certificate tt must be "short" so that certifier can run in polynomial time
  • Certifier does not care about how to find these certificates

NP\mathcal {NP}

  • P\mathcal P: Set of problems XX for which there is a polynomial time algorithm
  • NP\mathcal NP: Set of all problems for which there exists an efficient certifier
  • 3-SAT NP\in \mathcal{NP}
    • Certificate tt: a truth assignment to the variables
    • Certifier BB: checks whether the assignment causes every clause to evaluate to true
  • Independent set NP\in \mathcal{NP}
    • Certificate tt: a set of at least kk vertices
    • Certifier BB: checks that no pair of vertices are connected by an edge
  • Set Cover NP\in \mathcal{NP}:
    • Certificate tt: a list of kk sets from the collection
    • Certifier BB: checks if the union of these sets is UU

P\mathcal {P} vs. NP\mathcal {NP}

  • Claim: PNP\mathcal{P} \subseteq \mathcal{NP}
    • Let XX be any problem in P\mathcal{P}
    • There is a polynomial time algorithm AA that solves XX
    • BB ignores tt and simply returns A(s)A(s). Why is BB an efficient certifier?
      • BB just has to check the certificate, not compute it
  • Is P=NP\mathcal{P} = \mathcal{NP} or is NPP\mathcal{NP} - \mathcal{P} \neq \empty? Major unsolved problem in CS

Summary

  • PNP\mathcal{P} \subseteq \mathcal{NP}
  • 3-SAT, Vertex Cover, Set Cover, Independent Set are in NP\mathcal{NP}
  • 3-SAT P\leq_P Independent Set P\leq_P Vertex Cover P\leq_P Set Cover
  • What is the structure of the problems in NP\mathcal{NP}
    • Is there a sequence of problems X1,X2,X3,...X_1, X_2, X_3, ... in NP\mathcal{NP}, such that X1PX2PX3P...X_1 \leq_P X_2 \leq_P X_3 \leq_P ...
    • Are there two problems X1X_1 and X2X_2 in NP\mathcal{NP} such that there is no problem XNPX \in \mathcal{NP} to which both reduce?

NP\mathcal {NP}-Complete and NP\mathcal {NP}-Hard problems

  • A problem XX is NP\mathcal {NP}-Complete if

    • XNPX \in \mathcal{NP} and
    • for every problem YNP,YPXY \in \mathcal NP, Y \leq_P X
  • A problem XX is NP\mathcal {NP}-Hard if

    • for every problem YNP,YPXY \in \mathcal NP, Y \leq_P X
  • Claim: Suppose XX is NP\mathcal {NP}-Complete, then XPX \in \mathcal P iff P=NP\mathcal P = \mathcal{NP}

  • Corollary: If there is any problem in NP\mathcal NP that cannot be solved in polynomial time, then no NP\mathcal NP-Complete problem can be solved in polynomial time

  • Does even one NP\mathcal NP-Complete problem exist?! If it does, how can we prove that every problem in NP\mathcal NP reduces to this problem

Circuit Satisfiability

  • Cook-Levin Theorem: Circuit Satisfiability is NP\mathcal NP-Complete
  • A circuit KK is a labelled, directed acyclic graph such that
    • the sources in KK are labelled with constants (0 or 1) or the name of a distinct variable (the inputs of the circuit)
    • every other node is labeled with one Boolean operator ,,¬\land, \lor, \neg
    • a single node with no outgoing edges represents the output of KK
  • Instance: A circuit KK
  • Question: Is there a truth assignment to the inputs that causes the output to have value 1?

13 - NP-Complete Problems

Proving Other Problems NP\mathcal NP-Complete

  • Claim: If YY is NP\mathcal NP-Complete and XNPX \in \mathcal NP such that YPXY \leq_P X, then XX is NP\mathcal NP-Complete
  • Recall that XX is NP\mathcal NP-Complete if
    • XX is in NP\mathcal NP and
    • for every problem ZZ in NP\mathcal NP, ZPXZ \leq_P X
  • Given a new problem XX, a general strategy for proving it NP\mathcal NP-Complete is
    • Prove that XNPX \in \mathcal NP
    • Select a problem YY known to be NP\mathcal NP-Complete
    • Prove that YPXY \leq_P X

3-SAT is NP\mathcal NP-Complete

  • Why is 3-SAT in NP\mathcal NP
  • Circuit Satisfiability P\leq_P 3-SAT
    • Given an input to Circuit Satisfiability, create an input to SAT in which each clause has at most 3 variables
    • Convert this input to SAT into an input to 3-SAT

More NP\mathcal NP-Complete Problems

  • Circuit Satisfiability is NP\mathcal NP-Complete
  • We just showed Circuit Satisfiability P\leq_P 3-SAT
  • 3-SAT P\leq_P IS P\leq_P VC P\leq_P Set Cover
  • All of these problems are in NP\mathcal NP
  • Therefore, all of these problems are NP\mathcal NP-Complete, and they are all reducible to each other

Hamiltonian Cycle

  • Problems we have seen so far involve searching over subsets of a collection of objects
  • Another type of computationally hard problem involves searching over the set of all permutations of a collection of objects
  • In a directed graph G=(V,E)G=(V,E), a cycle CC is a Hamiltonian Cycle if CC visits each vertex exactly once
  • Instance: A directed graph GG
  • Question: Does GG contain a Hamiltonian Cycle?

Hamiltonian Cycle is NP\mathcal NP-Complete

  • Claim: 3-SAT P\leq_P Hamiltonian Cycle
  • Proof coming up

Travelling Salesman Problem

  • A salesman must visit nn cities v1,v2,...,vnv_1, v_2, ..., v_n starting at home city v1v_1
  • The salesman must find a tour, an order in which to visit each city exactly once and return home
  • Goal is to find as short a tour as possible
  • For every pair vi,vjv_i, v_j d(vi,vj)>0d(v_i, v_j) > 0 is the distance between them
  • a tour is a permutation of v1,v2,...,vnv_1, v_2, ..., v_n
  • The length of a tour is j=1n1d(vij,vij+1)+d(vin,vi1)\sum_{j=1}^{n-1} d(v_{i_j}, v_{i_{j+1}}) + d(v_{i_n}, v_{i_1})
  • Instance: A set VV of nn cities, a function d:V×VR+d : V \times V \rightarrow \Reals^+, and a number D>0D > 0
  • Question: Is there a tour of length at most DD?

Travelling Salesman is NP\mathcal NP-Complete

  • Why is the problem in NP\mathcal NP
  • Why is the problem NP\mathcal NP-Complete
  • Claim: Hamiltonian Cycle P\leq_P Travelling Salesman
| HC | TSP |
| --- | --- |
| Directed Graph G=(V,E)G=(V,E) | Cities |
| Edges have identical weights | Distances between cities can vary |
| Not all pairs of nodes are connected in GG | Every pair of cities has a distance |
| (u,v)(u,v) and (v,u)(v,u) may both be edges | |

14 - Coping with NP-Completeness

  • These problems come up in real life
  • NP\mathcal{NP}-Complete means that a problem is hard to solve in the worst case. Can we come up with better solutions at least in some cases?
    • Develop algorithms that are exponential in one parameter in the problem
    • Consider special cases of the input e.g. graphs that "look like" trees
    • Develop algorithms that can provably compute a solution close to the optimal

Vertex Cover

  • Instance: Undirected graph GG and an integer kk
  • Question: Does GG contain a vertex cover of size at most kk?
  • Problem has two parameters: k,nk,n
  • What is the running time of a brute force algorithm: O(kn(nk))=O(knk+1)O(kn {n \choose k}) = O(kn^{k+1})
  • Can we devise an algorithm whose running time is exponential in kk but polynomial in nn e.g. O(2kn)O(2^kn)
    • Fixed parameter tractable

Designing the Vertex Cover Algorithm

  • Intuition: if a graph has a small Vertex Cover, it cannot have too many edges
  • Claim: If GG has nn nodes and GG has a vertex cover of size at most kk, then GG has at most knkn edges
  • Easy part of algorithm: Return no if GG has more than knkn edges
  • G{u}G - \{ u \} is the graph GG without the node uu and the edges incident on uu
  • Consider an edge (u,v)(u,v). Either uu or vv must be in the vertex cover
    • Claim: GG has a vertex cover of size at most kk iff for any edge (u,v)(u,v) either G{u}G - \{ u \} or G{v}G - \{ v \} has a vertex cover of size at most k1k-1

Analysing the Vertex Cover Algorithm

  • Develop a recurrence relation for the algorithm with parameters
  • Let T(n,k)T(n,k) denote the worst-case running time of the algorithm on an instance of Vertex Cover with parameters n,kn,k
  • T(n,1)cnT(n, 1) \leq cn
  • T(n,k)2T(n,k1)+cknT(n, k) \leq 2T(n, k-1) + ckn
    • We need O(kn)O(kn) time to count the number of edges
  • Claim: T(n,k)=O(2kkn)T(n,k) = O(2^kkn)

Solving NP\mathcal{NP}-Hard Problems on Trees

  • $\mathcal{NP}$-Hard: at least as hard as $\mathcal{NP}$-Complete. We use $\mathcal{NP}$-Hard to refer to the optimisation versions of decision problems
  • Many $\mathcal{NP}$-Hard problems can be solved efficiently on trees
  • Intuition: the subtree rooted at any node $v$ of the tree interacts with the rest of the tree only through $v$. Therefore, depending on whether we include $v$ in the solution or not, we can decouple solving the problem in $v$'s subtree from the rest of the tree

Approximation Algorithms

  • Methods for optimisation version of NP\mathcal{NP}-Complete Problems
  • Run in polynomial time
  • Solution returned is guaranteed to be within a small factor of the optimal solution

Approximation Algorithm for Vertex Cover

Initially, C,E,While G has at least one edge Let (u,v) be any edge in GAdd u,v to CGG{u,v}Add (u,v) to EReturn C\boxed{ \begin{aligned} &\text{Initially, } C, E' \leftarrow \empty, \empty \\ &\text{While } G \text{ has at least one edge } \\ &\quad\text{Let } (u,v) \text{ be any edge in } G \\ &\quad\text{Add } u,v \text{ to } C \\ &\quad G \leftarrow G - \{u,v\} \\ &\quad\text{Add } (u,v) \text{ to } E' \\ &\text{Return } C\\ \end{aligned} }
  • Running time is linear on the size of the graph
  • Claim: CC is a vertex cover
  • Claim: No two edges in EE' can be covered by the same node
  • Claim: The size cc^* of the smallest vertex cover is at least E|E'|
  • Claim: C=2E2c|C| = 2|E'| \leq 2c^*
  • No approximation algorithm with a factor better than 1.3606 is possible unless P=NP\mathcal{P=NP} (Dinur, Safra 2006)
  • No approximation algorithm with a factor better than 2 is possible if the "unique games conjecture" is true (Khot, Regev 2008)
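
The boxed algorithm above transcribes almost line for line (a sketch; the edge-list representation is my choice):

```python
# Sketch of the 2-approximation above: repeatedly take any remaining edge,
# add both endpoints to C, and delete every edge touching either endpoint.
def approx_vertex_cover(edges):
    cover = set()
    remaining = list(edges)
    while remaining:                  # While G has at least one edge
        u, v = remaining[0]           # Let (u, v) be any edge in G
        cover.update((u, v))          # Add u, v to C
        remaining = [e for e in remaining
                     if u not in e and v not in e]   # G <- G - {u, v}
    return cover
```

The picked edges form the matching $E'$, so the returned cover has size exactly $2|E'| \leq 2c^*$.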

Load Balancing Problem

  • Given a set of mm machines M1,M2,...,MmM_1, M_2, ...,M_m
  • Given a set of nn jobs: job jj has processing time tjt_j
  • Assign each job to one machine so that the time spent is minimised
  • Let A(i)A(i) be the set of jobs assigned to machine MiM_i
  • Total time spent on machine ii is Ti=kA(i)tkT_i = \sum_{k \in A(i)} t_k
  • Minimise makespan T=maxiTiT = \max_i T_i, the largest load on any machine
    • This is NP\mathcal{NP}-Complete
*Greedy-Balance algorithm (figure not reproduced)*
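
The Greedy-Balance rule referred to above assigns each job, in the order given, to the machine with the smallest current load; a minimal sketch (the heap is my implementation choice, not from the notes):

```python
import heapq

def greedy_balance(times, m):
    """Assign each job, in the given order, to the currently least-loaded
    of m machines. Returns (makespan, per-machine job index lists)."""
    heap = [(0, i) for i in range(m)]      # (load, machine index)
    assignment = [[] for _ in range(m)]
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)      # machine with smallest load
        assignment[i].append(j)
        heapq.heappush(heap, (load + t, i))
    return max(load for load, _ in heap), assignment
```

On jobs `[2, 3, 4, 6]` with $m = 2$ this yields makespan 9 while the optimum is 8, illustrating the gap analysed below.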

Lower Bounds on the Optimal Makespan

  • Need a lower bound on the optimum makespan TT^*
  • The two bounds below suffice:
T1mjtjmaxjtj\begin{aligned} T^* &\geq \frac{1}{m} \displaystyle \sum_j t_j \\ &\geq \max \limits_j t_j \end{aligned}
  • Claim: the computed makespan $T \leq 2T^*$
  • Let MiM_i be the machine whose load is TT and jj be the last job placed on MiM_i
  • What was the situation just before placing this job?
    • MiM_i had the smallest load and its load was TtjT - t_j
    • For every machine MkM_k, load TkTtjT_k \geq T - t_j
kTkm(Ttj) k ranges over all machines jtjm(Ttj) j ranges over all jobs Ttj1/mjtjTT2T, since tjT\begin{aligned} \displaystyle \sum_k T_k &\geq m(T - t_j) \text{ k ranges over all machines }\\ \displaystyle \sum_j t_j &\geq m(T - t_j) \text{ j ranges over all jobs } \\ T - t_j &\leq 1/m \displaystyle \sum_j t_j \leq T^* \\ T &\leq 2T^*, \text{ since } t_j \le T^* \end{aligned}

Improving the Bound

  • It is easy to construct an example for which the greedy algorithm produces a solution close to a factor of 2 away from the optimal
  • How can we improve?
  • What if we process the jobs in decreasing order of processing time?
*Sorted-Balance algorithm (figure not reproduced)*
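
Sorted-Balance is the same greedy rule run after sorting the jobs in decreasing order of processing time; a self-contained sketch (heap-based, my implementation choice):

```python
import heapq

def sorted_balance(times, m):
    """Greedy assignment to the least-loaded machine, after sorting jobs
    in decreasing order of processing time. Returns the makespan."""
    heap = [(0, i) for i in range(m)]          # (load, machine index)
    for t in sorted(times, reverse=True):
        load, i = heapq.heappop(heap)          # least-loaded machine
        heapq.heappush(heap, (load + t, i))
    return max(load for load, _ in heap)
```

On jobs `[2, 3, 4, 6]` with $m = 2$ this achieves makespan 8, the optimum, where the unsorted greedy rule gives 9.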

Analyzing Sorted-Balance

  • Claim: if there are fewer than mm jobs, algorithm is optimal
  • Claim: if there are more than mm jobs, then T2tm+1T^* \geq 2t_{m+1}
    • Consider only the first m+1m+1 jobs in sorted order
    • Consider any assignment of these m+1m+1 jobs to machines
    • Some machine must be assigned two jobs, each with processing time $\geq t_{m+1}$
    • This machine will have a load at least 2tm+12t_{m+1}
  • Claim: T3T/2T \leq 3T^*/2
  • Let MiM_i be the machine whose load is TT and jj be the last job placed on MiM_i (MiM_i has at least two jobs)
tjT/2 since jm+1TtjT from Greedy-Balance proofT3T/2\begin{aligned} t_j &\leq T^*/2 \text{ since } j \geq m +1\\ T - t_j &\leq T^* \text{ from Greedy-Balance proof}\\ T &\leq 3T^*/2 \end{aligned}
  • There is an even better bound T4T/3T \leq 4T^* / 3
  • polynomial-time approximation scheme: for every $\epsilon > 0$, compute a solution with makespan $T \leq (1 + \epsilon)T^*$ in time $O((n/\epsilon)^{1/\epsilon^2})$

Partition Problem

  • Instance: a set of nn natural numbers w1,w2,...,wnw_1, w_2, ..., w_n
  • Solution: a subset SS of numbers such that iSwi=iSwi\sum_{i \in S} w_i = \sum_{i \notin S} w_i

Subset Sum Problem

  • Instance: a set of nn natural numbers w1,w2,...,wnw_1, w_2, ..., w_n and a target WW
  • Solution: a subset SS of numbers such that iSwi\sum_{i \in S} w_i is maximized subject to the constraint iSwiW\sum_{i \in S} w_i \leq W
  • OPT(i,w)OPT(i, w) is the largest sum possible using only the first ii numbers with target ww
OPT(i,w)=OPT(i1,w),i>0,wi>wOPT(i,w)=max(OPT(i1,w),wi+OPT(i1,wwi)),i>0,wiwOPT(0,w)=0\begin{aligned} OPT(i, w) &= OPT(i-1, w), \quad i > 0, w_i > w \\ OPT(i, w) &= \max(OPT(i-1, w), w_i + OPT(i-1, w - w_i)), \quad i > 0, w_i \leq w\\ OPT(0, w) &= 0 \end{aligned}
  • Running time is $O(nW)$
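
The recurrence above translates directly into an $O(nW)$ table-filling algorithm (a sketch; names are mine):

```python
def subset_sum(weights, W):
    """OPT(i, w): largest sum achievable using the first i numbers with
    capacity w, filled exactly as in the recurrence above."""
    n = len(weights)
    OPT = [[0] * (W + 1) for _ in range(n + 1)]   # OPT(0, w) = 0
    for i in range(1, n + 1):
        wi = weights[i - 1]
        for w in range(W + 1):
            OPT[i][w] = OPT[i - 1][w]             # skip item i
            if wi <= w:                            # or take it
                OPT[i][w] = max(OPT[i][w], wi + OPT[i - 1][w - wi])
    return OPT[n][W]
```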

  • All of these are NP\mathcal{NP}-Complete: 3D-Matching P\leq_P Partition P\leq_P Subset Sum P\leq_P Knapsack

Knapsack Problem

  • Instance: a set of nn elements, with each element ii having a weight wiw_i and a value viv_i, and a knapsack capacity of WW
  • Solution: a subset SS of items such that iSvi\sum_{i \in S} v_i is maximized subject to the constraint iSwiW\sum_{i \in S} w_i \leq W

Dynamic Programming for Knapsack

  • Can generalize the DP program for Subset Sum
  • Develop a different DP program that will be useful later
  • OPT(i,v)OPT(i,v) is the smallest knapsack weight so that there is a solution using only the first ii items with total value v\geq v
  • What are the ranges of i,vi,v?
    • ii ranges between 0,n0,n, the number of items
    • Given $i$, $v$ ranges between $0$ and $\sum_{1 \leq j \leq i} v_j$
    • The largest value of $v$ is $\sum_{1 \leq j \leq n} v_j \leq nv^*$, where $v^* = \max_i v_i$
  • The solution we want is the largest value vv such that OPT(n,v)WOPT(n,v) \leq W
  • OPT(i,0)=0OPT(i,0) =0 for every i1i \geq 1
  • $OPT(i,v) = \min (OPT(i-1, v), w_i + OPT(i-1, v-v_i))$, otherwise
  • Can find items in the solution by tracing back
  • Running time is O(n2v)O(n^2v^*) which is pseudo-polynomial in the input size
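
A sketch of this value-indexed DP (note it minimises weight, since $OPT(i,v)$ is the smallest knapsack weight achieving value $\geq v$; names are mine):

```python
import math

def knapsack_min_weight(weights, values, W):
    """OPT(i, v): smallest total weight of a subset of the first i items
    with total value >= v. Answer: largest v with OPT(n, v) <= W."""
    n = len(values)
    V = sum(values)                        # upper bound on achievable value
    INF = math.inf
    OPT = [[INF] * (V + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        OPT[i][0] = 0                      # value 0 needs no weight
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], values[i - 1]
        for v in range(1, V + 1):
            skip = OPT[i - 1][v]
            take = wi + OPT[i - 1][max(0, v - vi)]
            OPT[i][v] = min(skip, take)
    return max(v for v in range(V + 1) if OPT[n][v] <= W)
```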

Intuition underlying Approximation Algorithm

  • What is the running time if all values are the same?
  • What if they are all small integers?
  • Idea:
    • round and scale all the values to lie in a smaller range
    • run the DP algorithm with the modified new values
    • Return items in this optimal solution
    • prove that the value of this solution is not much smaller than the true optimum

Approximation Scheme for Knapsack

  • 0<ϵ<10 < \epsilon < 1 is a precision parameter: assume that 1/ϵ1/ \epsilon is an integer
  • Scaling factor θ=ϵv2n\theta = \frac{\epsilon v^*}{2n}
  • for every item ii, and set:
    • vi~=viθθ\widetilde{v_i} = \lceil \frac{v_i}{\theta} \rceil \theta
    • vi^=viθ\widehat{v_i} = \lceil \frac{v_i}{\theta} \rceil
  • algorithm:
    • solve the knapsack problem using the DP program with the values $\widehat{v_i}$
    • Return the set SS of items found
    • Running time of Knapsack-Approx? O(n2maxivi^)=O(n2v/θ)=O(n3/ϵ)O(n^2 \max_i \widehat{v_i}) = O(n^2v^*/\theta) = O(n^3/\epsilon)
      • Have to include ϵ\epsilon here because it determines the size of the input
    • We need to show that the value of the solution returned by Knapsack-Approx is good
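
Putting the pieces together, a sketch of Knapsack-Approx (names are mine; the exact value-indexed DP over the scaled values $\widehat{v_i}$ is inlined):

```python
import math

def knapsack_approx(weights, values, W, eps):
    """Sketch of Knapsack-Approx: round each value up via
    v_hat_i = ceil(v_i / theta) with theta = eps * v_max / (2n),
    solve the value-indexed DP exactly on the v_hat, return item indices."""
    n = len(values)
    theta = eps * max(values) / (2 * n)        # scaling factor
    vhat = [math.ceil(v / theta) for v in values]
    V = sum(vhat)
    INF = math.inf
    # OPT[i][v]: min weight of a subset of the first i items with
    # scaled value >= v.
    OPT = [[INF] * (V + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        OPT[i][0] = 0
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], vhat[i - 1]
        for v in range(1, V + 1):
            OPT[i][v] = min(OPT[i - 1][v],
                            wi + OPT[i - 1][max(0, v - vi)])
    best = max(v for v in range(V + 1) if OPT[n][v] <= W)
    # Trace back to recover the item set S.
    S, v = [], best
    for i in range(n, 0, -1):
        if OPT[i][v] != OPT[i - 1][v]:         # item i was taken
            S.append(i - 1)
            v = max(0, v - vhat[i - 1])
    return S
```

The DP table has $O(n \cdot n \max_i \widehat{v_i})$ entries, giving the $O(n^3/\epsilon)$ running time claimed above.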

Approximation Guarantee for Knapsack-Approx

  • Let SS be the solution computed by Knapsack-Approx
  • Let SS^* be any other solution satisfying jSwjW\sum_{j \in S^*} w_j \leq W
  • Claim: $(1+\epsilon)\sum_{i \in S} v_i \geq \sum_{j \in S^*} v_j$
  • Since Knapsack-Approx is optimal for the values vi~\widetilde{v_i}:

iSvi~jSvj~\sum_{i \in S} \widetilde{v_i} \geq \sum_{j \in S^*} \widetilde{v_j}

  • Since for each i,vivi~vi+θi, v_i \leq \widetilde{v_i} \leq v_i + \theta ,
jSvjjSvj~iSvi~iSvi+nθ=iSvi+ϵv2\sum_{j \in S^*} v_j \leq \sum_{j \in S^*} \widetilde{v_j} \leq \sum_{i \in S} \widetilde{v_i } \leq \sum_{i \in S} v_i + n \theta = \sum_{i \in S} v_i + \frac{\epsilon v^*}{2}
  • Apply the argument to the $S^*$ containing only the item with the largest value $v^*$:

\begin{aligned} v^* &\leq \sum_{i \in S} v_i + \frac{\epsilon v^*}{2} \\ \sum_{i \in S} v_i &\geq \Big(1 - \frac{\epsilon}{2}\Big)v^* \geq \frac{v^*}{2}, \text{ hence } \frac{\epsilon v^*}{2} \leq \epsilon \sum_{i \in S} v_i \end{aligned}

  • Therefore

jSvjiSvi+ϵv2(1+ϵ)iSvi\sum_{j \in S^*} v_j \leq \sum_{i \in S} v_i + \frac{\epsilon v^*}{2} \leq (1 + \epsilon) \sum_{i \in S} v_i

  • The running time can be improved to O(nlog21ϵ+1ϵ4)O(n \log_2 \frac{1}{\epsilon} + \frac{1}{\epsilon^4})

15 - Solutions

Homework 1

Homework 2

Homework 3

Homework 4

Midterm

Homework 5

Homework 6

Homework 7

Final