Graph Representation

Want the following out of a graph data structure: - Count vertices: O(1) - Iterate through all vertices: O(1) per vertex - Associate information with vertices: O(1) per get/set - Iterate through all edges: O(1) per edge - Get degree of vertex: O(1) - Count all edges: O(1) - Iterate through edges in/out of a vertex: O(1) per edge - Associate information with edges: O(1) per get/set - Test whether two vertices are connected by an edge: O(1)

Decorator Pattern and Adjacency Lists

Possible solutions:

  1. Dictionary, with vertices as keys
  2. Number vertices and index an array using those numbers
  3. Separate variables for vertex objects for each piece of information
  4. One dictionary variable per vertex with keys for each piece of information
  • An adjacency list uses a dictionary of vertices as keys and the list of outgoing neighbors as values
    • There are lots of different variations on this structure; different objects for vertices and edges
  • The Python-style dictionary representation satisfies the time constraints needed for an efficient PageRank algorithm
    • Some operations are not supported; e.g. looping through incoming edges can be slow

BFS/DFS

  • Implicit graph: A graph whose edges and vertices are not already given and must be found through an algorithm (like BFS)
  • A path is an alternating sequence of endpoints and edges that does not allow repetition
  • A walk is the same as a path but allows repetition
  • Reachability: Does there exist a path between two nodes? If so, then they are reachable from each other

Minimum Spanning Trees

  • Given an undirected graph, find a tree that connects all vertices and minimizes the total weight
  • Disconnected graphs will output a spanning forest as opposed to a tree
  • Minimize and maximize are the same problem; simply make weights negative

Applications

  • Build an optimal power grid/transportation routes with minimal total construction cost
  • Build a computer network with maximum badwidth - maximum spanning tree
  • The minimum spanning tree has the smallest possible bottle neck out of all trees; maximum has the largest

Properties

  • Assumes that there are no “tied” edges, and if they do have the same edge, then use a consistent tiebreaking method
  • Cycle property: If C is a cycle in the original graph G, then the heaviest edge in this cycle cannot be in the MST
    • In other words: If C is any cycle and e is its heaviest edge, then an MST that contains e will not be the MST
  • Cut property: If we cut the vertices of the graph into any two subsets of vertices X and G - X, and e is the lihtest edge with endpoints in both subsets, then e must be in the MST
    • Can be used to derive algorithms
  • Path property: For any two vertices x and y, the MST path from x to y has the minimum possible weight for its heaviest edge

Algorithms

  • Most algorithms to find an MST takes O(n log n) time and utilize the cut property
  • Algorithms have been found that take close to linear time or expected linear time but randomized

Jarnik’s Algorithm (Prim-Dijkstra)

  • Start with some vertex S; this will be the start of the MST, named T
  • Repeat until T contains all vertices:
    1. Partition the graph into T and G - T
    2. Find the minimum-weight edge e connecting T and G - T
    3. Add e to T (as well as its corresponding vertex)
  • Can be implemented via a priority queue
    • Store vertices not yet in T by the weight of the minimum-weight edge that would connect it to T
    • Update the priority queue each time you update T by looping through the edges
  • Using a binary heap, all find/remove operations take O(log n) time, so the total time is O(m log n) because each edge is looped over twice
  • A Fibonacci heap can get this down to O(m + n log n)

Kruskal’s Algorithm

  • Start by creating a forest of one-vertex trees
  • Sort the edges by weight
  • For each edge, if it connects two different trees, conenct the trees and add the edge to the MST
  • O(m log m) for sorting, O(m) to iterate through list; O(m log m) total
  • Can use the union-find data structure for near-constant lookups of forests

Boruvka’s Algorithm (Sollin)

  • Start by creating a forest of one-vertex trees
  • Label vertices by their tree
  • While there is more than one tree, find the minimum-weight outgoing edge from each forest
    • Add all of these minimum-weight edges to the T and connect the forests using them
  • The number of trees will go down by a factor of 2 or more, so we will do this iteration log n times
  • Looping through all m edges per iteration means the runtime is O(m log n)

Hybrid Algorithm

  • Run Boruvka’s algorithm until the number of trees is less than or equal to n / log n
  • Run Jarnik’s algorithm with Fiboniacci heaps, using a priority queue on the trees as opposed to the vertices
  • O(m log log n) from Boruvka and O(m) from Jarnik means that the total runtime is O(m log log n)

Topological Ordering

  • Topological ordering: A list of all vertices in G such that, for every edge $x \rightarrow y$, x is before y in the list
  • Topological sorting: An algorithm to compute a topological ordering
  • Graphs that have cycles cannot have a topological ordering
    • Thus, only directed acyclic graphs can have a topological ordering
  • Can use a simple greedy algorithm; add vertices with no incoming edges to the ordering, and remove its outgoing edges from the graph
    • If there is ever a point where there are no “ready” vertices but there are still vertices left to process, then there is a cycle
    • To find the cycle: go to a remaining vertex and walk forward until a vertex is repeated; the sequence between the repititions is the cycle

Shortest and Longest Paths

  • Shortest path can found by BFS; longest path is NP-hard because it includes finding a Hamiltonian path
    • When there exists negative paths, then shortest path is NP-hard
  • Shortest walk is not possible if the graph has a negative cycle; can keep walking around the cycle and reducing total weight
  • For finding shortest paths in directed graphs:
    • If graph is DAG, use topological ordering (O(m))
    • If graph has nonnegative edge lengths, use Dijkstra’s (close to linear time)
      • Works on undirected graphs
    • If graph has negative edges but no negative cycle, use Bellman-Ford algorithm (O(mn))
      • Can also detect negative cycles if one exists
    • Bellman-Ford is optimal and can be improved based on recent results
    • Dijkstra’s is “universally optimal” for a fixed graph with variable, positive weights

Critical Path Scheduling (PERT method)

  • Input: a project consisting of multiple tasks that take different amounts of time
  • Some tasks can be done simultaneously; others rely on earlier tasks to be completed
  • Goal: Find a schedule for the tasks that finishes the project in the lowest possible time
    • AKA longest path finder
  • Can use an activity-on-edge graph; tasks are converted to edges, and vertices are project milestones
    • Constraints are represented by directed edges and have zero weight
    • The critical path is the longest path in this graph from start to end and is the minimum possible time for the schedule
    • Assign times to vertices by making sure that the difference between stages is greater than or equal to the length of the edge between them
  • To calculate the optimal schedule, we can use a topological ordering
    • Set the path length of the starting vertex to 0
    • Find the largest incoming edge $u\rightarrow v$ and set the path length of v to $L(u) + weight(u \rightarrow v)$
  • Can find the smallest incoming edge to get the shortest paths

Shortest Path Trees and Relaxation Algorithms

  • In graphs without negative cycles, we can generate a tree of shortest paths
  • If s is the starting vertex, then other vertices will have parents that are the vertices that come before them in their shortest path
    • e.g. if the shortest path from x to c is $x \rightarrow b \rightarrow c$, then b is a parent of c in the tree
  • Input: A graph with edge lengths and a starting vertex
  • Output: Tree of shortest paths from s to all other reachable vertices
    • Two decorations per vertex: the parent and distance from start vertex
  • General idea: use the two decorations and update them when shorter paths are found
  • Invariants
    • D is the length of some path to x that is greater than or equal to the correct value
    • P is the second-to-last vertex on a path of length ≤D
  • Initialize by setting P(x) = None and D(x) = 0 if the vertex is s; otherwise D(x) = infinity
  • For each edge uv, check whether the distance to u + the length of uv is less than the current shortest path for v, then update if so
    • Known as relaxing an edge
def relax(u, v):
    if D[u] + length(edge uv) < D[v]:
        D[v] = D[u] + length(edge uv)
        P[v] = u
  • If the graph is a DAG:
    • Simple algorithm is to go through vertices in topological order and relaxing each vertices’ edges
    • Total time is O(m)
initialize D, P
for v in topological order:
    for incoming edges uv:
        relax(u, v)

Bellman-Ford Algorithm

  • Uses relaxation paradigm and takes O(mn) time
initialize D, P
repeat n-1 times:
    for each edge uv in the whole graph:
        relax(u, v)
  • Can be improved based on ordering of nodes or stopping early; still results in same big O

Dijkstra’s Algorithm

  • Topologically sorts the shortest path tree by sorting vertices by distance
    • Uses a priority queue to order on the fly
initialize D, P
make priority queue Q of vertices, prioritized by D[v]
while Q is non-empty:
    find and remove minimum-priority vertex v in Q
    for each edge vw:
        relax(vw)
  • Takes O(m log n) with a binary heap or O(m + n log n) with a Fibonacci heap

Changing Weights without Changing Optimum Paths

  • We can add another decoration to the edges, a height h(x), to reweight the edges
  • Each edge has two properties: length and height
    • New length: $l_h(u\rightarrow v) = l(u\rightarrow v) - h(u) + h(v)$
    • Downhill edges have decreased length, uphill edges have increased length
  • Heights can eliminate negative edges and allow Dijkstra’s to work
    • Can now use Johnson’s and Suurballe’s algorithm
    • Won’t get rid of negative cycles
  • Can also change the order of computation
    • A* algorithm uses this fact

Johnson’s Algorithm

  • Input: A graph G with negative edges but not negative cycles
  • Output: Shortest paths between all pairs of vertices
  1. Add a start vertex s that is connected to all other vertices by zero-weight edges
  2. Use Bellman-Ford only once to compute the shortest paths from s to all other vertices
  3. Reweight using the absolute value of distance from s as the height (guarantees that all edges are non-negative)
  4. Use Dijkstra from each starting vertex in the reweighted graph
  • BF takes O(mn), reweighting takes linear time, Dijkstra is run n times; $O(mn + n^2 \text{log } n)$ time

Suurballe’s Algorithm

  • Input: A weighted directed graph and two endpoint vertices, s and t
  • Output: Find two paths from s to t such that the paths don’t share edges and minimize the sum of the length of both
  • Can find two shortest paths by finding one shortest path, then adding “reverse” edges with negative weights from the first shortest path
    • If both shortest paths take the same edge, then the edge will be cancelled out
  1. Use Dijkstra to find single-source paths and distances from s
  2. Reweight the edges if they belong to a shortest path to equal 0
  3. Find shortest path P from s to t and reverse its edges
  4. Find a new shortest path in the modified graph
  5. Remove any edges visited in both paths and reconnect the paths
  • Two runs of Dijkstra, linear time to reweight and reverse; total runtime of O(m + log n)

A*

  • Want to traverse paths to destination with Dijkstra without going down bad paths
  • Idea: choose a height function for each vertex v that is easily to calculate and accurately estimates the distance from v to t
    • Must be an underestimate so paths to t eventually go downhill
    • Can’t decrease too quickly; height between two vertices differs by at most the length of the edge between them
    • AKA needs to be admissible
  • Reweight the graph by height function
    • Edges will be non-negative because of our height function
    • Edges with smaller estimates appear shorter
  • Run Dijkstra on this reweighted graph
    • Same runtime as Dijkstra, but with linear time added to reweight

Widest Paths

  • Input: A graph with weighted edges
  • Output: Path from start to end with the maximum minimum weight (i.e. the path with the highest throughput, as given by the minimum edge weight on the path)

Linear Algorithm

  • For undirected graphs: if there is a single source and a single destination, the path can be found in linear time
    1. Find median edge weight
    2. If a path can be made using heavier-than-median edges, get rid of lighter edges and recurse
    3. If a path cannot be made in this fashion, collapse connected components of wide edges into “supervertices” and recurse; combine edges
  • Linear time because $\Sigma \frac{m}{2^{i+1}} = O(m)$
  • For directed graphs: we can use an algorithm that divides edges into subsets and runs linear time Dijkstra on these subsets in order to get a runtime of O(m log*n)

Dijkstra

  • We can modify the relax algorithm to prioritize widths of paths
def relax(u, v):
    if v in Q and min(D[u], length(uv)) > D[v]:
        D[v] = min(D[u], length(uv))
        P[v] = u
  • Same runtime as regular Dijkstra’s: O(m + n log n)
    • If edges are already sorted, then this algorithm runs in linear time using an integer priority queue

All-Pairs Directed Widest Paths

  1. Sort all the edges before finding paths (O(m log m))
  2. Run linear Dijkstra on each vertex (O(mn))
  3. Total runtime: O(mn + m log m)
  • Can also use fast matrix multiplication if the number of edges is large in $O(n^{2.688})$ time

Applications: Voting Systems

  • Instant runoff voting: Eliminate the candidate with the fewest amount of voters who have them as their favorite until a candidate has a majority
  • Borda count: With c candidates, give candidates points based on their positioning in a voter’s preference ordering (first = c-1, second = c-2, last = 0); candidate with the most points wins
  • Desirable properties
    • Majority: Candidate with a majority of first-place votes wins
    • Weak majority: Candidate with all first-place votes wins
    • Symmetry: All voters are treated equally
    • Weak Symmetry: There is no super-voter whose preference decides the election
    • No spoilers: If candidate X does not win the election, then removing X does not change the outcome (think third-parties)
  • Impossible to satisfy weak symmetry, weak majority, and no spoilers

Condorcet Property / Schulze Method

  • Condorcet Property: If there is a candidate X who wins every head-to-head comparison, then X should win the election
    • If a cycle exists, then there is no apparent winner (i.e. A beats B, B beats C, C beats A)
  • Generate a graph where an edge from candidate U to V is weighted with the number of voters who prefer U to V
  • Relabel each edge X to Y with the width of the widest path from X to Y in the original graph
  • There is guaranteed to be a winner in this graph
  • With randomized quickselect, this method takes $O(candidates^2 \cdot \log candidates)$

Image

Euler Tours

  • Defined as a walk through a graph such that each edge is visited once
  • Note that, for any finite graph, there should be an even number of odd-degree vertices, as the sum of degrees should be even
    • A graph with no odd vertices has an Euler tour that starts and ends at the same vertex
    • A graph with two odd vertices has an Euler tour that starts at one of the odd vertices and ends at the other

Fleury’s Algortihm

  • Greedy algorithm: Keep adding edges without adding an edge that connects two uncovered parts of the graph
  • Basic implementation is $O(m^2)$; check all edges when adding each edge
  • Advanced implementations have better times, but still not linear

Hierholzer’s Algorithm

  • Keep adding edges without checking whether or not we add an edge that connects to two uncovered parts of the graph
  • After getting stuck, return to a vertex with uncovered edges and start a new tour
    • Splice the original tour with the new tour to create an Euler tour
  • Linear time complexity; each edge is visited once in either the original or a offshoot tour

Traveling Salesman Problem

  • Input: A connected undirected graph with nonzero positive edge lengths
  • Goal: Find a walk that visits each vertex at least once and returns to the start index, minimizing the sum of edge lengths
  • Can also be written as a distance problem
    • Input: A symmetric positive matrix of distances between points that satisfies the triangle inequality
    • Goal: Find a cylcic order of points that uses each point exactly once, minimizing the sum of distances between points
    • These two representations can be converted to each other
  • TSP is NP-hard, and even with unit edge lengths, it is NP-complete
    • Approximation ratio of 2 can be found with MSTs; theoretical best AR is $\frac{123}{122}$, but the best AR found is $\frac{3}{2} - \frac{1}{10^{36}}$
    • These approximations run in polynomial time

2-Approximation

  • Use the graph version so repeated vertices are allowed
  • Find a MST and make a copy of each edge such that all degrees are now even
  • Take an Euler tour of this graph and return it as the approximate TSP tour

Image

3/2-Approximation

  • Construct an MST
  • Identify the odd vertices in the MST
    • The number of odd vertices will be even by the handshaking lemma
  • Connect them in pairs using the shortest paths from the original graph, choosing the pairings that have the minimal sum of path lengths
  • Combine the pair paths and the MST edges into one graph; return the Euler tour as the TSP tour

Exact DP Algorithm

  • Basic algorithm takes O(n!) time (try all permutations)
  • Dynamic programming algorithm takes $O(2^nn^2)$ time
    • Preprocess the graph to get the all pairs shortest paths distance matrix
    • Choose some starting vertex s
    • Compute, for every subset of A of remaining vertices, and every vertex in A, the best path from s to v covering all vertices in A
    • Choose v such that the path through all vertices to v + one more path back from v towards s has the minimum total length

Algorithm to compute the third step:

L[A, v] is a table containing the length of the shortest tour starting at s, ending at v, covering all vertices in A
V(G) is the set of all vertices

for subset A in 1, 2, ..., 2^(n-1) - 1:
    for v in A:
        if A - v is empty:
            L[A, v] = D[n - 1, v]
        else:
            L[A, v] = min{L[A - v, u] + D[u, v] for u in A - v}

return min{L[V(G) - s, v] + D[v, n - 1] for v in V(G) - s}

Cliques and Coloring

Centrality

  • Centrality: A measure of how central (AKA how well-connected) a node is
    • There exist various measures of centrality, such as degree and pagerank
  • Closeness centrality: $\frac{1}{\text{average distance to other vertices}}$
    • Closeness centrality can be calculated by computing the all-pairs shortest paths and averaging the distances; O(mn)
    • Approximate algorithm exists where k vertices are chosen and we calculate the distances to these k vertices in O(km)
      • Graphs with the small world property (all distances ≤ D) leads to lower variance; setting k to $D \log n / \varepsilon^2$ gives a $(1+\varepsilon)$-approximation in time $O(\frac{m}{\varepsilon^2} \log n)$
  • Betweenness centrality attempts to find a vertex that is on many shortest paths
    • Formula: $\sum_{u, w\neq v} \frac{\text{No. of shortest paths from u to w through v}}{\text{No. of shortest paths from u to w}}$
    • Optimal algorithm takes O(mn) time

Cores

  • A k-core of a graph is the largest subgraph in which all vertices have a degree ≥ k
    • Can be found by deleting vertices of degree < k until there are none left

Algorithm for computing all cores simultaneously:

def cores(G):
    core = dict()
    current_core = 0
    while G is non-empty:
        v = a minimum-degree vertex
        current_core = max(current_core, degree(v))
        core[v] = current_core
        delete V from G
    return core
  • Can be made efficient by decorating each vertex with the number of uncovered neighbors (intially its degree) and updating + maintaining a dictionary with degrees as keys and lists of unprocessed nodes as values
  • Takes O(sum of all degrees) = O(m) time

Degeneracy

  • Many definitions; one is the largest k for which a nonempty k-core exists

Clique

  • A clique is a subset of a graph for which all pairs are connected by an edge
    • Maximum clique is the biggest clique in the graph; a maximal clique is a clique to which no more vertices can be added while maintaining the clique property
    • Independent set problem and clique problem are equivalent; add edges between vertices that are unconnected, and remove existing edges
      • NP-hard to find the maximum clique
  • There exist various algorithms to find all maximal cliques
  • The Moon-Moser Theorem states that the maximum number of maximal cliques is $\leq 3^{n/3}$

Bron-Kerbosch Algorithm

  • Uses recursive backtracking with three sets
    • R is the set of vertices containing the current clique being built
    • P is the set of potential new vertices (that are adjacent to ones in R)
    • X is the set of vertices that have already been added to a clique and should not be explored
  • Optimizations can be made
    • Choose a pivot $p \in P \cup X$ with the smallest number of non-neighbors in P and only loop over these neighbors
      • p’s neighbors will also be in a found clique, so the only vertices that can be added to R are its non-neighbors
    • Use the reverse degeneracy ordering so we don’t have to loop over so many vertices
  • Overall run time is $O(dn3^{d/3})$

Graph Coloring

  • Input: A graph
  • Output: A coloring of vertices in the graph such that no neighboring vertices have the same color
  • Four color theorem: If a planar graph can be drawn without crossing edges, then it can be colored with at most four colors
  • Two color graphs are bipartite and are easy to test for, but testing for a 3-coloring is NP-complete

Greedy Algorithm

  1. Order the vertices
  2. Number the available colors (1, 2, 3, …)
  3. For each vertex in the ordering, give it the lowest-numbered color that is not used by one (or more) of its neighbor
    • O(degree) per node, O(m) total runtime
    • If vertices are ordered in degeneracy ordering, then the greedy algorithm will use at most d + 1 colors
def first_unused(v):
    used = { color(w) for w in neighbors of v }
    for c in 1, 2, ... 
        if c not in used: return c
  • This algorithm is optimal for interval graphs (using a left-to-right ordering)

Register Allocation and Strahler Numbers

  • Problem: Calculate an expression using the minimum number of registers at any given time
  • If two expressions A and B require a and b registers respectively, then we will need to store the answer to one of them while doing the other
    • If A is calculated first, then max(a, b+1) registers are use; otherwise, max(b, a+1) registers are used
    • Calculate intensive expressions first to save the +1 constant
  • Bottom-up calculation of optimal ordering
    • If a node x is a leaf, then the number of calculations R(x) is 1
    • If x has two children y and z, then evaulate the one with the larger subtree first, so R(x) = max(R(y), R(z))
    • If R(y) = R(z), then R(x) = R(y) + 1 = R(z) + 1
  • This bottom-up calculation also produces the Strahler numbers of nodes in a tree
  • This problem can be written as a graph coloring problem
    • Vertices are the local variables of the intermediate code
    • Edges are local variables that need to be stored at the same time
    • Colors are available registers

Chordal Graphs

  • A graph is a chordal graph if and only if one of the following conditions are satisfied:
    1. Every cycle longer than three vertices has a chord, or another edge connecting two vertices of the cycle
    2. It has an elimination ordering: An ordering of the vertices where the later neighbors of every vertex form a clique
    3. For every two vertices x and y and every minimal subset S whose deletion would separate x from y, S is a clique
  • Definition 1 is analagous to a graph with no holes, where a hole is a cycle of at least length 4 that does not have a chord

Elimination Ordering Algorithm (Lexicographic BFS)

  • If G has an elimination ordering, so does G - v
  • Slow, basic algorithm: find vertices whose neighbors form a clique and delete it
  • We can check validity of an elimination order in linear time; for each vertex v, check that all later neighbors of v are neighbors of w (the neighbor of v that is next in the ordering)
    • This can be done using BFS
    • Lexicographic ordering: Given an ordering of vertices, define $pred_i(v)$ to be the ith neighbor of v or v itself if there are fewer than i neighbors earlier than v; order these vertices based on the tuple ${pred_1(v), pred_2(v), …, pred_n(v)}$
  • Algorithm for finding lexicographic order
    • Maintain a sequence of sets of unprocessed vertices, initially in one big set
    • While this sequence is nonempty:
      1. Let $S_i$ be the first set in the sequence and v be any vertex in this set
      2. Remove v from $S_i$ (and remove $S_i$ if it becomes empty)
      3. Output v; let N(v) be the set of neighbors of v
      4. For each set $S_j$ in the sequence, split $S_j$ into two sets $S_j \cap N(v)$ and $S_j \backslash N(v)$
  • We can use dynamic sets, linked lists, and dictionaries to get $O(\vert N(v)\vert)$ time

Perfect Graphs

  • A graph is perfect if the optimal number of colors equals the number of vertices in the largest clique
    • Deleting any vertex in this graph will produce another perfect graph
      • This property is true for interval, chordal, and bipartite graphs
  • Strong Perfect Graph Theorem: A graph G is perfect if and only if it has no odd-length holes AND its complement has no odd-length holes
    • Implies the “weak perfect graph theorem” that complement of perfect graphs are perfect
  • Perfect graphs can be found in polynomial time ($O(n^9)$)
  • An optimal method for coloring and finding cliques in perfect graphs takes polynomial time but is complex
  • Interval graphs $\in$ chordal graphs $\in$ perfect graphs

Interval Graphs and Path Decomposition

  • An interval graph is defined as a graph with nodes representing intervals and edges representing overlaps of two intervals
  • We can rewrite an interval graph using bags where each interval segment is labeled by the intervals that overlap during it
    • Get rid of bags that are not subsets of larger bags
    • The intervals can be recovered from the path decomposition
    • This is known as the interval graph path decomposition

Image

  • Properties of the interval graph path decomposition
    • Each vertex belongs to a consecutive sequence of bags
    • Edges are pairs of vertices that both appear in the same bag
    • The largest clique is the biggest bag
    • Each edge has both its endpoints in some bag
  • Path decomposition: get rid of the intervals and use a sequence of bags with the following properties
    • Each vertex appears in a consecutive subsequence of bags
    • Every edge has ≥ 1 bag containing both endpoints
  • One graph can have many different path decompositions
    • The width of a decomposition is the size of the biggest bag minus 1
    • The pathwidth of a graph is the smallest width of any decomposition
    • The pathwidth of a graph can also be defined as the smallest possible max clique size (minus one) of an interval graph containing G
  • Vertices of a clique have overlapping subsequences of bags
    • Some bag contains the whole clique

Algorithms

Fast clique finding with a low-width path decomposition:

for each bag in the path decomposition:
    for each subset of the bag:
        if it's a clique bigger than the best found so far
            remember it as max clique
  • Outer loop is O(n), inner loop is $O(2^w)$, so $O(n2^w)$ time

Fast coloring:

  • We can use a dynamic programming algorithm to color a graph fast
for each bag B in left-to-right order, and each coloring C with ≤ k colors:
    let B' be the bag just before B
    Good(B, C) = true if C is compatible with a good coloring of B'
  • G has a k-coloring if and only if its last bad has a good coloring
  • Time takes $O(nk^{w+1})$

Trees and Pathwidths

  • Trees are easy to find cliques and easy to color but can have large pathwidth - must be a way to decompose them
  • Tree decomposition involves making a tree of bags instead of a sequence of bags
    • Every vertex must belong to a connected subtree of bags
    • Every edge must have both endpoints in at least one bag
  • Treewidth: the smallest width of any tree decomposition, or the smallest possible max clique size (minus one) of a chordal graph that contains G
    • The tree decomposition can be used in the algorithms that use path decompositions but will work better because more graphs have smaller treewidth

FPT

  • Fixed parameter tractable: Time is polynomial n with a fixed exponent, but may depend badly on width
  • Courcelle’s theorem: There exists an FPT algorithm for any existence or min-weight/max-weight optimization problem on classes of graphs that can be described as a logical formula involving vertices, edges, sets of vertices and edges, and adjacency

Flow

  • Flow can be used in many applications; playoff elimination, traffic engineering, etc.
  • Maximum flow is the question of determining the maximum flow out of a starting vertex s into an ending vertex t
    • The flow into and out of a vertex should be equal (aside from the start and end vertices)
  • Properties of maximum flow
    • Can be converted to a linear program and solved; polynomial time
    • A maximum flow will always exist, even with irrational numbers
    • Integrality: If all the capacities are integers, then there exists a maximum flow in which all flow amounts are integers
  • The max flow problem is equivalent to the minimum cut problem
    • Min-cut: find the minimum sum of capacities flowing from one subset of the graph to another, where subsets are found by “cutting” the graph in the plane
    • For every flow network, the maximum flow is equal to the minimum cut
    • For every flow, either there is an equal cut or we can increase the flow; max flow ≥ min cut

Image

Greedy Algorithm

  • Start with a flow that is not maximum (all-zero flow)
  • If we ever get stuck, we can find a cut with a capacity equal to the flow amount
  • Since max flow ≤ min cut ≤ this new cut, and our flow equals our cut, then we have found the maximum flow

Augmenting Path

  • Residual graph: A graph with backwards edges containing the capacities of the edge in the current graph minus the flow of the incoming vertex
  • While the residual graph has a path $P$ from $s$ to $t$:
    • Find the width $w$ of $P$
    • Increase flow along $P$ by $w$ units
    • Recompute the residual graph
  • When the algorithm terminates, it finds a minimum cut (and a maximum flow)
    • Stops when there are no residual paths from s to t
    • Form a cut of the graph into two subsets $S$ and $T$
      • $S$ consists of everything that can be reached from $s$ by a residual path; $T$ is everything else

Matching

  • Bipartite graph: A graph where each edge has a red and blue vertex for its endpoints
    • Can test for bipartiteness in linear time
  • Matching: A set of edges that don’t touch each other (no two matched edges share a vertex)
  • Independent set: A set of vertices that don’t touch each other (no two vertices share an edge)
  • Konig’s theorem: Let $M$ be the number of edges in a maximum matching and $I$ be the number of vertices in the maximum independent set; in bipartite graphs, $M+I=n$
  • A bipartite maximum matching can be written as a flow problem
    • Add vertices s connected to all red vertices and t connected to all blue vertices
    • Direct edges $s \rightarrow red \rightarrow blue \rightarrow t$ with capacity 1
    • Find the integer maximum flow
    • If the found flow is not maximum, then the augmented path will be the maximum flow; guaranteed to find the maximum matching
    • Time bounds: $O(mn)$ for basic algorithm, $O(m\sqrt{n})$ for Hopcroft-Karp-Karzanov, $O(m^{1+\varepsilon})$ for any $\varepsilon > 0$ (Chen)

Complete Bipartite Matching

  • Complete bipartite graph: All edges that go from one independent set to another are included
    • $K_{a,b}$ refers to a complete bipartite graph with $a$ vertices on one side, $b$ on the other
    • Balanced means that the graph has the same number on each side: $K_{n/2,n/2}$
  • Perfect matching: A matching that matches every vertex
    • In a balanced complete bipartite graph, every matching can be turned into a perfect matching
    • Matchings can be described a permutations
  • Assignment problem: Find a minimum-weight maximum matching in a weighted bipartite graph
    • In many applications, the graph is complete bipartite and maximum matchings are perfect matchings
    • Can turn the question into: Find a minimum-weight perfect matching in a weighted bipartite graph
  • Hungarian algorithm: Solves assignment problem
    • Simple steps: Start from an empty matching, then find a minimum-weight alternating path $n/2$ times
    • The weight of an alternating path is how much it increases the weight of a matching
    • Complex steps: Adjust the weight (original weight - heights of both endpoints) of each edge such that they’re all non-negative, use Dijkstra’s to find the shortest alternating paths
    • Runtime: $O(nm + n^2)$ (run Dijkstra’s n/2 times)

Full algorithm:

  • Initialize heights to make adjusted weights ≥ 0

  • Repeat n/2 times:

    • Add an artificial start vertex s, with edges of weight 0 to all unmatched red vertices, direct all unmatched edges red-to-blue and all matched edges blue-to-red
    • Use Dijkstra’s to find the adjusted distances from s to all other vertices, including the shortest alternating path
    • Adjust heights: subtract distance at red vertices, add distance at blue
    • Use the shortest alternating path to increase the size of matching