Graph Algorithms

Graph Representation
- Decorator Pattern and Adjacency Lists
BFS/DFS
Minimum Spanning Trees
Topological Ordering
Shortest and Longest Paths
Widest Paths
Euler Tours
- Fleury’s Algortihm
- Hierholzer’s Algorithm
Traveling Salesman Problem
Cliques and Coloring
Graph Coloring
- Greedy Algorithm
- Register Allocation and Strahler Numbers
Chordal Graphs
- Elimination Ordering Algorithm (Lexicographic BFS)
Perfect Graphs
Interval Graphs and Path Decomposition
- Algorithms
- Trees and Pathwidths
FPT
Flow
- Greedy Algorithm
  - Augmenting Path
- Matching
  - Complete Bipartite Matching
  - Stable Matching
Planar Graphs
- Planarity Testing
- Planarity Drawnig

Graph Representation

Want the following out of a graph data structure: - Count vertices: O(1) - Iterate through all vertices: O(1) per vertex - Associate information with vertices: O(1) per get/set - Iterate through all edges: O(1) per edge - Get degree of vertex: O(1) - Count all edges: O(1) - Iterate through edges in/out of a vertex: O(1) per edge - Associate information with edges: O(1) per get/set - Test whether two vertices are connected by an edge: O(1)

Decorator Pattern and Adjacency Lists

Possible solutions:

Dictionary, with vertices as keys
Number vertices and index an array using those numbers
Separate variables for vertex objects for each piece of information
One dictionary variable per vertex with keys for each piece of information

An adjacency list uses a dictionary of vertices as keys and the list of outgoing neighbors as values
- There are lots of different variations on this structure; different objects for vertices and edges
The Python-style dictionary representation satisfies the time constraints needed for an efficient PageRank algorithm
- Some operations are not supported; e.g. looping through incoming edges can be slow

BFS/DFS

Implicit graph: A graph whose edges and vertices are not already given and must be found through an algorithm (like BFS)
A path is an alternating sequence of endpoints and edges that does not allow repetition
A walk is the same as a path but allows repetition
Reachability: Does there exist a path between two nodes? If so, then they are reachable from each other

Minimum Spanning Trees

Given an undirected graph, find a tree that connects all vertices and minimizes the total weight
Disconnected graphs will output a spanning forest as opposed to a tree
Minimize and maximize are the same problem; simply make weights negative

Applications

Build an optimal power grid/transportation routes with minimal total construction cost
Build a computer network with maximum badwidth - maximum spanning tree
The minimum spanning tree has the smallest possible bottle neck out of all trees; maximum has the largest

Properties

Assumes that there are no “tied” edges, and if they do have the same edge, then use a consistent tiebreaking method
Cycle property: If C is a cycle in the original graph G, then the heaviest edge in this cycle cannot be in the MST
- In other words: If C is any cycle and e is its heaviest edge, then an MST that contains e will not be the MST
Cut property: If we cut the vertices of the graph into any two subsets of vertices X and G - X, and e is the lihtest edge with endpoints in both subsets, then e must be in the MST
- Can be used to derive algorithms
Path property: For any two vertices x and y, the MST path from x to y has the minimum possible weight for its heaviest edge

Algorithms

Most algorithms to find an MST takes O(n log n) time and utilize the cut property
Algorithms have been found that take close to linear time or expected linear time but randomized

Jarnik’s Algorithm (Prim-Dijkstra)

Start with some vertex S; this will be the start of the MST, named T
Repeat until T contains all vertices:
1. Partition the graph into T and G - T
2. Find the minimum-weight edge e connecting T and G - T
3. Add e to T (as well as its corresponding vertex)
Can be implemented via a priority queue
- Store vertices not yet in T by the weight of the minimum-weight edge that would connect it to T
- Update the priority queue each time you update T by looping through the edges
Using a binary heap, all find/remove operations take O(log n) time, so the total time is O(m log n) because each edge is looped over twice
A Fibonacci heap can get this down to O(m + n log n)

Kruskal’s Algorithm

Start by creating a forest of one-vertex trees
Sort the edges by weight
For each edge, if it connects two different trees, conenct the trees and add the edge to the MST
O(m log m) for sorting, O(m) to iterate through list; O(m log m) total
Can use the union-find data structure for near-constant lookups of forests

Boruvka’s Algorithm (Sollin)

Start by creating a forest of one-vertex trees
Label vertices by their tree
While there is more than one tree, find the minimum-weight outgoing edge from each forest
- Add all of these minimum-weight edges to the T and connect the forests using them
The number of trees will go down by a factor of 2 or more, so we will do this iteration log n times
Looping through all m edges per iteration means the runtime is O(m log n)

Hybrid Algorithm

Run Boruvka’s algorithm until the number of trees is less than or equal to n / log n
Run Jarnik’s algorithm with Fiboniacci heaps, using a priority queue on the trees as opposed to the vertices
O(m log log n) from Boruvka and O(m) from Jarnik means that the total runtime is O(m log log n)

Topological Ordering

Topological ordering: A list of all vertices in G such that, for every edge $x \rightarrow y$, x is before y in the list
Topological sorting: An algorithm to compute a topological ordering
Graphs that have cycles cannot have a topological ordering
- Thus, only directed acyclic graphs can have a topological ordering
Can use a simple greedy algorithm; add vertices with no incoming edges to the ordering, and remove its outgoing edges from the graph
- If there is ever a point where there are no “ready” vertices but there are still vertices left to process, then there is a cycle
- To find the cycle: go to a remaining vertex and walk forward until a vertex is repeated; the sequence between the repititions is the cycle

Shortest and Longest Paths

Shortest path can found by BFS; longest path is NP-hard because it includes finding a Hamiltonian path
- When there exists negative paths, then shortest path is NP-hard
Shortest walk is not possible if the graph has a negative cycle; can keep walking around the cycle and reducing total weight
For finding shortest paths in directed graphs:
- If graph is DAG, use topological ordering (O(m))
- If graph has nonnegative edge lengths, use Dijkstra’s (close to linear time)
  - Works on undirected graphs
- If graph has negative edges but no negative cycle, use Bellman-Ford algorithm (O(mn))
  - Can also detect negative cycles if one exists
- Bellman-Ford is optimal and can be improved based on recent results
- Dijkstra’s is “universally optimal” for a fixed graph with variable, positive weights

Critical Path Scheduling (PERT method)

Input: a project consisting of multiple tasks that take different amounts of time
Some tasks can be done simultaneously; others rely on earlier tasks to be completed
Goal: Find a schedule for the tasks that finishes the project in the lowest possible time
- AKA longest path finder
Can use an activity-on-edge graph; tasks are converted to edges, and vertices are project milestones
- Constraints are represented by directed edges and have zero weight
- The critical path is the longest path in this graph from start to end and is the minimum possible time for the schedule
- Assign times to vertices by making sure that the difference between stages is greater than or equal to the length of the edge between them
To calculate the optimal schedule, we can use a topological ordering
- Set the path length of the starting vertex to 0
- Find the largest incoming edge $u\rightarrow v$ and set the path length of v to $L(u) + weight(u \rightarrow v)$
Can find the smallest incoming edge to get the shortest paths

Shortest Path Trees and Relaxation Algorithms

In graphs without negative cycles, we can generate a tree of shortest paths
If s is the starting vertex, then other vertices will have parents that are the vertices that come before them in their shortest path
- e.g. if the shortest path from x to c is $x \rightarrow b \rightarrow c$, then b is a parent of c in the tree
Input: A graph with edge lengths and a starting vertex
Output: Tree of shortest paths from s to all other reachable vertices
- Two decorations per vertex: the parent and distance from start vertex
General idea: use the two decorations and update them when shorter paths are found
Invariants
- D is the length of some path to x that is greater than or equal to the correct value
- P is the second-to-last vertex on a path of length ≤D
Initialize by setting P(x) = None and D(x) = 0 if the vertex is s; otherwise D(x) = infinity
For each edge uv, check whether the distance to u + the length of uv is less than the current shortest path for v, then update if so
- Known as relaxing an edge

def relax(u, v):
    if D[u] + length(edge uv) < D[v]:
        D[v] = D[u] + length(edge uv)
        P[v] = u

If the graph is a DAG:
- Simple algorithm is to go through vertices in topological order and relaxing each vertices’ edges
- Total time is O(m)

initialize D, P
for v in topological order:
    for incoming edges uv:
        relax(u, v)

Bellman-Ford Algorithm

Uses relaxation paradigm and takes O(mn) time

initialize D, P
repeat n-1 times:
    for each edge uv in the whole graph:
        relax(u, v)

Can be improved based on ordering of nodes or stopping early; still results in same big O

Dijkstra’s Algorithm

Topologically sorts the shortest path tree by sorting vertices by distance
- Uses a priority queue to order on the fly

initialize D, P
make priority queue Q of vertices, prioritized by D[v]
while Q is non-empty:
    find and remove minimum-priority vertex v in Q
    for each edge vw:
        relax(vw)

Takes O(m log n) with a binary heap or O(m + n log n) with a Fibonacci heap

Changing Weights without Changing Optimum Paths

We can add another decoration to the edges, a height h(x), to reweight the edges
Each edge has two properties: length and height
- New length: $l_h(u\rightarrow v) = l(u\rightarrow v) - h(u) + h(v)$
- Downhill edges have decreased length, uphill edges have increased length
Heights can eliminate negative edges and allow Dijkstra’s to work
- Can now use Johnson’s and Suurballe’s algorithm
- Won’t get rid of negative cycles
Can also change the order of computation
- A* algorithm uses this fact

Johnson’s Algorithm

Input: A graph G with negative edges but not negative cycles
Output: Shortest paths between all pairs of vertices

Add a start vertex s that is connected to all other vertices by zero-weight edges
Use Bellman-Ford only once to compute the shortest paths from s to all other vertices
Reweight using the absolute value of distance from s as the height (guarantees that all edges are non-negative)
Use Dijkstra from each starting vertex in the reweighted graph

BF takes O(mn), reweighting takes linear time, Dijkstra is run n times; $O(mn + n^2 \text{log } n)$ time

Suurballe’s Algorithm

Input: A weighted directed graph and two endpoint vertices, s and t
Output: Find two paths from s to t such that the paths don’t share edges and minimize the sum of the length of both
Can find two shortest paths by finding one shortest path, then adding “reverse” edges with negative weights from the first shortest path
- If both shortest paths take the same edge, then the edge will be cancelled out

Use Dijkstra to find single-source paths and distances from s
Reweight the edges if they belong to a shortest path to equal 0
Find shortest path P from s to t and reverse its edges
Find a new shortest path in the modified graph
Remove any edges visited in both paths and reconnect the paths

Two runs of Dijkstra, linear time to reweight and reverse; total runtime of O(m + log n)

A*

Want to traverse paths to destination with Dijkstra without going down bad paths
Idea: choose a height function for each vertex v that is easily to calculate and accurately estimates the distance from v to t
- Must be an underestimate so paths to t eventually go downhill
- Can’t decrease too quickly; height between two vertices differs by at most the length of the edge between them
- AKA needs to be admissible
Reweight the graph by height function
- Edges will be non-negative because of our height function
- Edges with smaller estimates appear shorter
Run Dijkstra on this reweighted graph
- Same runtime as Dijkstra, but with linear time added to reweight

Widest Paths

Input: A graph with weighted edges
Output: Path from start to end with the maximum minimum weight (i.e. the path with the highest throughput, as given by the minimum edge weight on the path)

Linear Algorithm

For undirected graphs: if there is a single source and a single destination, the path can be found in linear time
1. Find median edge weight
2. If a path can be made using heavier-than-median edges, get rid of lighter edges and recurse
3. If a path cannot be made in this fashion, collapse connected components of wide edges into “supervertices” and recurse; combine edges
Linear time because $\Sigma \frac{m}{2^{i+1}} = O(m)$
For directed graphs: we can use an algorithm that divides edges into subsets and runs linear time Dijkstra on these subsets in order to get a runtime of O(m log^*n)

Dijkstra

We can modify the relax algorithm to prioritize widths of paths

def relax(u, v):
    if v in Q and min(D[u], length(uv)) > D[v]:
        D[v] = min(D[u], length(uv))
        P[v] = u

Same runtime as regular Dijkstra’s: O(m + n log n)
- If edges are already sorted, then this algorithm runs in linear time using an integer priority queue

All-Pairs Directed Widest Paths

Sort all the edges before finding paths (O(m log m))
Run linear Dijkstra on each vertex (O(mn))
Total runtime: O(mn + m log m)

Can also use fast matrix multiplication if the number of edges is large in $O(n^{2.688})$ time

Applications: Voting Systems

Instant runoff voting: Eliminate the candidate with the fewest amount of voters who have them as their favorite until a candidate has a majority
Borda count: With c candidates, give candidates points based on their positioning in a voter’s preference ordering (first = c-1, second = c-2, last = 0); candidate with the most points wins
Desirable properties
- Majority: Candidate with a majority of first-place votes wins
- Weak majority: Candidate with all first-place votes wins
- Symmetry: All voters are treated equally
- Weak Symmetry: There is no super-voter whose preference decides the election
- No spoilers: If candidate X does not win the election, then removing X does not change the outcome (think third-parties)
Impossible to satisfy weak symmetry, weak majority, and no spoilers

Condorcet Property / Schulze Method

Condorcet Property: If there is a candidate X who wins every head-to-head comparison, then X should win the election
- If a cycle exists, then there is no apparent winner (i.e. A beats B, B beats C, C beats A)
Generate a graph where an edge from candidate U to V is weighted with the number of voters who prefer U to V
Relabel each edge X to Y with the width of the widest path from X to Y in the original graph
There is guaranteed to be a winner in this graph
With randomized quickselect, this method takes $O(candidates^2 \cdot \log candidates)$

Euler Tours

Defined as a walk through a graph such that each edge is visited once
Note that, for any finite graph, there should be an even number of odd-degree vertices, as the sum of degrees should be even
- A graph with no odd vertices has an Euler tour that starts and ends at the same vertex
- A graph with two odd vertices has an Euler tour that starts at one of the odd vertices and ends at the other

Fleury’s Algortihm

Greedy algorithm: Keep adding edges without adding an edge that connects two uncovered parts of the graph
Basic implementation is $O(m^2)$; check all edges when adding each edge
Advanced implementations have better times, but still not linear

Hierholzer’s Algorithm

Keep adding edges without checking whether or not we add an edge that connects to two uncovered parts of the graph
After getting stuck, return to a vertex with uncovered edges and start a new tour
- Splice the original tour with the new tour to create an Euler tour
Linear time complexity; each edge is visited once in either the original or a offshoot tour

Traveling Salesman Problem

Input: A connected undirected graph with nonzero positive edge lengths
Goal: Find a walk that visits each vertex at least once and returns to the start index, minimizing the sum of edge lengths
Can also be written as a distance problem
- Input: A symmetric positive matrix of distances between points that satisfies the triangle inequality
- Goal: Find a cylcic order of points that uses each point exactly once, minimizing the sum of distances between points
- These two representations can be converted to each other
TSP is NP-hard, and even with unit edge lengths, it is NP-complete
- Approximation ratio of 2 can be found with MSTs; theoretical best AR is $\frac{123}{122}$, but the best AR found is $\frac{3}{2} - \frac{1}{10^{36}}$
- These approximations run in polynomial time

2-Approximation

Use the graph version so repeated vertices are allowed
Find a MST and make a copy of each edge such that all degrees are now even
Take an Euler tour of this graph and return it as the approximate TSP tour

3/2-Approximation

Construct an MST
Identify the odd vertices in the MST
- The number of odd vertices will be even by the handshaking lemma
Connect them in pairs using the shortest paths from the original graph, choosing the pairings that have the minimal sum of path lengths
Combine the pair paths and the MST edges into one graph; return the Euler tour as the TSP tour

Exact DP Algorithm

Basic algorithm takes O(n!) time (try all permutations)
Dynamic programming algorithm takes $O(2^nn^2)$ time
- Preprocess the graph to get the all pairs shortest paths distance matrix
- Choose some starting vertex s
- Compute, for every subset of A of remaining vertices, and every vertex in A, the best path from s to v covering all vertices in A
- Choose v such that the path through all vertices to v + one more path back from v towards s has the minimum total length

Algorithm to compute the third step:

L[A, v] is a table containing the length of the shortest tour starting at s, ending at v, covering all vertices in A
V(G) is the set of all vertices

for subset A in 1, 2, ..., 2^(n-1) - 1:
    for v in A:
        if A - v is empty:
            L[A, v] = D[n - 1, v]
        else:
            L[A, v] = min{L[A - v, u] + D[u, v] for u in A - v}

return min{L[V(G) - s, v] + D[v, n - 1] for v in V(G) - s}

Cliques and Coloring

Centrality

Centrality: A measure of how central (AKA how well-connected) a node is
- There exist various measures of centrality, such as degree and pagerank
Closeness centrality: $\frac{1}{\text{average distance to other vertices}}$
- Closeness centrality can be calculated by computing the all-pairs shortest paths and averaging the distances; O(mn)
- Approximate algorithm exists where k vertices are chosen and we calculate the distances to these k vertices in O(km)
  - Graphs with the small world property (all distances ≤ D) leads to lower variance; setting k to $D \log n / \varepsilon^2$ gives a $(1+\varepsilon)$-approximation in time $O(\frac{m}{\varepsilon^2} \log n)$
Betweenness centrality attempts to find a vertex that is on many shortest paths
- Formula: $\sum_{u, w\neq v} \frac{\text{No. of shortest paths from u to w through v}}{\text{No. of shortest paths from u to w}}$
- Optimal algorithm takes O(mn) time

Cores

A k-core of a graph is the largest subgraph in which all vertices have a degree ≥ k
- Can be found by deleting vertices of degree < k until there are none left

Algorithm for computing all cores simultaneously:

def cores(G):
    core = dict()
    current_core = 0
    while G is non-empty:
        v = a minimum-degree vertex
        current_core = max(current_core, degree(v))
        core[v] = current_core
        delete V from G
    return core

Can be made efficient by decorating each vertex with the number of uncovered neighbors (intially its degree) and updating + maintaining a dictionary with degrees as keys and lists of unprocessed nodes as values
Takes O(sum of all degrees) = O(m) time

Degeneracy

Many definitions; one is the largest k for which a nonempty k-core exists

Clique

A clique is a subset of a graph for which all pairs are connected by an edge
- Maximum clique is the biggest clique in the graph; a maximal clique is a clique to which no more vertices can be added while maintaining the clique property
- Independent set problem and clique problem are equivalent; add edges between vertices that are unconnected, and remove existing edges
  - NP-hard to find the maximum clique
There exist various algorithms to find all maximal cliques
The Moon-Moser Theorem states that the maximum number of maximal cliques is $\leq 3^{n/3}$

Bron-Kerbosch Algorithm

Uses recursive backtracking with three sets
- R is the set of vertices containing the current clique being built
- P is the set of potential new vertices (that are adjacent to ones in R)
- X is the set of vertices that have already been added to a clique and should not be explored
Optimizations can be made
- Choose a pivot $p \in P \cup X$ with the smallest number of non-neighbors in P and only loop over these neighbors
  - p’s neighbors will also be in a found clique, so the only vertices that can be added to R are its non-neighbors
- Use the reverse degeneracy ordering so we don’t have to loop over so many vertices
Overall run time is $O(dn3^{d/3})$

Graph Coloring

Input: A graph
Output: A coloring of vertices in the graph such that no neighboring vertices have the same color
Four color theorem: If a planar graph can be drawn without crossing edges, then it can be colored with at most four colors
Two color graphs are bipartite and are easy to test for, but testing for a 3-coloring is NP-complete

Greedy Algorithm

Order the vertices
Number the available colors (1, 2, 3, …)
For each vertex in the ordering, give it the lowest-numbered color that is not used by one (or more) of its neighbor
- O(degree) per node, O(m) total runtime
- If vertices are ordered in degeneracy ordering, then the greedy algorithm will use at most d + 1 colors

def first_unused(v):
    used = { color(w) for w in neighbors of v }
    for c in 1, 2, ... 
        if c not in used: return c

This algorithm is optimal for interval graphs (using a left-to-right ordering)

Register Allocation and Strahler Numbers

Problem: Calculate an expression using the minimum number of registers at any given time
If two expressions A and B require a and b registers respectively, then we will need to store the answer to one of them while doing the other
- If A is calculated first, then max(a, b+1) registers are use; otherwise, max(b, a+1) registers are used
- Calculate intensive expressions first to save the +1 constant
Bottom-up calculation of optimal ordering
- If a node x is a leaf, then the number of calculations R(x) is 1
- If x has two children y and z, then evaulate the one with the larger subtree first, so R(x) = max(R(y), R(z))
- If R(y) = R(z), then R(x) = R(y) + 1 = R(z) + 1
This bottom-up calculation also produces the Strahler numbers of nodes in a tree
This problem can be written as a graph coloring problem
- Vertices are the local variables of the intermediate code
- Edges are local variables that need to be stored at the same time
- Colors are available registers

Chordal Graphs

A graph is a chordal graph if and only if one of the following conditions are satisfied:
1. Every cycle longer than three vertices has a chord, or another edge connecting two vertices of the cycle
2. It has an elimination ordering: An ordering of the vertices where the later neighbors of every vertex form a clique
3. For every two vertices x and y and every minimal subset S whose deletion would separate x from y, S is a clique
Definition 1 is analagous to a graph with no holes, where a hole is a cycle of at least length 4 that does not have a chord

Elimination Ordering Algorithm (Lexicographic BFS)

If G has an elimination ordering, so does G - v
Slow, basic algorithm: find vertices whose neighbors form a clique and delete it
We can check validity of an elimination order in linear time; for each vertex v, check that all later neighbors of v are neighbors of w (the neighbor of v that is next in the ordering)
- This can be done using BFS
- Lexicographic ordering: Given an ordering of vertices, define $pred_i(v)$ to be the ith neighbor of v or v itself if there are fewer than i neighbors earlier than v; order these vertices based on the tuple ${pred_1(v), pred_2(v), …, pred_n(v)}$
Algorithm for finding lexicographic order
- Maintain a sequence of sets of unprocessed vertices, initially in one big set
- While this sequence is nonempty:
  1. Let $S_i$ be the first set in the sequence and v be any vertex in this set
  2. Remove v from $S_i$ (and remove $S_i$ if it becomes empty)
  3. Output v; let N(v) be the set of neighbors of v
  4. For each set $S_j$ in the sequence, split $S_j$ into two sets $S_j \cap N(v)$ and $S_j \backslash N(v)$
We can use dynamic sets, linked lists, and dictionaries to get $O(\vert N(v)\vert)$ time

Perfect Graphs

A graph is perfect if the optimal number of colors equals the number of vertices in the largest clique
- Deleting any vertex in this graph will produce another perfect graph
  - This property is true for interval, chordal, and bipartite graphs
Strong Perfect Graph Theorem: A graph G is perfect if and only if it has no odd-length holes AND its complement has no odd-length holes
- Implies the “weak perfect graph theorem” that complement of perfect graphs are perfect
Perfect graphs can be found in polynomial time ($O(n^9)$)
An optimal method for coloring and finding cliques in perfect graphs takes polynomial time but is complex
Interval graphs $\in$ chordal graphs $\in$ perfect graphs

Interval Graphs and Path Decomposition

An interval graph is defined as a graph with nodes representing intervals and edges representing overlaps of two intervals
We can rewrite an interval graph using bags where each interval segment is labeled by the intervals that overlap during it
- Get rid of bags that are not subsets of larger bags
- The intervals can be recovered from the path decomposition
- This is known as the interval graph path decomposition

Properties of the interval graph path decomposition
- Each vertex belongs to a consecutive sequence of bags
- Edges are pairs of vertices that both appear in the same bag
- The largest clique is the biggest bag
- Each edge has both its endpoints in some bag
Path decomposition: get rid of the intervals and use a sequence of bags with the following properties
- Each vertex appears in a consecutive subsequence of bags
- Every edge has ≥ 1 bag containing both endpoints
One graph can have many different path decompositions
- The width of a decomposition is the size of the biggest bag minus 1
- The pathwidth of a graph is the smallest width of any decomposition
- The pathwidth of a graph can also be defined as the smallest possible max clique size (minus one) of an interval graph containing G
Vertices of a clique have overlapping subsequences of bags
- Some bag contains the whole clique

Algorithms

Fast clique finding with a low-width path decomposition:

for each bag in the path decomposition:
    for each subset of the bag:
        if it's a clique bigger than the best found so far
            remember it as max clique

Outer loop is O(n), inner loop is $O(2^w)$, so $O(n2^w)$ time

Fast coloring:

We can use a dynamic programming algorithm to color a graph fast

for each bag B in left-to-right order, and each coloring C with ≤ k colors:
    let B' be the bag just before B
    Good(B, C) = true if C is compatible with a good coloring of B'

G has a k-coloring if and only if its last bad has a good coloring
Time takes $O(nk^{w+1})$

Trees and Pathwidths

Trees are easy to find cliques and easy to color but can have large pathwidth - must be a way to decompose them
Tree decomposition involves making a tree of bags instead of a sequence of bags
- Every vertex must belong to a connected subtree of bags
- Every edge must have both endpoints in at least one bag
Treewidth: the smallest width of any tree decomposition, or the smallest possible max clique size (minus one) of a chordal graph that contains G
- The tree decomposition can be used in the algorithms that use path decompositions but will work better because more graphs have smaller treewidth

FPT

Fixed parameter tractable: Time is polynomial n with a fixed exponent, but may depend badly on width
Courcelle’s theorem: There exists an FPT algorithm for any existence or min-weight/max-weight optimization problem on classes of graphs that can be described as a logical formula involving vertices, edges, sets of vertices and edges, and adjacency

Flow

Flow can be used in many applications; playoff elimination, traffic engineering, etc.
Maximum flow is the question of determining the maximum flow out of a starting vertex s into an ending vertex t
- The flow into and out of a vertex should be equal (aside from the start and end vertices)
Properties of maximum flow
- Can be converted to a linear program and solved; polynomial time
- A maximum flow will always exist, even with irrational numbers
- Integrality: If all the capacities are integers, then there exists a maximum flow in which all flow amounts are integers
The max flow problem is equivalent to the minimum cut problem
- Min-cut: find the minimum sum of capacities flowing from one subset of the graph to another, where subsets are found by “cutting” the graph in the plane
- For every flow network, the maximum flow is equal to the minimum cut
- For every flow, either there is an equal cut or we can increase the flow; max flow ≥ min cut

Greedy Algorithm

Start with a flow that is not maximum (all-zero flow)
If we ever get stuck, we can find a cut with a capacity equal to the flow amount
Since max flow ≤ min cut ≤ this new cut, and our flow equals our cut, then we have found the maximum flow

Augmenting Path

Residual graph: A graph with backwards edges containing the capacities of the edge in the current graph minus the flow of the incoming vertex
While the residual graph has a path $P$ from $s$ to $t$:
- Find the width $w$ of $P$
- Increase flow along $P$ by $w$ units
- Recompute the residual graph
When the algorithm terminates, it finds a minimum cut (and a maximum flow)
- Stops when there are no residual paths from s to t
- Form a cut of the graph into two subsets $S$ and $T$
  - $S$ consists of everything that can be reached from $s$ by a residual path; $T$ is everything else

Matching

Bipartite graph: A graph where each edge has a red and blue vertex for its endpoints
- Can test for bipartiteness in linear time
Matching: A set of edges that don’t touch each other (no two matched edges share a vertex)
Independent set: A set of vertices that don’t touch each other (no two vertices share an edge)
Konig’s theorem: Let $M$ be the number of edges in a maximum matching and $I$ be the number of vertices in the maximum independent set; in bipartite graphs, $M+I=n$
A bipartite maximum matching can be written as a flow problem
- Add vertices s connected to all red vertices and t connected to all blue vertices
- Direct edges $s \rightarrow red \rightarrow blue \rightarrow t$ with capacity 1
- Find the integer maximum flow
- If the found flow is not maximum, then the augmented path will be the maximum flow; guaranteed to find the maximum matching
- Time bounds: $O(mn)$ for basic algorithm, $O(m\sqrt{n})$ for Hopcroft-Karp-Karzanov, $O(m^{1+\varepsilon})$ for any $\varepsilon > 0$ (Chen)

Complete Bipartite Matching

Complete bipartite graph: All edges that go from one independent set to another are included
- $K_{a,b}$ refers to a complete bipartite graph with $a$ vertices on one side, $b$ on the other
- Balanced means that the graph has the same number on each side: $K_{n/2,n/2}$
Perfect matching: A matching that matches every vertex
- In a balanced complete bipartite graph, every matching can be turned into a perfect matching
- Matchings can be described a permutations
Assignment problem: Find a minimum-weight maximum matching in a weighted bipartite graph
- In many applications, the graph is complete bipartite and maximum matchings are perfect matchings
- Can turn the question into: Find a minimum-weight perfect matching in a weighted bipartite graph
Hungarian algorithm: Solves assignment problem
- Simple steps: Start from an empty matching, then find a minimum-weight alternating path $n/2$ times
- The weight of an alternating path is how much it increases the weight of a matching
- Complex steps: Adjust the weight (original weight - heights of both endpoints) of each edge such that they’re all non-negative, use Dijkstra’s to find the shortest alternating paths
- Runtime: $O(nm + n^2)$ (run Dijkstra’s n/2 times)

Full algorithm:

Initialize heights to make adjusted weights ≥ 0
Repeat n/2 times:
- Add an artificial start vertex s, with edges of weight 0 to all unmatched red vertices, direct all unmatched edges red-to-blue and all matched edges blue-to-red
- Use Dijkstra’s to find the adjusted distances from s to all other vertices, including the shortest alternating path
- Adjust heights: subtract distance at red vertices, add distance at blue
- Use the shortest alternating path to increase the size of matching

Stable Matching

Input: A balanced complete bipartite graph with edges representing a preference ordering; i.e. each vertex has a preference ordering of vertices on the other side of the bipartition
- In other words: $n$ applicants and $n$ positions, where applicants and positions provide a preference ordering of each other
Output: A stable matching where no pair of vertices would both prefer to be matched to each other
- In other words: there exists no $(x, y)$ such that $x$ would prefer $y$ and $y$ would prefer $x$ as opposed to the current matching
Stable matching always exists and can be found in linear time
Various applications; match med students to residencies, high school students to universities, etc.

# Gale-Shapley algorithm

Start with an empty matching

While there are unfilled positions:
    Choose any set U of unfilled positions
    Each position in U sends an offer to its most preferred applicant (that it hasn't previously sent anything to)
    Each applicant who receives an offer from a more preferred company accepts it

We can get the Gale-Shapley algorithm to run in linear time using decorators, dictionaries, and iterators
- Each offer takes constant time because of our data structures, at most $O(m)$ offers; linear time
This algorithm will always produce the same stable matching
- The stable matching produced will be the best for positions but the worst for applicants (in terms of possible matchings and preference orderings)
- Positions cannot game the system by giving false preference orderings
Changing the constraints of the stable matching (such as with non-bipartite graphs) turns the problem into NP-hard

Planar Graphs

A planar graph is a graph that you can draw in the 2D plane such that there are no crossings
- A famous example is the three utilities problem; this is a nonplanar graph with at least one crossing
Examples of planar graphs: trees, grid graphs, convex polyhedra
Properties of Planar Graphs
- Euler’s formula: A connected planar graph with $V$ vertices, $E$ edges, and $F$ faces satisfies the equation $V-E+F=2$
  - Implies that there are few edges in a graph
  - For simple planar graphs, ${E \leq 3V - 6}$; for bipartite ones, ${E\leq 2V - 4}$
- Simple planar graphs have $\leq 3n-6$ edges and their degeneracy is $\leq 5$
  - This means that $m$ is $O(n)$
- Four-color theorem: Can be colored with $\leq 4$ colors
- Planar separator theorem: Tree width is $O(\sqrt{n})$
- Product structure theorem: Every planar graph is a subgraph of a product $P\boxtimes T$ where $P$ is a path and $T$ has treewidth $O(1)$
Kuratowski’s Theorem: A graph is planar iff it does not have a subgraph formed by $K_{3,3}$ or $K_5$ by creating big edges using existing edges
Duality; create a vertex for each face and an edge between touching faces
- If a subgraph of the primal has a cycle, then the dual will be a disconnected graph
- If the primal is a spanning tree, then the dual will also be a spanning tree
- If the primal graph is Eulerian, then the dual graph will be bipartite
- Acyclic in primal $\iff$ dual is strongly connected
- Shortest path in primal $\iff$ min cut in dual

Planarity Testing

Find the cyclical ordering of edges around each vertex
Determine whether or not the ordering follows Euler’s formula

Planarity Drawnig

According to Fary’s theorem, drawing a planar graph where vertices are given (integer) coordinates and edges are straight lines is always possible
The grid on which the graph is drawn must be at least $n/3 \times n/3$
There are various known upper bounds
- $(2n-3) \times (n-1)$ grid
- $(\frac{4n}{3}) \times (\frac{2n}{3})$ grid
- $(n-1) \times (n-1)$ grid (the one used for class)
A planar graph is maximal if you cannot add any more edges while preserving planarity
- A graph with a face with four or more vertices CANNOT be maximal; add another edge between two of the vertices in this face
  - All faces are triangles if maximal
- The number of edges must be $m = 3n-6$
It never hurts to add extra edges and turn a planar graph into a maximal planar graph; adding these extra edges takes linear time
- Can use a DCEL !
A Schnyder forest is a representation of a graph such that there are three trees rooted at the outer vertices that each contain all of the other vertices of the graph

To create a Schnyder forest:

Find a neighbor of one of the three root vertices that only shares two neighbors with the root
Merge this vertex, and repeat the last two steps until all vertices are merged
Unmerge vertices and color

Trilinear coordinates is a coordinate system of 3-tuples; each element of the tuple represents the distance from each of the three sides
- Can be converted to Cartesian coordinates by dropping the last element of the 3-tuple
To get coordinates of a planar graph, we can create trilinear coordinates for each vertex, where each element of the 3-tuple is the number of vertices in the red/blue/yellow region of the Schnyder forest
Connect the vertices with straight lines to finish the planar drawing with straight line