Projects in Data Structures and Algorithms
COMPSCI 165
Sorts
Shell Sort
- A generalized version of insertion sort
- Takes a parameter g, where g is the gap between compared elements
- For example: if g = 5, then elements five positions apart are sorted relative to one another (see the sketch after this list)
- Insertion sort is the case where g = 1
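- A minimal Python sketch of the gapped pass described above (the function names and example gap sequence are illustrative, not from the notes); a full Shell sort repeats the pass over a shrinking gap sequence that ends with g = 1:
def gap_insertion_sort(arr, g):
    # Insertion sort restricted to elements that are g positions apart
    for i in range(g, len(arr)):
        key = arr[i]
        j = i
        while j >= g and arr[j - g] > key:
            arr[j] = arr[j - g]
            j -= g
        arr[j] = key

def shell_sort(arr, gaps=(5, 3, 1)):
    # The final pass with g = 1 is a plain insertion sort on a nearly sorted array
    for g in gaps:
        gap_insertion_sort(arr, g)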
Hybrid Sort
- Same as merge sort, but once the size of the split arrays becomes small enough, run insertion sort
- Motivation: insertion sort works best on small lists, and the merge paradigm pays off less when there are few elements (see the sketch below)
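- A minimal sketch of the idea (the cutoff of 16 and the function names are illustrative choices, not from the notes):
def insertion_sort_range(arr, lo, hi):
    # Sort arr[lo:hi] in place with insertion sort
    for i in range(lo + 1, hi):
        key = arr[i]
        j = i
        while j > lo and arr[j - 1] > key:
            arr[j] = arr[j - 1]
            j -= 1
        arr[j] = key

def hybrid_sort(arr, lo=0, hi=None, cutoff=16):
    if hi is None:
        hi = len(arr)
    if hi - lo <= cutoff:
        insertion_sort_range(arr, lo, hi)  # small subarray: fall back to insertion sort
        return
    mid = (lo + hi) // 2
    hybrid_sort(arr, lo, mid, cutoff)
    hybrid_sort(arr, mid, hi, cutoff)
    # Standard merge of the two sorted halves back into arr[lo:hi]
    merged, i, j = [], lo, mid
    while i < mid and j < hi:
        if arr[i] <= arr[j]:
            merged.append(arr[i]); i += 1
        else:
            merged.append(arr[j]); j += 1
    merged.extend(arr[i:mid])
    merged.extend(arr[j:hi])
    arr[lo:hi] = merged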
Randomness
- Donald Knuth’s definition of randomness: Each element is chosen randomly and independently from a distribution
- Various sources of randomness
- Radioactive decay
- Radio noise
- Hardware noise
- If even one of these sources is truly random, XOR'ing it with other, independent (possibly non-random) sources still produces a truly random outcome
Skew Correction
- Method of creating a random number
- Choose two bits, discard if they are identical, use the first if they are not
- Analogous to flipping a coin repeatedly
- Computationally intensive
- Keep generating random numbers until they fall into your desired range
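- A minimal sketch of the bit-pairing rule above; biased_bit() stands in for some biased physical source and is simulated here with Python's random module:
import random

def biased_bit(p=0.7):
    # Stand-in for a biased source: returns 1 with probability p
    return 1 if random.random() < p else 0

def unbiased_bit():
    # Draw bits in pairs; discard equal pairs, otherwise keep the first bit
    while True:
        a, b = biased_bit(), biased_bit()
        if a != b:
            return a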
Pseudorandom Number Generators
- Mimics the properties of randomness without the computational load
- Important property: the output should not be determinable from previous outputs
- Known as being cryptographically strong
- Hash functions such as SHA-1 and MD5 were designed to be cryptographically strong, though both are now considered broken
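- A toy sketch of building a generator from a hash function by hashing a secret seed together with a counter (SHA-256 is used here as the example hash; this is only an illustration, not a vetted construction to use in practice):
import hashlib

def hash_prng(seed: bytes, n_blocks: int):
    # Output block i is SHA-256(seed || i); yields 32-byte blocks
    for i in range(n_blocks):
        yield hashlib.sha256(seed + i.to_bytes(8, "big")).digest()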
Linear Congruent Generator (LCG)
- Parameters
- x_0: starting term (the seed)
- P1: coefficient
- P2: intercept
- N: modulo term
- Equation: x_{n+1} = (P1 * x_n + P2) mod N
- Flawed: the sequence eventually repeats (the period is at most N), and future outputs are easy to predict from past ones, so an LCG is not cryptographically strong
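- A minimal sketch of the recurrence using the parameter names above:
def lcg(x0, p1, p2, n_mod):
    # Yields x_1, x_2, ... where x_{k+1} = (P1 * x_k + P2) mod N
    x = x0
    while True:
        x = (p1 * x + p2) % n_mod
        yield x
- For example, with x_0 = 1, P1 = 5, P2 = 3, N = 16 the sequence begins 8, 11, 10, 5, 12, ... and must eventually repeat because there are only N possible values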
Generating Random Sequences
Radix Sort
- Assign each element in the list a large random number as a key (see the sketch below)
- Sorting the elements by these keys (e.g., with radix sort) produces a random permutation
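- A minimal sketch of the idea, using Python's built-in sort in place of an explicit radix sort:
import random

def random_permutation_by_sorting(items):
    # Pair each element with a large random key, sort by key, then drop the keys
    keyed = [(random.getrandbits(64), x) for x in items]
    keyed.sort(key=lambda pair: pair[0])
    return [x for _, x in keyed]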
Fisher Yates
- O(n) algorithm to randomly permute an array
import random

def fisher_yates(arr):
    for k in range(len(arr) - 1, 0, -1):  # k = n - 1 down to 1
        j = random.randrange(k + 1)       # uniform random index in [0, k]
        arr[k], arr[j] = arr[j], arr[k]   # swap arr[k] and arr[j]
Bin Packing
- Problem: pack items of weights 0 <= w_i <= 1 into bins with a capacity of 1 and find the minimum number of bins needed
- NP-hard problem
- Different from knapsack: the goal is to minimize the number of bins rather than to maximize value
- Metrics exist to determine how good a solution can be
- Approximation Ratio (AR): Approximated Solution / Optimal Solution
- Waste: Number of Bins Used - Sum of Item Weights
Next-Fit (NF)
- Check whether the item fits in the currently open bin
- If it does, place it there; if not, close the current bin and open a new one (see the sketch after this list)
- AR will be no worse than 2
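- A minimal sketch of Next-Fit, returning the bins as lists of item weights:
def next_fit(weights, capacity=1.0):
    bins, remaining = [], 0.0
    for w in weights:
        if not bins or w > remaining:  # doesn't fit in the open bin: open a new one
            bins.append([])
            remaining = capacity
        bins[-1].append(w)
        remaining -= w
    return bins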
First-Fit (FF)
- Scan bins in order and place the new item in the first (leftmost) bin that can hold it
- Create a new bin if it doesn’t fit anywhere
- Can use a balanced binary tree to store bins to get an O(n log n) run time for the algorithm
- Key the tree by bin index, and augment each node with the maximum remaining capacity in its subtree so the search knows whether to go left or right
- Uses no more than the ceiling of 1.7m bins, where m is the optimal number of bins (AR roughly 1.7)
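- A minimal sketch of First-Fit using a plain linear scan (worst-case O(n^2); the augmented tree described above is what brings the total down to O(n log n)):
def first_fit(weights, capacity=1.0):
    bins = []  # bins[i] = remaining capacity of bin i
    for w in weights:
        for i, remaining in enumerate(bins):
            if w <= remaining:  # leftmost bin that can hold the item
                bins[i] -= w
                break
        else:
            bins.append(capacity - w)  # nothing fits: open a new bin
    return len(bins)  # number of bins used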
Best-Fit (BF)
- Same as FF, except instead of the leftmost bin, choose the bin with the least remaining capacity that still fits the item
- Keep bins ordered by remaining capacity and binary search for the tightest fit; O(n log n) overall
- After adding an item to a bin, you can delete the node and re-insert it with its new capacity
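- A minimal sketch of the Best-Fit rule using a sorted Python list with bisect (list insertion and removal are linear, so this shows the selection rule rather than the O(n log n) bound, which needs a balanced tree):
import bisect

def best_fit(weights, capacity=1.0):
    remaining = []  # sorted remaining capacities, one entry per open bin
    for w in weights:
        i = bisect.bisect_left(remaining, w)  # tightest bin that still fits w
        if i == len(remaining):
            new_remaining = capacity - w  # nothing fits: open a new bin
        else:
            new_remaining = remaining.pop(i) - w  # delete the chosen bin's node
        bisect.insort(remaining, new_remaining)  # re-insert it with its new capacity
    return len(remaining)  # number of bins used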
FFD and BFD
- Sort the items from largest to smallest first, then run FF or BF, respectively
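- A minimal sketch, reusing the first_fit and best_fit sketches above:
def first_fit_decreasing(weights, capacity=1.0):
    # Sort items from largest to smallest, then run First-Fit
    return first_fit(sorted(weights, reverse=True), capacity)

def best_fit_decreasing(weights, capacity=1.0):
    # Sort items from largest to smallest, then run Best-Fit
    return best_fit(sorted(weights, reverse=True), capacity)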
Algorithms and Architectures
- Describing an algorithm precisely requires a computational model as a foundation
Turing Machine
- Mathematical model of computation
- Any function computable in polynomial time (in a reasonable model of computation) has a polynomial-time algorithm on a Turing machine
- Any computational model or system that can perform any computation a Turing machine can is known as Turing-complete
- The ENIAC (the first general-purpose electronic digital computer) was Turing-complete
Von Neumann Architecture
- A model consisting of a control unit, an arithmetic/logic unit (ALU), and a memory unit that stores both data and the instructions to be executed
- Built upon by the Random Access Machine (RAM)
- High-level programming languages were built for the RAM and Von Neumann machines
- These languages consisted of arithmetic and logical operators, conditionals, loops, subroutines, and memory access
- Word RAM Model: Computational model in which a RAM can do arithmetic and bitwise operations on a word of w bits in constant time
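- For example, a test like the one below takes a constant number of word operations in this model (Python sketch; Python integers are arbitrary precision, so it only illustrates the operations the model counts):
def is_power_of_two(x: int) -> bool:
    # One comparison, one subtraction, one AND, one comparison: O(1) word operations
    return x != 0 and (x & (x - 1)) == 0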
Memory Hierarchy
- Defined by a trade-off of size vs. speed
- The closer to the core/CPU, the faster lookup is, but the less you can store
External Memory Model
- Identifies the frontier between the two highest layers in the memory hierarchy
- B = block size
- M = “internal” memory size
- N = “external” input size
- Can use a B-tree to maintain a map in external memory
- An adapted version of an (a,b)-tree
Parallel Random Access Machine (PRAM)
- Performs operations synchronously over p processors
- Exclusive Read (ER): p processors can simultaneously read the content of p distinct memory locations
- Concurrent Read (CR): p processors can simultaneously read the content of p' memory locations where p' < p (i.e., some locations are read by more than one processor)
- Exclusive Write (EW): p processors can simultaneously write the content of p distinct memory locations
- Concurrent Write (CW): p processors can simultaneously write the content of p' memory locations where p' < p (i.e., some locations are written by more than one processor)