Sorting and friends
Andy Novocin
Daily Problem
Design a data structure to support the following operations:
insert(x, T)
Insert x
into the set T
delete(k, T)
Delete the \( k^{\textrm{th}} \) smallest element from T
member(x, T)
Detect if x
is in T
All operations must take \( \mathcal{O}(\log{n})\).
Advanced Problem
Given \(n\) and a prime \( p\) return all integers \( i \geq 0 \) such that \( \binom {n} {i} \neq 0 \mod{p} \).
Can you do it in \( \mathcal{O}(t + \log{n}) \) time where \( t \) is the number of such \( i \)?
Sample output, \( n = 5, p = 2 \) result \( [ 0, 1, 4, 5 ] \).
New HW posted
Due on September 25th (a little shorter)
First test soon
Covering Chapters 1-4 on September 30th.
This is important to your grade.
BIG picture revisited
Skills I want you to have
1) Analysis of solutions to problems
2) Mastery of a large set of useful techniques
3) The ability to apply those techniques to new problems
4) Awareness of a large set of difficult problems
5) The ability to prove that a new problem is difficult
6) Methods for attacking the difficult problems anyhow
Sorting X-treme
Addressing the groans:
Calculus vs Real Analysis vs Measure and Integration
A very efficient tool for your toolbox
Sandbox to introduce important concepts
Part of the canon
Count Comparisons
Design a tournament for exactly 5 teams.
You need to rank all 5 spots (not just champion).
Fewest number of games gets a prize.
For large values?
Each comparison adds some information
Can be lucky can be unlucky
Binary Tree?
Lower Bounds
So every possible permutation must be a leaf.
Because every possible permutation is an input.
The shortest possible tree has \( 2^h \) leaves.
So \( n! \leq l \leq 2^h \) thus \( h \geq \sum_{i=1}^n \log{i} \)
\( h = \Omega(n\log{n})\)
Calculus reasoning
\( \sum_{i=2}^n \log{i} \geq \int_1^{n} \log{x} dx \)
Sort Roster
Quicksort expected \( \mathcal{O}(n \log{n} ) \)
Mergesort \( \mathcal{O}(n \log{n} ) \)
Heapsort \( \mathcal{O}(n \log{n} ) \)
Insertion Sort \( \mathcal{O}(n^2) \)
Broken with pasta?
Spaghetti Sort
1) Cut spaghetti to the length of \( n_i \)
2) Grab spaghetti and rest it vertically on a table
3) Lower your hand until it touches a noodle
4) Select that noodle first and repeat.
Challenge problems
4-29 Mr. B. C. Dull claims to have developed a new data structure for priority queues that supports Insert
, Maximum
, and Extract-Max
all in \( \mathcal{O}(1) \) time. Prove that he is mistaken.
What is the smallest possible depth of a leaf in a decision tree for a comparison sort?
Priority Queue Search
Selection Sort is n
applications of:
Find-Minimum
Delete-Minimum
We had it as \( \mathcal{O}(n^2) \) on an array.
A Priority Queue implemented with Balanced Binary Search Tree yields: \( \mathcal{O}(n \log{n} ) \)
Data structure matters!
Heaps
Not as good as Balanced Binary Search Trees, but simpler, and ideal for priority queues.
A binary tree with the heap property: Each parent dominates its children.
Also if we pack into an array then we save the pointer space (and lookup times)
Storing method
The \( 2^{l} \) nodes of the \( l^{\textrm{th}} \) row are packed into positions: \( 2^{l - 1}, \cdots, 2^{l} - 1 \)
So the node at position \( x \) has left child at position \( 2x \) and right child at position \( 2x + 1 \)
Reversed this means the parent of node \( x \) is at position \( \lceil x/2 \rceil \).
e.g. Packed in this way the height 3 balanced BST (not a heap) is packed as \( \{4, 2, 6, 1, 3, 5, 7 \} \)
Insert
To maintain a happy heap we must be packed and dominated
Packed happens by inserting to the last spot
For domination, swap with your pop, until things are right.
This is called bubble_up
What is the complexity?
Heap insert (also sort) example
Delete-Min/Max
Packed and Dominated
For packing we move the last child into the first spot
Now we have to swap down with the dominant child, until things are right.
This is called bubble_down
What is the complexity?
HEAPSORT
Load the heap (for now \( \mathcal{O}(n \log{n} ) \) ).
Call extract-min/max \( n \) times (each \( \mathcal{O}(\log{n}) \))
Cost is \( \mathcal{O}(n \log{n} ) \)
Even faster loading
Needs a board drawing but:
Start at the back of an array and call bubble-down
Creates many mini-heaps but their heights are smaller
\( \sum_{h=0}^{\lfloor \log{n} \rfloor} \lceil n / 2^{h+1} \rceil h \leq n \sum h/2^h \leq 2n \)
Challenge 1
Devise an algorithm for computing the \( k \) smallest elements from a set of \( n \) integers in \( \mathcal{O}(n + k \log{n} ) \) time.
Challenge 2
Where can the minimum element be in a max-heap?
Challenge 3
Build a heap from \( \{ 5, 3, 17, 10, 84, 19, 6, 22, 9 \} \).