Sorting and friends

Andy Novocin

Daily Problem

Design a data structure to support the following operations:

insert(x, T) Insert x into the set T

delete(k, T) Delete the \( k^{\textrm{th}} \) smallest element from T

member(x, T) Detect if x is in T

All operations must take \( \mathcal{O}(\log{n})\).

Advanced Problem

Given \(n\) and a prime \( p\) return all integers \( i \geq 0 \) such that \( \binom {n} {i} \neq 0 \mod{p} \).

Can you do it in \( \mathcal{O}(t + \log{n}) \) time where \( t \) is the number of such \( i \)?

Sample output, \( n = 5, p = 2 \) result \( [ 0, 1, 4, 5 ] \).

New HW posted

Due on September 25th (a little shorter)

First test soon

Covering Chapters 1-4 on September 30th.

This is important to your grade.

BIG picture revisited

Skills I want you to have

1) Analysis of solutions to problems

2) Mastery of a large set of useful techniques

3) The ability to apply those techniques to new problems

4) Awareness of a large set of difficult problems

5) The ability to prove that a new problem is difficult

6) Methods for attacking the difficult problems anyhow

Sorting X-treme

Addressing the groans:

Calculus vs Real Analysis vs Measure and Integration

A very efficient tool for your toolbox

Sandbox to introduce important concepts

Part of the canon

Count Comparisons

Design a tournament for exactly 5 teams.

You need to rank all 5 spots (not just champion).

Fewest number of games gets a prize.

Sample on 3

For large values?

Each comparison adds some information

Can be lucky can be unlucky

Binary Tree?

Lower Bounds

So every possible permutation must be a leaf.

Because every possible permutation is an input.

The shortest possible tree has \( 2^h \) leaves.

So \( n! \leq l \leq 2^h \) thus \( h \geq \sum_{i=1}^n \log{i} \)

\( h = \Omega(n\log{n})\)

Calculus reasoning

\( \sum_{i=2}^n \log{i} \geq \int_1^{n} \log{x} dx \)

Sort Roster

Quicksort expected \( \mathcal{O}(n \log{n} ) \)

Mergesort \( \mathcal{O}(n \log{n} ) \)

Heapsort \( \mathcal{O}(n \log{n} ) \)

Insertion Sort \( \mathcal{O}(n^2) \)

Broken with pasta?

Spaghetti Sort

1) Cut spaghetti to the length of \( n_i \)

2) Grab spaghetti and rest it vertically on a table

3) Lower your hand until it touches a noodle

4) Select that noodle first and repeat.

Challenge problems

4-29 Mr. B. C. Dull claims to have developed a new data structure for priority queues that supports Insert, Maximum, and Extract-Max all in \( \mathcal{O}(1) \) time. Prove that he is mistaken.

What is the smallest possible depth of a leaf in a decision tree for a comparison sort?

Priority Queue Search

Selection Sort is n applications of:



We had it as \( \mathcal{O}(n^2) \) on an array.

A Priority Queue implemented with Balanced Binary Search Tree yields: \( \mathcal{O}(n \log{n} ) \)

Data structure matters!


Not as good as Balanced Binary Search Trees, but simpler, and ideal for priority queues.

A binary tree with the heap property: Each parent dominates its children.

Also if we pack into an array then we save the pointer space (and lookup times)

Storing method

The \( 2^{l} \) nodes of the \( l^{\textrm{th}} \) row are packed into positions: \( 2^{l - 1}, \cdots, 2^{l} - 1 \)

So the node at position \( x \) has left child at position \( 2x \) and right child at position \( 2x + 1 \)

Reversed this means the parent of node \( x \) is at position \( \lceil x/2 \rceil \).

e.g. Packed in this way the height 3 balanced BST (not a heap) is packed as \( \{4, 2, 6, 1, 3, 5, 7 \} \)


To maintain a happy heap we must be packed and dominated

Packed happens by inserting to the last spot

For domination, swap with your pop, until things are right.

This is called bubble_up

What is the complexity?

Heap insert (also sort) example


Packed and Dominated

For packing we move the last child into the first spot

Now we have to swap down with the dominant child, until things are right.

This is called bubble_down

What is the complexity?


Load the heap (for now \( \mathcal{O}(n \log{n} ) \) ).

Call extract-min/max \( n \) times (each \( \mathcal{O}(\log{n}) \))

Cost is \( \mathcal{O}(n \log{n} ) \)

Even faster loading

Needs a board drawing but:

Start at the back of an array and call bubble-down

Creates many mini-heaps but their heights are smaller

\( \sum_{h=0}^{\lfloor \log{n} \rfloor} \lceil n / 2^{h+1} \rceil h \leq n \sum h/2^h \leq 2n \)

Challenge 1

Devise an algorithm for computing the \( k \) smallest elements from a set of \( n \) integers in \( \mathcal{O}(n + k \log{n} ) \) time.

Challenge 2

Where can the minimum element be in a max-heap?

Challenge 3

Build a heap from \( \{ 5, 3, 17, 10, 84, 19, 6, 22, 9 \} \).