## Sorting and friends

#### Andy Novocin

## Daily Problem

### Design a data structure to support the following operations:

`insert(x, T)`

Insert `x`

into the set `T`

`delete(k, T)`

Delete the \( k^{\textrm{th}} \) smallest element from `T`

`member(x, T)`

Detect if `x`

is in `T`

### All operations must take \( \mathcal{O}(\log{n})\).

## Advanced Problem

### Given \(n\) and a prime \( p\) return all integers \( i \geq 0 \) such that \( \binom {n} {i} \neq 0 \mod{p} \).

#### Can you do it in \( \mathcal{O}(t + \log{n}) \) time where \( t \) is the number of such \( i \)?

#### Sample output, \( n = 5, p = 2 \) result \( [ 0, 1, 4, 5 ] \).

## New HW posted

#### Due on September 25th (a little shorter)

## First test soon

### Covering Chapters 1-4 on September 30th.

#### This is important to your grade.

## BIG picture revisited

### Skills I want you to have

#### 1) Analysis of solutions to problems

#### 2) Mastery of a large set of useful techniques

#### 3) The ability to apply those techniques to new problems

#### 4) Awareness of a large set of difficult problems

#### 5) The ability to prove that a new problem is difficult

#### 6) Methods for attacking the difficult problems anyhow

## Sorting X-treme

#### Addressing the groans:

### Calculus vs Real Analysis vs Measure and Integration

### A very efficient tool for your toolbox

### Sandbox to introduce important concepts

### Part of the canon

## Count Comparisons

#### Design a tournament for exactly 5 teams.

#### You need to rank all 5 spots (not just champion).

#### Fewest number of games gets a prize.

## For large values?

### Each comparison adds some information

### Can be lucky can be unlucky

### Binary Tree?

## Lower Bounds

### So every possible permutation must be a leaf.

### Because every possible permutation is an input.

### The shortest possible tree has \( 2^h \) leaves.

### So \( n! \leq l \leq 2^h \) thus \( h \geq \sum_{i=1}^n \log{i} \)

### \( h = \Omega(n\log{n})\)

## Calculus reasoning

### \( \sum_{i=2}^n \log{i} \geq \int_1^{n} \log{x} dx \)

## Sort Roster

#### Quicksort expected \( \mathcal{O}(n \log{n} ) \)

#### Mergesort \( \mathcal{O}(n \log{n} ) \)

#### Heapsort \( \mathcal{O}(n \log{n} ) \)

#### Insertion Sort \( \mathcal{O}(n^2) \)

## Broken with pasta?

### Spaghetti Sort

#### 1) Cut spaghetti to the length of \( n_i \)

#### 2) Grab spaghetti and rest it vertically on a table

#### 3) Lower your hand until it touches a noodle

#### 4) Select that noodle first and repeat.

## Challenge problems

**4-29** Mr. B. C. Dull claims to have developed a new data structure for priority queues that supports `Insert`

, `Maximum`

, and `Extract-Max`

all in \( \mathcal{O}(1) \) time. Prove that he is mistaken.

#### What is the smallest possible depth of a leaf in a decision tree for a comparison sort?

## Priority Queue Search

### Selection Sort is `n`

applications of:

`Find-Minimum`

`Delete-Minimum`

#### We had it as \( \mathcal{O}(n^2) \) on an array.

#### A Priority Queue implemented with Balanced Binary Search Tree yields: \( \mathcal{O}(n \log{n} ) \)

#### Data structure matters!

## Heaps

### Not as good as Balanced Binary Search Trees, but simpler, and ideal for priority queues.

### A binary tree with the heap property: Each parent dominates its children.

### Also if we pack into an array then we save the pointer space (and lookup times)

## Storing method

### The \( 2^{l} \) nodes of the \( l^{\textrm{th}} \) row are packed into positions: \( 2^{l - 1}, \cdots, 2^{l} - 1 \)

### So the node at position \( x \) has left child at position \( 2x \) and right child at position \( 2x + 1 \)

### Reversed this means the parent of node \( x \) is at position \( \lceil x/2 \rceil \).

#### e.g. Packed in this way the height 3 balanced BST (not a heap) is packed as \( \{4, 2, 6, 1, 3, 5, 7 \} \)

## Insert

### To maintain a happy heap we must be packed and dominated

### Packed happens by inserting to the last spot

### For domination, swap with your pop, until things are right.

#### This is called `bubble_up`

### What is the complexity?

## Heap insert (also sort) example

## Delete-Min/Max

### Packed and Dominated

#### For packing we move the last child into the first spot

#### Now we have to swap down with the dominant child, until things are right.

#### This is called `bubble_down`

### What is the complexity?

## HEAPSORT

### Load the heap (for now \( \mathcal{O}(n \log{n} ) \) ).

### Call extract-min/max \( n \) times (each \( \mathcal{O}(\log{n}) \))

#### Cost is \( \mathcal{O}(n \log{n} ) \)

## Even faster loading

### Needs a board drawing but:

### Start at the back of an array and call `bubble-down`

### Creates many mini-heaps but their heights are smaller

#### \( \sum_{h=0}^{\lfloor \log{n} \rfloor} \lceil n / 2^{h+1} \rceil h \leq n \sum h/2^h \leq 2n \)

## Challenge 1

### Devise an algorithm for computing the \( k \) smallest elements from a set of \( n \) integers in \( \mathcal{O}(n + k \log{n} ) \) time.

## Challenge 2

### Where can the minimum element be in a max-heap?

## Challenge 3

### Build a heap from \( \{ 5, 3, 17, 10, 84, 19, 6, 22, 9 \} \).