Analyzing Algorithms part 1
Andy Novocin
Daily Problem
The set cover problem: given a collection of subsets \( S = \{S_1, \ldots, S_m\} \), each \( S_i \subseteq U = \{1, 2, \ldots, n\} \), find the smallest sub-collection \( T \subseteq S \) whose union is \( U \).
Find a counter-example to this greedy algorithm: select the largest subset first, then delete its elements from \( U \). Repeat, each time selecting the subset that covers the most still-uncovered elements, until everything is covered (a minimal sketch of this rule follows below).
Can you devise an algorithm which will correctly terminate?
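A minimal sketch of this greedy rule in Python (the name greedy_cover and the set-of-sets input format are illustrative assumptions, not part of the problem statement):

def greedy_cover(universe, subsets):
    """Greedy heuristic: repeatedly take the subset covering the most
    still-uncovered elements; returns the chosen subsets, which need not
    form a minimum cover."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick the subset with the most still-uncovered elements
        best = max(subsets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= best
    return chosen

# Illustrative usage on a tiny instance
U = {1, 2, 3, 4, 5}
S = [{1, 2, 3}, {2, 4}, {4, 5}]
print(greedy_cover(U, S))  # [{1, 2, 3}, {4, 5}]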
big-Oh Face
Algorithms are ideas, right?
How do I judge them?
Correctness, Efficiency, Simplicity
A Dumb Example:
for i in range(n):
    print(i*i + i)
print(n + n)
How long did that take?
A function of n
\( n \) multiplications of numbers \( \leq n \)
\( n \) additions of numbers \( \leq n \)
one additional addition of size \( n \)
\( n+1 \) calls to print
???
\( \sum_{i=1}^n 3\log_{2^{53}}{i} + \log_{2^{53}}{n} + n \cdot \mathcal{C}(\textrm{print}) \)
Even a simple bit of code is difficult to analyze fully
What if there is no specific implementation?
- gcd(\(a, b\)):
  - if \( a < b \) return gcd(\(b, a\))
  - \( r := a \bmod b \)
  - if \( r \neq 0 \) return gcd(\( b, r \))
  - else return \( b \)
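One possible Python rendering of that pseudocode, purely as a sketch (assumes positive integer inputs):

def gcd(a, b):
    """Euclidean algorithm, mirroring the pseudocode above."""
    if a < b:                # keep the larger argument first
        return gcd(b, a)
    r = a % b                # one mod operation
    if r != 0:
        return gcd(b, r)     # recurse on the remainder
    return b

print(gcd(1071, 462))  # 21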
We need a model
Computational Model: RAM
Simple operations take 1 step
+,-,*,==,if,call
Memory is unlimited
Any memory address can be accessed in 1 step
Recursive calls and loops (for, while) are not simple and require care
RAM model allows us to count operations on a theoretical machine
It misses many small adjustments
But like Newtonian mechanics it's useful!
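As an illustration of counting on that theoretical machine, here is a sketch that tallies the simple operations in the earlier loop; the per-line charges (1 step each for the loop test, the multiplication, the addition, and the print) are one reasonable assignment, not the only one:

def ram_cost(n):
    """RAM-model step count for:
         for i in range(n): print(i*i + i)
         print(n + n)
    """
    steps = 0
    for _ in range(n):
        steps += 1   # loop test / increment
        steps += 1   # i * i
        steps += 1   # ... + i
        steps += 1   # print
    steps += 1       # n + n
    steps += 1       # final print
    return steps

print(ram_cost(10))  # 4*10 + 2 = 42 steps under this charging scheme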
GCD (Euclidean algorithm)
mod and if are simple, so each costs 1
repeated 1-3 times per recursive call
How many recursive calls can there be?
Best, Average, Worst cases
Best case for gcd is \( a = x, b = x \): the cost is 3
Average case: well, that depends; an IDEONE experiment shows about 18.1799 on average for random 32-bit integers
Worst case is tricky, but it is bounded by the Fibonacci numbers (inductive proof)
Approximately 3 operations per digit (proof on the board, if demanded)
In general, which number is most meaningful?
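A sketch of the kind of experiment referenced above; the average will vary with the random model (the 18.1799 figure is from the original IDEONE run, not reproduced here), and consecutive Fibonacci numbers are the classic worst-case inputs:

import random

def gcd_calls(a, b, count=1):
    """Number of recursive calls made by the gcd sketch above."""
    if a < b:
        return gcd_calls(b, a, count + 1)
    r = a % b
    if r != 0:
        return gcd_calls(b, r, count + 1)
    return count

# Average over random 32-bit pairs (illustrative only)
pairs = [(random.getrandbits(32) + 1, random.getrandbits(32) + 1)
         for _ in range(10_000)]
print(sum(gcd_calls(a, b) for a, b in pairs) / len(pairs))

# Consecutive Fibonacci numbers, F(45) and F(44), force the deepest recursion
print(gcd_calls(1134903170, 701408733))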
That's nice but how does the \( \mathcal{O}() \) come into the picture?
Orthogonal to the complexity model, but helpful in thinking it out
It is a way to compare any functions on large inputs
The algorithmic world needs to compare efficiencies
RAM lets us model the cost of an algorithm as a function
big-Oh is used to compare those functions
big-Oh in action
\( 2n^3 + 2013n^2 + 2 = \mathcal{O}(n^3) \)
\( 2n^3 - 2013n^2 + 2 = \mathcal{O}(n^3) \)
\( \sum_{i=0}^{4n} 1519 = \mathcal{O}(n) \)
\( \sum_{i=0}^{4n} 1519i = \mathcal{O}(n^2) \)
\( 12 \cdot 2^{x} \cdot 2^{-x} = \mathcal{O}(1) \)
\( 12 \cdot 2^{x} \cdot x^{2} = \mathcal{O}(x^2 \cdot 2^{x}) \)
Thinking in \(\mathcal{O}\)
Follow the largest term
Ignore constant multipliers
Adding two terms keeps only the dominant one: \( \mathcal{O}(f) + \mathcal{O}(g) = \mathcal{O}(\max(f, g)) \) (be careful)
Multiplying two terms multiplies the bounds: \( \mathcal{O}(f) \cdot \mathcal{O}(g) = \mathcal{O}(f \cdot g) \) (be careful)
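A quick numeric sketch of "follow the largest term, ignore constant multipliers": the ratio of the first example above to its bound \( n^3 \) settles near the leading constant 2.

# f(n) = 2n^3 + 2013n^2 + 2 versus n^3: lower-order terms fade as n grows
for n in [10, 100, 1000, 10_000, 100_000]:
    f = 2 * n**3 + 2013 * n**2 + 2
    print(n, f / n**3)   # ratio approaches 2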
Big-Oh is \(\mathcal{O}()\) many things
Computer scientists are lazy (it's easy to work with)
Computer scientists are transcendent (future \( \geq \) today)
Follow the definition
$$ \mathcal{O}(g(x)) = \{ f(x) \textrm{ s.t. } \exists c, n \textrm{ with } f(x) \leq c \cdot g(x) ~\forall ~x \geq n \} $$
In words: \( \mathcal{O}(g(x)) \) is the set of all functions which are eventually smaller than a constant multiple of \( g(x) \)
\( f(x) = \mathcal{O}(g(x)) \) actually means \( f \in \mathcal{O}(g(x)) \)
Sample Analysis
\( f(n) = 2n^2 - 20n + 4 \)
\(f(n) = \mathcal{O}(n^2) \) because \( 2n^2 \geq 2n^2 - 20n + 4 \) for all \( n \geq 1 \)
\(f(n) = \mathcal{O}(n^3) \) because \( \frac{1}{100} n^3 \geq f(n) \) once \( n \geq 200 \)
\(f(n) \neq \mathcal{O}(n) \) because for any \( c \), once \( n > \max(c, 20) \) we have \( c \cdot n < n^2 \leq f(n) \)
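A small numeric sanity check of the constants claimed above, as a sketch over a finite range (it does not replace the inequalities themselves):

def f(n):
    return 2 * n**2 - 20 * n + 4

# c = 2 with g(n) = n^2 works from n = 1 onward
assert all(f(n) <= 2 * n**2 for n in range(1, 10_000))

# c = 1/100 with g(n) = n^3 works once n >= 200
assert all(f(n) <= n**3 / 100 for n in range(200, 10_000))

print("bounds hold on the tested range")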
The comparison family
- For some constant(s) \( C > 0 \) and all large enough \( x \)
- \( f(x) = \omega(g(x)) \) when \( \lim_{x \to \infty} \frac{g(x)}{f(x)} = 0 \)
- \( f(x) = \Omega(g(x)) \) when \( f(x) \geq C \cdot g(x) \)
- \( f(x) = \Theta(g(x)) \) when \( f(x) \geq C_1 \cdot g(x) \) and \( f(x) \leq C_2 \cdot g(x) \)
- \( f(x) = \mathcal{O}(g(x)) \) when \( f(x) \leq C \cdot g(x) \)
- \( f(x) = o(g(x)) \) when \( \lim_{x \to \infty} \frac{f(x)}{g(x)} = 0 \)
Find your partner, stretch, and answer this:
Which of \( \omega, \Omega, \Theta, \mathcal{O}, o \) describe the relationship between \( 2^x \) and \( 3^x \)?
Put it all together
Insertion Sort:
for (i = 1; i < n; i++) {
    j = i;
    while ((j > 0) && (s[j] < s[j-1])) {
        swap(s[j], s[j-1]);
        j = j - 1;
    }
}
Count the worst-case swaps.
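A sketch that counts the swaps on a reverse-sorted input, in Python rather than the C-style pseudocode above (the helper name is illustrative):

def insertion_sort_swaps(s):
    """Insertion sort as above, returning the number of swaps performed."""
    swaps = 0
    for i in range(1, len(s)):
        j = i
        while j > 0 and s[j] < s[j - 1]:
            s[j], s[j - 1] = s[j - 1], s[j]   # the swap
            swaps += 1
            j -= 1
    return swaps

n = 10
print(insertion_sort_swaps(list(range(n, 0, -1))))  # 45 = n*(n-1)/2

On a reverse-sorted input every element is swapped past all of its predecessors, which is where the \( n(n-1)/2 \) count comes from.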
Project Euler Problem 4
A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.
Find the largest palindrome made from the product of two n-digit numbers.
What is the worst-case complexity of the brute-force solution?
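One possible brute-force sketch in Python (function name illustrative): try every pair of n-digit factors and keep the largest palindromic product.

def largest_palindrome_product(n):
    """Brute force over all pairs of n-digit factors."""
    lo, hi = 10 ** (n - 1), 10 ** n - 1
    best = 0
    for a in range(lo, hi + 1):
        for b in range(a, hi + 1):         # b >= a avoids repeating pairs
            p = a * b
            s = str(p)
            if s == s[::-1] and p > best:  # palindrome test on the digits
                best = p
    return best

print(largest_palindrome_product(2))  # 9009, as in the problem statement

Note that the nested loops range over roughly \( 10^n \) values each, so the pair count alone grows like \( 10^{2n} \).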