Proving Hardness

Andy Novocin

Daily Challenge


The longest common substring (not subsequence) of two strings X and Y is the longest string that appears as a run of consecutive letters in both strings. For example, the longest common substring of photograph and tomography is ograph.

a) Let \(n = |X|\) and \(m = |Y|\). Give a \(\Theta(nm)\) dynamic programming algorithm for longest common substring based on the longest common subsequence/edit distance algorithm.

b) Give a simple \(\Theta(nm)\) algorithm that does not rely on dynamic programming.

Interview Question Cont.


You are given an array of \(n\) numbers, each of which may be positive, negative, or zero. Give an efficient algorithm to identify the index positions of \(i\) and \(j\) to maximize the sum of the \(i^{\textrm{th}}\) through \(j^{\textrm{th}}\) numbers.

ACM Variant


Given an array of 256 elements where each entry is -1 or 1 with equal probability. What is the expected value of the sum of a maximal contiguous sub-array?

Showing Hardness

Worst Case

Best Possible Case

Best Realistic Case

Learn to demonstrate


Even for your own sake, saving time, or just plain knowing when you are chasing windmills.

The role of NP


Using Non-determinism in your analysis is stupidly powerful.


Calling a problem solved if I can just verify a guess quickly...

Playing the lotto == Winning the lotto

NP exists


The reason we have NP as a class, is so that we can corale a large set of "hard" problems.


If we can "show" (\(P \neq NP\)) that we need a stupidly powerful assumption, then we've shown hardness.

Since P vs NP


Because we don't know if P is NP, we have to settle for comparing hardness.

Ghosts of Algorithm Students Past


NP reductions is the hardest part of this course

Abstract Nonsense!

Students LIKE solving a problem instance

Students do NOT LIKE transmogrifying (Calvin and Hobbes) all possible problem instances without solving the problem.

KEY CONCEPT


If every example of problem A can be MEANINGFULLY transformed into an example of problem B then problem B is at least as hard as problem A.


$$ A \leq B$$

First simple example


GCD \(\geq\) LCM


Someone gives me \(a,b\) and asks for Least Common Multiple.


I multiply \(a \cdot b\), divide by \( \gcd(a,b) \), and return the result.

LCM \(\geq\) GCD


Since \(a \cdot b = \gcd(a,b) \cdot \textrm{lcm}(a,b)\) we can reverse the trick.

Another "Easy to Easy" reduction


Sorting \(\leq\) Convex Hull

Convex Hull Example

To sort with convex hull do this:


Map each value in your input, \(x_i\) to the 2-D point \((x_i, x_i^2)\).


Walk counter clock-wise around your hull.

So when does this help?


Suppose problem X has unknown difficulty.


HARD \(\leq \) X shows that X is tough (if X can be used to solve problem HARD)


X \(\leq \) EASY shows that X is simple (an EASY problem can solve X)

When does it NOT help


X \(\leq\) HARD (a hard problem is at least as hard as X, not useful)


EASY \(\leq\) X (X can be used to solve an EASY problem)


If your reduction (the transformation steps) has too high of a complexity.

Board-Work


I would like to switch to chalkboard for the rest of the lecture. This is because working through the reductions is best done interactively. For those cramming or following along at home I will list the reductions that I intend to do. I would also recommend watching YouTube videos alongside the book-proofs. I find this stuff much easier to process from a human than from a book (that's just me).

SAT to 3SAT

2SAT to Strongly Connected Components

9-15 in the book

Hamiltonian Circuit to Traveling Saleman Problem

3-colorability to SAT

2-colorability to SAT (useless)

Vertex Cover equivalent to Independent Set equivalent to Clique

3SAT to 3-D Matching

3SAT to Vertex Cover

A hint at Vertex Cover to Hamiltonian Circuit

Neither P nor NP


Integer Factorization


Graph Isomorphism


Discrete Log

Inteview Question (SO/IRL)


Given an \( n \times n\) matrix of real numbers. You are allowed to erase any number (from 0 to \(n\)) of the rows and any number (from 0 to \(n\)) of the columns, and after that the sum of the remaining entries is computed. Come up with an algorithm which finds out which rows and columns to erase in order to maximize that sum.