Introduction

So far, we have focused on designing approximation algorithms with upper bounds on the approximation ratio. For example, we have a $2$‑approximation algorithm for the $k$-center problem.

Now we turn to proving lower bounds: showing that no algorithm can achieve a certain approximation factor unless $\mathsf{P} = \mathsf{NP}$. Several techniques are used:

Reductions from $\mathsf{NP}$-complete problems
Approximation factor preserving reductions
Reductions from Probabilistically Checkable Proofs (PCPs)
Reductions from Unique Games

This lecture covers the first two techniques.

Reductions from $\mathsf{NP}$-Complete Problems

The $k$-Center Problem

Definition (Recall).

Given a complete graph $G=(V,E)$ with distances $d_{ij}$ satisfying:

$d_{ii}=0$ for all $i$,
$d_{ij}=d_{ji}$,
$d_{ij}+d_{jk}\ge d_{ik}$ (triangle inequality),

and an integer $k$. For a set $S\subseteq V$ define the radius $$ \text{radius}(S)=\max_{i\in V}\min_{j\in S}d_{ij}. $$ We want to find a set $S$ with $|S|=k$ that minimizes the radius.

Theorem.
There is no $\alpha$-approximation algorithm for the $k$-center problem with $\alpha<2$, unless $\mathsf{P}=\mathsf{NP}$.

Proof.
We reduce from the dominating set problem:

Dominating set: Given a graph $H=(V,E)$ and an integer $k$, is there a set $S\subseteq V$ with $|S|\le k$ such that every vertex is either in $S$ or adjacent to a vertex in $S$?

Given an instance of dominating set, we construct an instance of $k$-center as follows. The vertex set is the same $V$. Define distances: $$ d_{ij} = \begin{cases} 1 & \text{if } {i,j}\in E,\ 2 & \text{otherwise.} \end{cases} $$ These distances satisfy the triangle inequality (because the maximum distance is 2).

If the dominating set instance has a dominating set of size $k$, then in the $k$-center instance we can take $S$ as that dominating set. Every vertex is either in $S$ or adjacent to a vertex in $S$, so its distance to $S$ is at most 1. Hence the optimal radius $r^* = 1$.
If the dominating set instance has no dominating set of size $k$, then any set of $k$ vertices leaves at least one vertex that is not in $S$ and not adjacent to any vertex in $S$. For that vertex the distance to $S$ is 2 (since all other distances are either 1 or 2). Thus $r^* = 2$.

Now suppose we had an $\alpha$-approximation algorithm for $k$-center with $\alpha<2$. Applied to the instance above, it would return a solution with radius $r\le \alpha r^*$.

If $r^*=1$, then $r\le\alpha<2$, so the algorithm must return a solution with radius $1$ (the only possible values are $1$ or $2$).
If $r^*=2$, then $r\le 2\alpha$. Since $\alpha<2$, $2\alpha$ could be $>2$ (e.g. $\alpha=1.9$ gives $3.8$), so the algorithm might return a radius $2$ or even larger. But crucially, we can decide whether the original instance has a dominating set by checking if the algorithm returns radius $1$ or not.

Thus we could solve dominating set in polynomial time, implying $\mathsf{P}=\mathsf{NP}$. ∎

General Scheme for Minimization Problems

We reduce an $\mathsf{NP}$-complete problem $\Pi$ to instances of the target problem $\Pi'$ such that:

YES instance of $\Pi$ maps to an instance of $\Pi'$ with optimum value $k$.
NO instance of $\Pi$ maps to an instance of $\Pi'$ with optimum value $\ge k+1$.

Then any $\alpha$-approximation algorithm with $\alpha < \frac{k+1}{k}$ would distinguish the two cases (since the algorithm’s output would be $<k+1$ in the YES case and $\ge k+1$ in the NO case), contradicting $\mathsf{NP}$-hardness.

A More Involved Example: Edge-Disjoint Paths

Problem (Edge-Disjoint Paths).
Given a directed graph $G=(V,A)$ and $k$ pairs $(s_1,t_1),\dots,(s_k,t_k)$, find a maximum size subset $S\subseteq\{1,\dots,k\}$ and edge‑disjoint paths $P_i$ from $s_i$ to $t_i$ for $i\in S$.

It is known that deciding whether two edge‑disjoint paths exist (i.e., $k=2$) is $\mathsf{NP}$-complete. This already gives:

\[ \text{No }\rho\text{-approximation for }\rho>\frac12\text{ unless }\mathsf{P}=\mathsf{NP}. \]

But a stronger inapproximability can be proved using an amplification construction.

Construction.
We build a new graph $G'$ by taking $k$ copies of the original graph (with vertices $a_i,b_i$ instead of $s_1,t_1$) and adding gadgets that force the paths to interact. The key properties:

If $G$ has two edge‑disjoint paths, then $G'$ has $k$ edge‑disjoint paths (one for each $a_i$–$b_i$ pair).
If $G$ has only one path, then $G'$ has at most one path among all pairs.

By choosing $k$ large (polynomial in $|A|$), we can amplify the difference. For any $\epsilon>0$, set $k=|A|^{\lfloor 1/\epsilon\rfloor}$. Then the number of arcs in $G'$ is $m = k^{2+\epsilon}$. An algorithm with approximation ratio $\Omega(m^{-1/2+\epsilon})$ would return more than one path in the YES case and at most one in the NO case, allowing us to decide the existence of two disjoint paths in $G$.

Theorem.
For any $\epsilon>0$, there is no $\Omega(m^{-1/2+\epsilon})$-approximation algorithm for the edge‑disjoint paths problem in directed graphs, unless $\mathsf{P}=\mathsf{NP}$.

Approximation Factor Preserving Reductions

Let $\Pi$ and $\Pi'$ be two minimization problems. An approximation factor preserving reduction consists of two polynomial‑time algorithms $f$ and $g$ such that:

For any instance $I$ of $\Pi$, $I' = f(I)$ is an instance of $\Pi'$ with $$ \text{OPT}{\Pi'}(I') \le \text{OPT}{\Pi}(I). $$
For any solution $t$ of $I'$, $s = g(I,t)$ is a solution of $I$ with $$ \text{obj}{\Pi}(I,s) \le \text{obj}{\Pi'}(I',t). $$

If there is an $\alpha$-approximation algorithm $\mathcal{A}$ for $\Pi'$, then $$ \alpha \ge \frac{\text{obj}{\Pi'}(I',t)}{\text{OPT}{\Pi'}(I')} \ge \frac{\text{obj}{\Pi}(I,s)}{\text{OPT}{\Pi}(I)}, $$ so $\mathcal{A}$, $f$ and $g$ give an $\alpha$-approximation algorithm for $\Pi$. The contrapositive: if $\Pi$ has no $\alpha$-approximation (unless $\mathsf{P}=\mathsf{NP}$), then $\Pi'$ also has no $\alpha$-approximation.

Example: From MAX E3SAT to MAX 2SAT

We reduce the maximum satisfiability problem for clauses with exactly three literals (MAX E3SAT) to MAX 2SAT (clauses with at most two literals).

For each clause $j$ of the E3SAT instance, say $j = (x_1\lor x_2\lor x_3)$, we create ten 2SAT clauses using the same variables and a new variable $y_j$: $$ (x_1),\;(x_2),\;(x_3),\; (\bar x_1\lor\bar x_2),\;(\bar x_2\lor\bar x_3),\;(\bar x_1\lor\bar x_3),\; (y_j),\;(x_1\lor\bar y_j),\;(x_2\lor\bar y_j),\;(x_3\lor\bar y_j). $$

If the original clause is satisfied, we can assign $y_j$ so that 7 of these 10 clauses are satisfied.
If all three literals are false, then at most 6 of the 10 can be satisfied (indeed, setting $y_j$ false gives 6, true gives 4).

Let $m$ be the number of E3SAT clauses, $k^*$ the number satisfied by an optimal solution. Then the optimal solution to the constructed MAX 2SAT instance satisfies $$ \text{OPT}_{2SAT} = 7k^ + 6(m-k^) = k^* + 6m. $$

Now suppose we have an $\alpha$-approximation algorithm for MAX 2SAT. It returns a solution that satisfies, say, $\tilde{k}$ groups of ten clauses with 7 satisfied and the remaining groups with 6 satisfied. Using the same assignment of the original variables, we satisfy $\tilde{k}$ E3SAT clauses. Hence $$ \tilde{k} \ge \text{OPT}{E3SAT} - (1-\alpha)\,\text{OPT}{2SAT}. $$

Because there is a randomized algorithm that satisfies at least $7/8$ of the clauses in any E3SAT instance, we have $k^* \ge \frac78 m$. Then $$ \text{OPT}{2SAT} = k^ + 6m \le k^ + \frac{48}{7}k^* = \frac{55}{7}\text{OPT}{E3SAT}. $$ Substituting, $$ \tilde{k} \ge \text{OPT}{E3SAT} - (1-\alpha)\frac{55}{7}\text{OPT}{E3SAT} = \left(\frac{55}{7}\alpha - \frac{48}{7}\right)\text{OPT}_{E3SAT}. $$ Thus we obtain a $\left(\frac{55}{7}\alpha - \frac{48}{7}\right)$-approximation for MAX E3SAT. It is known that MAX E3SAT cannot be approximated within any constant factor greater than $7/8$ unless $\mathsf{P}=\mathsf{NP}$. Therefore, for any $\alpha$ with $$ \frac{55}{7}\alpha - \frac{48}{7} > \frac78, $$ we would have a contradiction. Solving gives $\alpha > \frac{433}{440} \approx 0.984$.

Theorem.
There is no $\alpha$-approximation algorithm for MAX 2SAT with $\alpha > \frac{433}{440}$ unless $\mathsf{P}=\mathsf{NP}$.

L‑Reductions

L‑reductions are a stronger form of reduction that preserve approximability in a more precise way, often used for maximization problems.

Definition (L‑reduction).
Let $\Pi$ and $\Pi'$ be two maximization problems. A pair of polynomial‑time algorithms $(f,g)$ is an L‑reduction with parameters $a,b>0$ if:

For every instance $I$ of $\Pi$, $I' = f(I)$ is an instance of $\Pi'$ with $$ \text{OPT}{\Pi'}(I') \le a \cdot \text{OPT}{\Pi}(I). $$
For every solution $t$ of $I'$ with value $V'$, the solution $s = g(I,t)$ of $I$ has value $V$ satisfying $$ \text{OPT}{\Pi}(I) - V \le b \bigl(\text{OPT}{\Pi'}(I') - V'\bigr). $$

If there is an $\alpha$-approximation algorithm for $\Pi'$, then we obtain a $(1 - ab(1-\alpha))$-approximation for $\Pi$.

For minimization problems, the condition becomes: $$ V - \text{OPT}{\Pi}(I) \le b \bigl(V' - \text{OPT}{\Pi'}(I')\bigr), $$ and the resulting approximation ratio is $ab(\alpha-1)+1$.

Example: Maximum Coverage to Minimum Set Cover

Maximum Coverage Problem.
Given a ground set $E$ of $n$ elements, a collection $\mathcal{S}=\{S_1,\dots,S_m\}$ of subsets of $E$, and an integer $k$, choose at most $k$ sets to maximize the number of covered elements.

We give an L‑reduction from Minimum Set Cover to Maximum Coverage.

Suppose we have a $\gamma$-approximation algorithm $ALG$ for Maximum Coverage (with $0<\gamma<1$). We use it to approximate Minimum Set Cover: