\[ \newcommand{\undefined}{} \newcommand{\hfill}{} \newcommand{\qedhere}{\square} \newcommand{\qed}{\square} \newcommand{\bbA}{\mathbb A} \newcommand{\bbB}{\mathbb B} \newcommand{\bbC}{\mathbb C} \newcommand{\bbD}{\mathbb D} \newcommand{\bbE}{\mathbb E} \newcommand{\bbF}{\mathbb F} \newcommand{\bbG}{\mathbb G} \newcommand{\bbH}{\mathbb H} \newcommand{\bbI}{\mathbb I} \newcommand{\bbJ}{\mathbb J} \newcommand{\bbK}{\mathbb K} \newcommand{\bbL}{\mathbb L} \newcommand{\bbM}{\mathbb M} \newcommand{\bbN}{\mathbb N} \newcommand{\bbO}{\mathbb O} \newcommand{\bbP}{\mathbb P} \newcommand{\bbQ}{\mathbb Q} \newcommand{\bbR}{\mathbb R} \newcommand{\bbS}{\mathbb S} \newcommand{\bbT}{\mathbb T} \newcommand{\bbU}{\mathbb U} \newcommand{\bbV}{\mathbb V} \newcommand{\bbW}{\mathbb W} \newcommand{\bbX}{\mathbb X} \newcommand{\bbY}{\mathbb Y} \newcommand{\bbZ}{\mathbb Z} \newcommand{\sA}{\mathscr A} \newcommand{\sB}{\mathscr B} \newcommand{\sC}{\mathscr C} \newcommand{\sD}{\mathscr D} \newcommand{\sE}{\mathscr E} \newcommand{\sF}{\mathscr F} \newcommand{\sG}{\mathscr G} \newcommand{\sH}{\mathscr H} \newcommand{\sI}{\mathscr I} \newcommand{\sJ}{\mathscr J} \newcommand{\sK}{\mathscr K} \newcommand{\sL}{\mathscr L} \newcommand{\sM}{\mathscr M} \newcommand{\sN}{\mathscr N} \newcommand{\sO}{\mathscr O} \newcommand{\sP}{\mathscr P} \newcommand{\sQ}{\mathscr Q} \newcommand{\sR}{\mathscr R} \newcommand{\sS}{\mathscr S} \newcommand{\sT}{\mathscr T} \newcommand{\sU}{\mathscr U} \newcommand{\sV}{\mathscr V} \newcommand{\sW}{\mathscr W} \newcommand{\sX}{\mathscr X} \newcommand{\sY}{\mathscr Y} \newcommand{\sZ}{\mathscr Z} \newcommand{\sfA}{\mathsf A} \newcommand{\sfB}{\mathsf B} \newcommand{\sfC}{\mathsf C} \newcommand{\sfD}{\mathsf D} \newcommand{\sfE}{\mathsf E} \newcommand{\sfF}{\mathsf F} \newcommand{\sfG}{\mathsf G} \newcommand{\sfH}{\mathsf H} \newcommand{\sfI}{\mathsf I} \newcommand{\sfJ}{\mathsf J} \newcommand{\sfK}{\mathsf K} \newcommand{\sfL}{\mathsf L} \newcommand{\sfM}{\mathsf M} \newcommand{\sfN}{\mathsf N} \newcommand{\sfO}{\mathsf O} \newcommand{\sfP}{\mathsf P} \newcommand{\sfQ}{\mathsf Q} \newcommand{\sfR}{\mathsf R} \newcommand{\sfS}{\mathsf S} \newcommand{\sfT}{\mathsf T} \newcommand{\sfU}{\mathsf U} \newcommand{\sfV}{\mathsf V} \newcommand{\sfW}{\mathsf W} \newcommand{\sfX}{\mathsf X} \newcommand{\sfY}{\mathsf Y} \newcommand{\sfZ}{\mathsf Z} \newcommand{\cA}{\mathcal A} \newcommand{\cB}{\mathcal B} \newcommand{\cC}{\mathcal C} \newcommand{\cD}{\mathcal D} \newcommand{\cE}{\mathcal E} \newcommand{\cF}{\mathcal F} \newcommand{\cG}{\mathcal G} \newcommand{\cH}{\mathcal H} \newcommand{\cI}{\mathcal I} \newcommand{\cJ}{\mathcal J} \newcommand{\cK}{\mathcal K} \newcommand{\cL}{\mathcal L} \newcommand{\cM}{\mathcal M} \newcommand{\cN}{\mathcal N} \newcommand{\cO}{\mathcal O} \newcommand{\cP}{\mathcal P} \newcommand{\cQ}{\mathcal Q} \newcommand{\cR}{\mathcal R} \newcommand{\cS}{\mathcal S} \newcommand{\cT}{\mathcal T} \newcommand{\cU}{\mathcal U} \newcommand{\cV}{\mathcal V} \newcommand{\cW}{\mathcal W} \newcommand{\cX}{\mathcal X} \newcommand{\cY}{\mathcal Y} \newcommand{\cZ}{\mathcal Z} \newcommand{\bfA}{\mathbf A} \newcommand{\bfB}{\mathbf B} \newcommand{\bfC}{\mathbf C} \newcommand{\bfD}{\mathbf D} \newcommand{\bfE}{\mathbf E} \newcommand{\bfF}{\mathbf F} \newcommand{\bfG}{\mathbf G} \newcommand{\bfH}{\mathbf H} \newcommand{\bfI}{\mathbf I} \newcommand{\bfJ}{\mathbf J} \newcommand{\bfK}{\mathbf K} \newcommand{\bfL}{\mathbf L} \newcommand{\bfM}{\mathbf M} \newcommand{\bfN}{\mathbf N} \newcommand{\bfO}{\mathbf O} \newcommand{\bfP}{\mathbf P} \newcommand{\bfQ}{\mathbf Q} \newcommand{\bfR}{\mathbf R} \newcommand{\bfS}{\mathbf S} \newcommand{\bfT}{\mathbf T} \newcommand{\bfU}{\mathbf U} \newcommand{\bfV}{\mathbf V} \newcommand{\bfW}{\mathbf W} \newcommand{\bfX}{\mathbf X} \newcommand{\bfY}{\mathbf Y} \newcommand{\bfZ}{\mathbf Z} \newcommand{\rmA}{\mathrm A} \newcommand{\rmB}{\mathrm B} \newcommand{\rmC}{\mathrm C} \newcommand{\rmD}{\mathrm D} \newcommand{\rmE}{\mathrm E} \newcommand{\rmF}{\mathrm F} \newcommand{\rmG}{\mathrm G} \newcommand{\rmH}{\mathrm H} \newcommand{\rmI}{\mathrm I} \newcommand{\rmJ}{\mathrm J} \newcommand{\rmK}{\mathrm K} \newcommand{\rmL}{\mathrm L} \newcommand{\rmM}{\mathrm M} \newcommand{\rmN}{\mathrm N} \newcommand{\rmO}{\mathrm O} \newcommand{\rmP}{\mathrm P} \newcommand{\rmQ}{\mathrm Q} \newcommand{\rmR}{\mathrm R} \newcommand{\rmS}{\mathrm S} \newcommand{\rmT}{\mathrm T} \newcommand{\rmU}{\mathrm U} \newcommand{\rmV}{\mathrm V} \newcommand{\rmW}{\mathrm W} \newcommand{\rmX}{\mathrm X} \newcommand{\rmY}{\mathrm Y} \newcommand{\rmZ}{\mathrm Z} \newcommand{\paren}[1]{( #1 )} \newcommand{\Paren}[1]{\left( #1 \right)} \newcommand{\bigparen}[1]{\bigl( #1 \bigr)} \newcommand{\Bigparen}[1]{\Bigl( #1 \Bigr)} \newcommand{\biggparen}[1]{\biggl( #1 \biggr)} \newcommand{\Biggparen}[1]{\Biggl( #1 \Biggr)} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\Abs}[1]{\left\lvert #1 \right\rvert} \newcommand{\bigabs}[1]{\bigl\lvert #1 \bigr\rvert} \newcommand{\Bigabs}[1]{\Bigl\lvert #1 \Bigr\rvert} \newcommand{\biggabs}[1]{\biggl\lvert #1 \biggr\rvert} \newcommand{\Biggabs}[1]{\Biggl\lvert #1 \Biggr\rvert} \newcommand{\card}[1]{\lvert #1 \rvert} \newcommand{\Card}[1]{\left\lvert #1 \right\rvert} \newcommand{\bigcard}[1]{\bigl\lvert #1 \bigr\rvert} \newcommand{\Bigcard}[1]{\Bigl\lvert #1 \Bigr\rvert} \newcommand{\biggcard}[1]{\biggl\lvert #1 \biggr\rvert} \newcommand{\Biggcard}[1]{\Biggl\lvert #1 \Biggr\rvert} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\Norm}[1]{\left\lVert #1 \right\rVert} \newcommand{\bignorm}[1]{\bigl\lVert #1 \bigr\rVert} \newcommand{\Bignorm}[1]{\Bigl\lVert #1 \Bigr\rVert} \newcommand{\biggnorm}[1]{\biggl\lVert #1 \biggr\rVert} \newcommand{\Biggnorm}[1]{\Biggl\lVert #1 \Biggr\rVert} \newcommand{\iprod}[1]{\langle #1 \rangle} \newcommand{\Iprod}[1]{\left\langle #1 \right\rangle} \newcommand{\bigiprod}[1]{\bigl\langle #1 \bigr\rangle} \newcommand{\Bigiprod}[1]{\Bigl\langle #1 \Bigr\rangle} \newcommand{\biggiprod}[1]{\biggl\langle #1 \biggr\rangle} \newcommand{\Biggiprod}[1]{\Biggl\langle #1 \Biggr\rangle} \newcommand{\set}[1]{\lbrace #1 \rbrace} \newcommand{\Set}[1]{\left\lbrace #1 \right\rbrace} \newcommand{\bigset}[1]{\bigl\lbrace #1 \bigr\rbrace} \newcommand{\Bigset}[1]{\Bigl\lbrace #1 \Bigr\rbrace} \newcommand{\biggset}[1]{\biggl\lbrace #1 \biggr\rbrace} \newcommand{\Biggset}[1]{\Biggl\lbrace #1 \Biggr\rbrace} \newcommand{\bracket}[1]{\lbrack #1 \rbrack} \newcommand{\Bracket}[1]{\left\lbrack #1 \right\rbrack} \newcommand{\bigbracket}[1]{\bigl\lbrack #1 \bigr\rbrack} \newcommand{\Bigbracket}[1]{\Bigl\lbrack #1 \Bigr\rbrack} \newcommand{\biggbracket}[1]{\biggl\lbrack #1 \biggr\rbrack} \newcommand{\Biggbracket}[1]{\Biggl\lbrack #1 \Biggr\rbrack} \newcommand{\ucorner}[1]{\ulcorner #1 \urcorner} \newcommand{\Ucorner}[1]{\left\ulcorner #1 \right\urcorner} \newcommand{\bigucorner}[1]{\bigl\ulcorner #1 \bigr\urcorner} \newcommand{\Bigucorner}[1]{\Bigl\ulcorner #1 \Bigr\urcorner} \newcommand{\biggucorner}[1]{\biggl\ulcorner #1 \biggr\urcorner} \newcommand{\Biggucorner}[1]{\Biggl\ulcorner #1 \Biggr\urcorner} \newcommand{\ceil}[1]{\lceil #1 \rceil} \newcommand{\Ceil}[1]{\left\lceil #1 \right\rceil} \newcommand{\bigceil}[1]{\bigl\lceil #1 \bigr\rceil} \newcommand{\Bigceil}[1]{\Bigl\lceil #1 \Bigr\rceil} \newcommand{\biggceil}[1]{\biggl\lceil #1 \biggr\rceil} \newcommand{\Biggceil}[1]{\Biggl\lceil #1 \Biggr\rceil} \newcommand{\floor}[1]{\lfloor #1 \rfloor} \newcommand{\Floor}[1]{\left\lfloor #1 \right\rfloor} \newcommand{\bigfloor}[1]{\bigl\lfloor #1 \bigr\rfloor} \newcommand{\Bigfloor}[1]{\Bigl\lfloor #1 \Bigr\rfloor} \newcommand{\biggfloor}[1]{\biggl\lfloor #1 \biggr\rfloor} \newcommand{\Biggfloor}[1]{\Biggl\lfloor #1 \Biggr\rfloor} \newcommand{\lcorner}[1]{\llcorner #1 \lrcorner} \newcommand{\Lcorner}[1]{\left\llcorner #1 \right\lrcorner} \newcommand{\biglcorner}[1]{\bigl\llcorner #1 \bigr\lrcorner} \newcommand{\Biglcorner}[1]{\Bigl\llcorner #1 \Bigr\lrcorner} \newcommand{\bigglcorner}[1]{\biggl\llcorner #1 \biggr\lrcorner} \newcommand{\Bigglcorner}[1]{\Biggl\llcorner #1 \Biggr\lrcorner} \newcommand{\e}{\varepsilon} \newcommand{\eps}{\varepsilon} \newcommand{\from}{\colon} \newcommand{\super}[2]{#1^{(#2)}} \newcommand{\varsuper}[2]{#1^{\scriptscriptstyle (#2)}} \newcommand{\tensor}{\otimes} \newcommand{\eset}{\emptyset} \newcommand{\sse}{\subseteq} \newcommand{\sst}{\substack} \newcommand{\ot}{\otimes} \newcommand{\Esst}[1]{\bbE_{\substack{#1}}} \newcommand{\vbig}{\vphantum{\bigoplus}} \newcommand{\seteq}{\mathrel{\mathop:}=} \newcommand{\defeq}{\stackrel{\mathrm{def}}=} \newcommand{\bits}{\{0,1\}} \newcommand{\sbits}{\{\pm 1\}} \newcommand{\Mid}{\mathrel{}\middle|\mathrel{}} \newcommand{\Ind}{\mathbf 1} \newcommand{\R}{\mathbb R} \newcommand{\Rnn}{\R_{\ge 0}} \newcommand{\N}{\mathbb N} \newcommand{\Z}{\mathbb Z} \newcommand{\Q}{\mathbb Q} \newcommand{\mper}{\,.} \newcommand{\mcom}{\,,} \DeclareMathOperator{\Id}{Id} \DeclareMathOperator{\cone}{cone} \DeclareMathOperator{\vol}{vol} \DeclareMathOperator{\val}{val} \DeclareMathOperator{\opt}{opt} \DeclareMathOperator{\Opt}{Opt} \DeclareMathOperator{\Val}{Val} \DeclareMathOperator{\LP}{LP} \DeclareMathOperator{\SDP}{SDP} \DeclareMathOperator{\Tr}{Tr} \DeclareMathOperator{\Inf}{Inf} \DeclareMathOperator{\poly}{poly} \DeclareMathOperator{\polylog}{polylog} \DeclareMathOperator{\argmax}{arg\,max} \DeclareMathOperator{\argmin}{arg\,min} \DeclareMathOperator{\qpoly}{qpoly} \DeclareMathOperator{\qqpoly}{qqpoly} \DeclareMathOperator{\conv}{conv} \DeclareMathOperator{\Conv}{Conv} \DeclareMathOperator{\supp}{supp} \DeclareMathOperator{\sign}{sign} \DeclareMathOperator{\mspan}{span} \DeclareMathOperator{\mrank}{rank} \DeclareMathOperator{\E}{\mathbb E} \DeclareMathOperator{\pE}{\tilde{\mathbb E}} \DeclareMathOperator{\Pr}{\mathbb P} \DeclareMathOperator{\Span}{Span} \DeclareMathOperator{\Cone}{Cone} \DeclareMathOperator{\junta}{junta} \DeclareMathOperator{\NSS}{NSS} \DeclareMathOperator{\SA}{SA} \DeclareMathOperator{\SOS}{SOS} \hyperbaseurl{http://toc15.dsteurer.org/} \notag \]

PCP theorem: gap amplification (pdf version)

Introduction

In this note, we prove the following theorem, which is a major step toward a proof of the PCP theorem.

Recall that in the optimization problem \({2\text{-CSP}_W}\), we are given a list of arity-\(2\) constraints between variables over an alphabet of size \(W\) and the goal is to find an assignment to the variables that satisfies as many of the constraints as possible.

A \({2\text{-CSP}_W}\) instance \(\varphi\) is \(d\)-regular if every variable of \(\varphi\) appears in exactly \(d\) of the constraints of \(\varphi\). It turns out that there is an efficient reduction to transform general instances of \({2\text{-CSP}_W}\) to \(d\)-regular ones without significantly affecting the optimal value of the instance.

Theorem (gap amplification). For all \(d,t,W\in \bbN\), there exists a polynomial-time linear-blowup function \(f\) that maps every \(d\)-regular \({2\text{-CSP}_W}\) instance \(\varphi\) to a \({2\text{-CSP}_W'}\) instance \(\varphi'=f(\varphi)\),

  • YES: if \(\opt(\varphi)=1\), then \(\opt(\varphi')=1\)
  • NO: if \(\opt(\varphi)<1-\e\) for some \(\e < 1/(t d)\), then \(\opt(\varphi')<1-\e'\) for \(\e'=t \e / \poly(W,d)\)
  • Efficiency: \(\lvert W' \rvert \le \lvert W \rvert^{d^t}\) and \(\lvert \varphi' \rvert\le \lvert \varphi' \rvert\cdot O(d^t)\)

Interlude: Spectral graph expansion

A graph \(G\) is \(d\)-regular if every vertex of \(G\) is an endpoint of exactly \(d\) edges of \(G\).

Let \(G\) be a \(d\)-regular graph \(G\) with \(n\) vertices. Its normalized adjacency matrix \(A_G\) is an \(n\)-by-\(n\) matrix indexed by the vertices of \(G\) such that \[ (A_G)_{u,v} = \begin{cases} 1/d & \text{if $(u,v)\in E(G)$}\,,\\ 0 & \text{otherwise.} \end{cases} \] We extend this definition to graphs with parallel edges and self-loops: We set \((A_G)_{u,v}\) to the fraction of \(u\)’s edges that are incident to \(v\).

The following observation shows how \(A_G\) acts on real-valued vectors indexed by vertices of \(G\).

Observation (averaging over neighbors). For every vector \(x\in \R^{V(G)}\) and every vertex \(u\in V(G)\), \[ (A_G x)_u = \text{average $x$-value of the neighbors of $u$}\,.\notag \]

Since the matrix \(A_G\) is symmetric and real-valued, its eigenvalues \(\lambda_1,\ldots,\lambda_n\) are real.

Claim. All eigenvalues of \(A_G\) are contained in the interval \([-1,1]\).

Proof. We use the following general bound on the largest eigenvalue in absolute value of a symmetric matrix, \[ \max_{i\in [n]} \lvert \lambda_i \rvert \le \max_{u\in V(G)} \sum_{v\in V(G)} \lvert (A_G)_{u,v} \rvert = 1\,. \]

Claim. The vector \(\mathbf{1}=(1,\ldots,1)^T\) is an eigenvector of \(A_G\) with value \(1\).

Proof. The averaging-over-neighbors observation shows that \(A\mathbf 1= \mathbf 1\).

The previous two claims show that we can order the eigenvalues of \(A_G\) such that \[ 1=\lambda_1 \ge \lvert \lambda_2 \rvert \ge \cdots \ge \lvert \lambda_n \rvert\,. \]

Definition (Spectral expansion). For every regular graph, we define its spectral expansion \(\lambda(G)=\lambda_2\) as the second largest eigenvalue in absolute value of \(G\)’s normalized adjacency matrix.

The following lemma gives a combinatorial characterization of the condition \(\lambda(G)<1\). Its proof is an exercise.

Lemma. A graph \(G\) is connected and non-bipartite if and only if \(\lambda(G)<1\).

It turns out that the spectral expansion parameter \(\lambda\) is a surprisingly useful way to quantify the “connectedness” and “non-bipartiteness” of a graph. Graphs with spectral expansion bounded away from \(1\), say \(\lambda<0.9\), behave in many ways similar to random graphs. We refer to such graphs as expander graphs.

Definition (Expander graphs). A collection \(\{G_n\mid n\in \bbN\}\) of graphs is called a family of expander graphs if there exists a constant \(\lambda<1\) such that every graph \(G_n\) in the family satisfies \(\lambda(G_n)\le \lambda\).

For every \(d\ge 3\), there exist families of \(d\)-regular expander graphs and they can be computed in polynomial time. (In fact, a random \(d\)-regular graph is an expander with high probability.)

The following property of expander graphs is useful for the analysis of our gap amplification construction. Its proof is an exercise.

Lemma (Expander mixing lemma). For every regular graph \(G\) and every set \(S\subseteq V(G)\), \[ \Pr_{uv\in E(G)}\biggl\{ u\in S \land v\in S\biggr\} \le \Pr_{u\in V(G)}\bigl\{u\in S\bigr\} \cdot \biggl(\Pr_{v\in V(G)}\bigl\{v\in S\bigr\} + \lambda (G)\biggr) \,. \]

Note that the upper bound \(\Pr_{uv\in E(G)}\{ u\in S \land v\in S\}\le \Pr_{u\in V(G)}\{u\in S\}\) holds for every graph \(G\) and every set \(S\subseteq V(G)\). The expander mixing lemma implies that this upper bound is far from tight whenever the spectral expansion of \(G\) is bounded away from \(1\) and the set \(S\) excludes a constant fraction of the vertices of \(G\).

If we choose \(S\) as a random set of size \(k\) in a regular graph \(G\) with \(n\) vertices, then \[ \E_{\substack{S\subseteq V(G)\\ \lvert S \rvert = k}} \Pr_{uv\in E(G)}\biggl\{ u\in S \land v\in S\biggr\} \approx \frac{k}{n} \cdot \frac{k}{n} \,. \] The expander mixing lemma implies that every set \(S\subseteq V(G)\) of size \(k\) satisfies \(\Pr_{uv\in E(G)}\biggl\{ u\in S \land v\in S\biggr\} \approx \frac{k}{n} \cdot \frac{k}{n}\) if \(\lambda(G)\ll \frac{k}{n}\).

Definition (path power). For a graph \(G\), its \(t\)-fold path power \(G^t\) has the same set of vertices \(V(G^t)=V(G)\) as \(G\) and for every \(t\)-step walk \(u_1,\ldots,u_t\) in \(G\), the graph \(G^t\) contains an edge between the walk endpoints \(u_1\) and \(u_t\).

If \(G\) is \(d\)-regular, then its power \(G^t\) is \(d^t\) regular. (The graph \(G^t\) may have parallel edges and self-loops.) Furthermore, the normalized adjacency matrix of \(G\) and \(G^t\) satisfy the relation, \[ A_{G^t} = (A_G)^t\,. \] Since the eigenvalues of the power of a matrix are the powers of its eigenvalues, the spectral expansion of \(G^t\) is the \(t\)-th power of the spectral expansion of \(G\).

Lemma (spectral expansion of path power). For every graph \(G\) and every \(t\in \bbN\), the \(t\)-fold path power \(G^t\) has spectral expansion \(\lambda(G^t) = \lambda(G)^t\).

This lemma shows that the path power operation improves spectral expansion. This fact underlies some of the constructions of expander graphs.

Gap amplification construction

Let \(\varphi\) be a \(d\)-regular \({2\text{-CSP}_W}\) instance with variables \(x_1,\ldots,x_n\) over alphabet \([W]\). Let \(G\) be the constraint graph of \(\varphi\), i.e., the \(d\)-regular graph on \([n]\) that has an edge between vertices \(u\in [n]\) and \(v\in [n]\) for every constraint between variables \(x_u\) and \(x_v\) in \(\varphi\). For an edge \(e\in E(G)\) between \(u\) and \(v\), let \(\varphi_e\from [W]\times [W] \to \{0,1\}\) be the corresponding constraint in \(\varphi\). Then, the objective value of an assignment \(x\in [W]^n\) for \(\varphi\), denoted \(\varphi(x)\), satisfies \[ \varphi(x) = \Pr_{e=(u,v) \in E(G)}\biggl\{ \varphi_e (x_u, x_v) = 1 \biggr\}\,. \]

For every vertex \(u\in V(G)\), let \(N_t(u)\subseteq V(G)\) denote the \(t\)-step neighborhood of \(u\) in \(G\), i.e., all vertices that can be reached from \(u\) by a path of length at most \(t\).

Construction of \({2\text{-CSP}_W'}\) instance \(\varphi^t\).

  • variables: \(y_1,\ldots,y_n\)
  • alphabet: size \(W' = W^{d^t}\); for variable \(y_u\), we identify alphabet with an assignments to the \(t\)-step neighborhood \(N_t(u)\) so that \(y_u\) takes values in \([W]^{N(t)}\); we refer to \((y_u)_v\) as the “opinion” of \(u\) about \(v\).
  • constraints: for every edge \((u,v)\in E(G^t)\), we have the following constraint \(\varphi_{u,v}\from W^{N_t(u)} \times W^{N_t(v)} \to \{0,1\}\), \[ \varphi^t_{u,v} (y_u, y_v) = 1 \Leftrightarrow \left \{\begin{aligned} & \text{the assignments $y_u$ and $y_v$ agree on $N_t(u)\cap N_t(v)$ and}\\ & \text{satisfy all constraints of $\varphi$ between variables in $N_t(u)\cup N_t(v)$} \end{aligned} \right \} \]

This construction satisfies the desired YES property (for the gap amplification theorem): If \(\opt(\varphi)=1\), then \(\opt(\varphi^t)=1\). Concretely, if \(x_1,\ldots,x_n\) is an assignment for \(\varphi\) that satisfies all constraints, then the following assignment \(y_1,\ldots,y_n\) satisfies all constraints of \(\varphi^t\), \[ \forall u,v\in [n].\quad (y_u)_v = x_v\,. \] The construction of \(\varphi^t\) also satisfies the desired efficiency property: The number of constraints in \(\varphi^t\) is at most \(d^t\) factor larger than the number of constraints in \(\varphi\). Therefore, \(\lvert \varphi^t \rvert \le \lvert \varphi \rvert \cdot O(d^t)\).

Analysis against consistent assignments

In order to finish the proof of the gap amplification theorem it remains to establish the NO property of the construction of \(\varphi^t\).

Let \(\varphi\) be a \(d\)-regular instance of \({2\text{-CSP}_W}\). Suppose that \(\opt(\varphi)<1-\varepsilon\) for some \(\varepsilon < 1/(d t)\). We are to show that \(\opt(\varphi^t) < 1- t \e / \poly(d,W)\).

As a first step toward this goal, we show that all “consistent assignment” for \(\varphi^t\) indeed achieve objective value at most \(1- t \e / \poly(d)\) as long as the constraint graph \(G\) of \(\varphi\) has graph expansion \(\lambda(G)\) bounded away from \(1\). The assumption that the constraint graph has \(\lambda(G)\) bounded away from \(1\) is without loss of generality because we can ensure this property by adding constraints to \(\varphi\) that are always satisfied and have an \(d\)-regular expander as constraint graph.

Definition. We say that an assignment \(y_1,\ldots,y_n\) is consistent if there exists an assignment \(x_1,\ldots,x_n\) such that \((y_u)_v=x_v\) for all \(u,v\in [n]\).

Lemma. Suppose the constraint graph \(G\) of \(\varphi\) satisfies \(\lambda(G)<0.9\). Then, every consistent assignment \(y\) has objective value at most \(1-t \varepsilon / 100d\) for \(\varphi^t\).

Proof of lemma

Let \(y\) be an assignment for \(\varphi^t\) consistent with assignment \(x\) for \(\varphi\). Suppose that \(x\) has objective value \(1-\eta\) for \(\varphi\). Note that \(\eta > \e\). (Recall that \(\opt(\varphi)<1-\e\).) We may also assume \(\eta < 1/ (d t)\). (Otherwise, we can upper bound the value of the assignment \(y\) in a simpler way.)

Let \(W=e(1),\ldots,e(t)\) be a random walk of length \(t\) and let \(\varphi_W^t\) be the corresponding constraint in \(\varphi^t\). Let \(X_i\) be the indicator random variable of the event that \(x\) does not satisfy the constraint \(\varphi_{e(i)}\). Let \(X=X_1 + \cdots + X_n\) be the number of constraints on walk \(W\) that are not satisfied by assignment \(x\).

Note that if \(X>0\), then \(y\) does not satisfy the constraint \(\varphi_W^t\). Therefore we can bound the objective value achieved by \(y\) in terms of the support of \(X\), \[ \begin{aligned} \Pr_{W \in E(G^t)} \biggl\{ \text{$y$ satisfies constraint $\varphi_W^t$} \biggr\} &\le 1 - \Pr \biggl\{ X > 0 \biggr\}\\ &\le 1 - (\E X )^2 / \E X^2\,. \end{aligned} \] The second step uses Cauchy–Schwarz and the fact that \(\E X \ge 0\) (see homework 3).

It remains to estimate the first two moments of \(X\).

Claim 1. \(\E X \ge t \cdot \eta\).

Proof. By linearity of expectation, \(\E X = \sum_{i=1}^t \E X_i\). Therefore, it suffices to show that \(\E X_i \ge \eta\) for all \(i\in [t]\). Since the constraint graph \(G\) is regular, the distribution of \(e(i)\) is uniform over the edges of \(G\). Therefore, \(\E X_i\) is the fraction of constraints in \(\varphi\) not satisfied by assignment \(x\), which means \(\E X_i = \eta\). \(\qed\)

Claim 2. \(\E X^2 \le 30 t d \eta\).

Proof. Let \(S\) be the set of vertices adjacent to a constraint not satisfied by \(x\). Since \(G\) is \(d\)-regular and \(x\) satisfies all but an \(\eta\) fraction of constraints, \(\Pr_{u\in V(G)}\{ u\in S\}\le d \cdot \eta\). Let \(Y_i\) be the indicator of the event that the \(i\)-th vertex \(w(i)\) of the walk \(W\) is contained in \(S\). Since \(G\) is regular, the distribution of \(w(i)\) is uniform over the vertices of \(G\). Therefore, \(\E Y_i \le d \cdot \eta\). Let \(Y=Y_1+ \cdots +Y_t\) be the number of vertices of \(W\) contained in S. Since \(X\le Y\), we can bound the second moment of \(X\) by \[ \begin{aligned} \E X^2 & \le \E Y^2 = \sum_{i} \E Y_i + 2 \sum_{i<j} \E Y_i Y_j\\ & \le t \cdot d \eta + 2\sum_{i<j} \E Y_i Y_j \,. \end{aligned} \] The second step uses linearity of expectation and the third steps uses \(\E Y_i \le d \eta\). Note that \(Y_i Y_j\) is the indicator of the event \(\{ w(i)\in S, w(j)\in S\}\). Since \(G\) is regular, the distribution of \(w(i), w(j)\) is uniform over edges of \(G^{j-i}\) (endpoints of walks of length \(j-i\) in \(G\)). Thus, by the expander mixing lemma, \[ \begin{aligned} \E Y_i Y_j & \le d \eta \cdot \biggl( d \eta + \lambda(G^{j-i}) \biggr)\\ & = d \eta \cdot \biggl( d \eta + \lambda(G)^{j-i} \biggr) \end{aligned} \] The second step uses our previous lemma about the spectral expansion of power graphs.

Putting these bounds together shows \[ \begin{aligned} \E X^2 & \le t \cdot d \eta + 2 t^2 \cdot (d\eta)^2 + 2 t \cdot d \eta \sum_{k=1}^t \lambda(G)^k \le 30 t d \eta \end{aligned} \] The last step uses the assumption \(t d \eta <1\) and that \(\sum_{k=1}^\infty \lambda(G)^k\le 10\) (because \(\lambda(G)<0.9\)). \(\qed\)

Modified construction (easier analysis)

So far we know that the construction of \(\varphi^t\) satisfies the desired NO property if we restrict ourselves to consistent assignments. We modify the construction slightly in order to simplify the analysis against all assignments.

The modified construction of \(\varphi^t\) is the same as before except that we assign different weights to the new constraints \(\varphi^t_{u,v}\). The modified construction has a parameter \(\delta >0\). (We can choose \(\delta = 1/(4W)\).)

Weights of constraints in modified construction. The weight of the constraint \(\varphi^t_{u(1),u(2)}\) in \(\varphi^t\) is equal to the probability of \(u(1)\) and \(u(2)\) being generated by the following randomized algorithm:

  • choose a random path \(v(1),\ldots,v(t_0)\) of length \(\delta t\) in the constraint graph \(G\) of \(\varphi\)
  • choose integers \(t_1,t_2\) uniformly at random from interval \([t/2,t]\)
  • choose a random path \(P_1\) from \(v(1)\) of length \(t_1\) and a random path \(P_2\) from \(v( t_0)\) of length \(t_2\)
  • let \(u(1)\) be the endpoint of \(P_1\) and \(u(2)\) be the endpoint of \(P_2\)
Generation of modified constraint distribution. A random walk of length t_1 + \delta t + t_2, where t_1 and t_2 are chosen uniformly between t/2 and t
Generation of modified constraint distribution. A random walk of length \(t_1 + \delta t + t_2\), where \(t_1\) and \(t_2\) are chosen uniformly between \(t/2\) and \(t\)

Analysis against all assignments

Let \(y\) be an arbitrary assignment for \(\varphi^t\) (not necessarily consistent). We are to upper bound the objective value of \(y\).

Our strategy is to construct an assignment \(x\) for \(\varphi\) from the assignment \(y\) for \(\varphi^t\). We then use the constraints violated by \(x\) in order to bound the objective value of \(y\) for \(\varphi^t\).

Plurality assignment obtained from \(y\). Let \(x\) be the assignment for \(\varphi\) such that for every vertex \(v\), we choose \(x_v\) as the most likely output of the following randomized algorithm:

  • choose an integer \(t'\) uniformly at random from the interval \([t/2,t]\)
  • choose a random path \(P'\) from \(v\) of length \(t'\)
  • let \(u'\) be the endpoint of \(P'\)
  • output \((y_{u'})_v\in [W]\), that is, the opinion of \(u'\) about \(v\)

Note that if \(y\) is consistent with an assignment \(x\) for \(\varphi\), then \(x\) is also the plurality assignment constructed above.

Let \(\eta>0\) be the fraction of constraints not satisfied by the plurality assignment \(x\). For every \(i\in [\delta t]\), let \(X_i\) be the indicator random variable such that \(X_i=1\) if and only if all of the following conditions are satisfied

  • the constraint between \(v(i-1)\) and \(v(i)\) in \(\varphi\) is not satisfied by the plurality assignment \(x\)
  • the opinion \((y_{u(1)})_{v(i-1)}\) of \(u(1)\) about \(v(i-1)\) differs from the plurality assignment \(x_{v(i-1)}\)
  • the opinion \((y_{u(2)})_{v(i)}\) of \(u(2)\) about \(v(i)\) differs from the plurality assignment \(x_{v(i)}\)

Note that if \(X_i=1\), then the assignment \(y\) does not satisfy the constraint between \(u(1)\) and \(u(2)\) in \(\varphi^t\). Let \(X=X_{1} + \cdots + X_{\delta t}\). As before, we can lower bound the fraction of constraints not satisfied by \(y\) in terms of the first two moments of \(X\), \[ \Pr\biggl\{ \varphi^t_{u(1),u(2)}\bigl(y_{u(1)},y_{u(2)}\bigr) =0 \biggr\} \ge \Pr\biggl\{ X>0 \biggr\} \ge (\E X)^2 / \E X^2\,. \label{eq:moment-lower-bound} \]

It remains to estimate the first two moments of \(X\). It turns out that in order to show the bound \(\E X^2 \le 30 d t \eta\) essentially the same proof works as before. However the lower bound on \(\E X\) requires new ideas. In particular, we will use basic properties of the statistical distance between random variables (related to the total variation distance of probability distributions).

Definition. For (finitely-supported) random variables \(A\) and \(B\), let \(\Delta(A,B)\) be their statistical distance, \[ \Delta\bigl(A,B\bigr) = \sum_{\omega \in \mathrm{support}(A) \cup \mathrm{support}(B)} \bigl\lvert \Pr\{ A = \omega\} - \Pr\{ B = \omega\} \bigr\rvert/2 \]

Claim 3. For every \(i\in [t]\), we have \[ \begin{gathered} \Pr \biggl\{ \bigl(y_{u(1)}\bigr)_{v(i-1)} = x_{v(i-1)} \land \bigl(y_{u(2)}\bigr)_{v(i)} = x_{v(i)} \biggm \vert v(i-1), v(i) \biggr\} \ge \left (\frac 1 W - 2\delta\right)^2\\ \end{gathered} \]

Proof. Since \(u(1)\) and \(u(2)\) are statistically independent conditioned on \(\{v(i-1), v(i)\}\), it is enough to show that each of the events \(\{(y_{u(1)})_{v(i-1)} = x_{v(i-1)}\}\) and \(\{(y_{u(2)})_{v(i)} = x_{v(i)}\}\) have probability at least \(1/W - 2 \delta\) conditioned on \(\{v(i-1), v(i)\}\). By symmetry between \(u(1)\) and \(u(2)\), it remains to show that \[ \Pr \biggl\{ \bigl(y_{u(1)}\bigr)_{v(i-1)} = x_{v(i-1)} \biggm \vert v(i-1), v(i) \biggr\} \ge \frac 1 W - 2\delta\,. \label{eq:almost-plurality-prob} \]

Let \((t',P',u')\) be the random variables as defined in the construction of the plurality assignment for \(x_{v}\) with \(v=v(i-1)\). Since \(x_{v(i-1)}\) is the most likely value for \((y_{u'})_{v}\) and there are only \(W\) possible outcomes, we have \[ \Pr \biggl\{ \bigl(y_{u'}\bigr)_{v(i-1)} = x_{v(i-1)} \biggm \vert v(i-1) \biggr\} \ge \frac 1 W \,. \label{eq:plurality-prob} \]

Conditioned on v(i-1) the vertex u(1) is the endpoint of a random walk of length i-1+t_1 from v(i-1) and u' is the endpoint of random walk of length t' from v(i-1). The plurality assignment for v(i-1) is the most likely opinion of u' about v(i-1).
Conditioned on \(v(i-1)\) the vertex \(u(1)\) is the endpoint of a random walk of length \(i-1+t_1\) from \(v(i-1)\) and \(u'\) is the endpoint of random walk of length \(t'\) from \(v(i-1)\). The plurality assignment for \(v(i-1)\) is the most likely opinion of \(u'\) about \(v(i-1)\).

We connect the bound \(\eqref{eq:plurality-prob}\) to the target \(\eqref{eq:almost-plurality-prob}\) by estimating the statistical distance between \(\bigl(y_{u'}\bigr)_{v(i-1)}\) and \(\bigl(y_{u(1)}\bigr)_{v(i-1)}\) conditioned on \(v(i-1)\), \[ \begin{aligned} \Delta\biggl(\bigl\{(y_{u(1)})_{v(i-1)} \bigm \vert v(i-1)\bigr\}, \bigl\{(y_{u'})_{v(i-1)} \bigm \vert v(i-1)\bigr\} \biggr) & \le \Delta\biggl(\bigl\{u(1)\bigm \vert v(i-1)\bigr\}, \bigl\{u' \bigm \vert v(i-1)\bigr\} \biggr)\\ & \le \Delta\biggl(t_1 + i-1, t' \biggr)\\ & \le \frac{{(i-1)}}{t/2}\,. \end{aligned} \] It is instructive to verify the above computation first for the case \(i=1\). Here the statistical distance is indeed \(0\) because both \(u(1)\) and \(u'\) are generated in the same way—by a random walk from \(v(0)\) whose length is uniformly distributed in \([t/2,t]\). For larger values of \(i\), the statistical distance arises because \(u(1)\) and \(u'\) are generated by random walks with differently distributed lengths. The length of the random walk for \(u'\) is still uniformly distributed in \([t/2,t]\). However, the length of the random walk for \(u(1)\) (when viewed from \(v(i-1)\)) is uniformly distributed between \(i-1+t/2\) and \(i-1+t\). The statistical distance between these two distributions for random walk lengths is \((i-1)/(t/2)\). Formally, the reasoning among uses the data processing inequality for statistical distance (see for example these lecture notes as a reference).

A plot of the probabilities of the two distribution of walk lengths. The area of the shaded region corresponds to the statistical distance.
A plot of the probabilities of the two distribution of walk lengths. The area of the shaded region corresponds to the statistical distance.

We conclude the desired bound \(\eqref{eq:almost-plurality-prob}\) by \[ \begin{aligned} & \Pr \biggl\{ \bigl(y_{u(1)}\bigr)_{v(i-1)} = x_{v(i-1)} \biggm \vert v(i-1), v(i) \biggr\} \\ & \ge \Pr \biggl\{ \bigl(y_{u'}\bigr)_{v(i-1)} = x_{v(i-1)} \biggm \vert v(i-1) \biggr\} \\ & \quad - \Delta\biggl(\bigl\{(y_{u(1)})_{v(i-1)} \bigm \vert v(i-1)\bigr\}, \bigl\{(y_{u'})_{v(i-1)} \bigm \vert v(i-1)\bigr\} \biggr) \\ & \ge 1/W - (i-1)/(t/2) \ge 1/W - 2\delta\,.\qed \end{aligned} \]

The previous claim allows us to show the desired lower bound on \(\E X\).

Claim 4. \(\E X \ge \delta t \eta \cdot \left(\frac 1 W - 2\delta\right)^2\). In particular, if we choose \(\delta = 1/(4W)\), then \(\E X \ge t \eta / (16 W^2)\).

Proof. By linearity of expectation, \(\E X = \sum_i \E X_i\). Thus, it suffices to show that every \(i\) satisfies \(\E X_i\ge \eta / (100 W^2)\). Indeed, by the definition of \(X_i\), \[ \begin{aligned} \E X_i & = \Pr\biggl\{ \varphi_{v(i-1),v(i)}(x_{v(i-1), x_{v(i)}})=0 \biggr\} \\ & \quad \cdot \Pr \biggl\{ \bigl(y_{u(1)}\bigr)_{v(i-1)} = x_{v(i-1)} \land \bigl(y_{u(2)}\bigr)_{v(i)} = x_{v(i)} \biggm \vert v(i-1), v(i) \biggr\} \\ & \ge \Pr\biggl\{ \varphi_{v(i-1),v(i)}(x_{v(i-1), x_{v(i)}})=0 \biggr\} \cdot \left(\frac 1 W - 2\delta\right)^2\\ &= \eta \cdot \left(\frac 1 W - 2\delta\right)^2 \end{aligned} \] The first inequality is by Claim 3. The last step uses that \(\eta\) is the fraction of constraints not satisfied by the plurality assignment \(x\) in \(\varphi\).

The same proof as for Claim 2 bounds the second moment of \(X\) (again under the mild assumption that \(\lambda(G)<0.9\)).

Claim 5. \(\E X^2\le 30 d t \eta\).

At this point, we have all components of the proof of the gap amplification theorem.

Proof of gap amplification theorem

Choose the parameter of our (modified) construction of \(\varphi^t\) as \(\delta = 1/(4W)\). By \(\eqref{eq:moment-lower-bound}\), we can lower bound the fraction of constraints not satisfied by assignment \(y\) in \(\varphi^t\) \[ \begin{aligned} \Pr\biggl\{ \varphi^t_{u(1),u(2)}\bigl(y_{u(1)},y_{u(2)}\bigr) =0 \biggr\} & \ge \frac{(\E X)^2 }{ \E X^2}\\ & \ge \frac { \biggl(t \eta / (16 W)\biggr)^2}{ 30 t d \eta } \\ & = t \eta \cdot \frac 1 {7680 d W^2} \ge t \e \cdot \frac 1 {\mathrm{poly}(d,W)}\,. \end{aligned} \] The second step uses Claim 4 and Claim 5.

It follows that our construction has the desired NO property: whenever \(\opt(\varphi)<1-\e\) for some \(\e>1/(d W)\), the constructed instance \(\varphi^t\) satisfies \(\opt(\varphi^t)< 1-\e'\) for \(\e'=t \e / \poly(d,W)\). \(\qed\)