Semiotic Labs

Introduction to the Sum-Check Protocol

Sam Green


It is expensive to run transactions on the Ethereum EVM. Verifiable computing (VC) lets us outsource computation away from the EVM. Today, a popular and exciting form of VC algorithm is the SNARK. Various families of SNARKs use the sum-check protocol, which is a simple algorithm for introducing VC. This post is a tutorial on the sum-check protocol, focused on how the protocol is implemented; it does not go into theory. You can skip straight to the finished code here. Thank you to Gokay Saldamli, Gabriel Soule, and Tomasz Kornuta for providing valuable feedback on this article.


Executing code on Ethereum is expensive. Verifiable computing algorithms promise a way to reduce costs by outsourcing computation to untrusted parties and only verifying the result on-chain. A key requirement for a useful verifiable computing algorithm is that verification must be cheaper than the original computation. The sum-check protocol is a foundational algorithm in the field of verifiable computing. In a stand-alone setting, sum-check is not particularly useful; however, it is an important building block of more sophisticated and useful SNARK algorithms. This tutorial introduces sum-check as a way to familiarize the reader with a relatively simple verifiable computing algorithm.

The sum-check protocol is used to outsource the computation of the following sum:

\begin{equation*} H := \sum_{x_1 \in \{0,1\}} \sum_{x_2 \in \{0,1\}} \cdots \sum_{x_v \in \{0,1\}} g(x_1, x_2, \cdots, x_v) \end{equation*}

The sum-check protocol allows a Prover $P$ to convince a Verifier $V$ that $P$ computed the sum $H$ correctly. You can assume that $P$ has ample computational power or ample time to compute, e.g., it could be an untrusted server somewhere on the internet, while $V$ has limited computational power, e.g., it could be the EVM.

Why would you want to sum a function over all Boolean inputs, as is done in the equation above? One reason would be to answer the #SAT (“sharp SAT”) problem. #SAT is concerned with counting the number of inputs to a Boolean-valued function that result in a true output. For example, what is the #SAT of $f(x_1, x_2, x_3) = x_1 \text{ AND } x_2 \text{ AND } x_3$? It is 1, because only a single input, $x_1=1$, $x_2=1$, $x_3=1$, results in 1 being output from $f$. You can probably see that it would be computationally intensive to solve #SAT for complex problems. With the sum-check protocol, $P$ can do that work instead of you.

Unless you are a complexity theorist, you may not get too excited about the #SAT problem at first glance. However, more practical problems can be mapped to #SAT, and then sum-check can be applied. For example, given a graph's adjacency matrix, you can use sum-check to count the triangles in the graph. You can also use sum-check to outsource matrix multiplication. Check out the book mentioned next for more details.

In this tutorial, we use the same example as given in Justin Thaler’s book: Proofs, Arguments, and Zero-Knowledge. In Justin’s book, sum-check is used to solve #SAT for the following example function:

\begin{equation*} \phi(x_1, x_2, x_3, x_4) = (\bar{x_1} \text{ AND } x_2) \text{ AND } (x_3 \text{ OR } x_4) \end{equation*}

where $\bar{x_1}$ means NOT $x_1$. The function $\phi$ (phi) can also be represented graphically as:

In the graph of $\phi$, we see the logic symbols for AND ($\wedge$) and OR ($\vee$). Note that $\phi$ is only defined over Boolean inputs (0/1) and that $\phi$ has a Boolean output. However, most popular verifiable computing algorithms only support arithmetic operations, so we need to convert $\phi$ to its arithmetic version. The arithmetized version is called $g$. For our example, $g$ is defined as:

\begin{equation*} g(x_1, x_2, x_3, x_4) = (1-x_1)x_2((x_3+x_4)-(x_3x_4)) \end{equation*}

and visualized as:

In $g$, notice that the Boolean functions NOT, AND, and OR have been compiled into their arithmetic equivalents. Take a moment to convince yourself that if you pick Boolean values for $x_1$, $x_2$, $x_3$, and $x_4$, then $\phi$ and $g$ will output the same thing.
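The equivalence is quick to check exhaustively in Python. This is a small sketch of our own (the names `phi` and `g` are ours, not from the linked repo):

```python
from itertools import product

def phi(x1, x2, x3, x4):
    # Original Boolean formula: (NOT x1) AND x2 AND (x3 OR x4), as 0/1
    return int((not x1) and x2 and (x3 or x4))

def g(x1, x2, x3, x4):
    # Arithmetized version of phi
    return (1 - x1) * x2 * ((x3 + x4) - (x3 * x4))

# phi and g agree on every Boolean input
for bits in product([0, 1], repeat=4):
    assert phi(*bits) == g(*bits)
```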

For reference, here are the specific compiler rules used when converting $\phi$ into $g$:

Boolean gate      Arithmetized version
NOT A             1 - A
A AND B           A * B
A OR B            (A + B) - (A * B)

Notice that $g$ has extended the domain of our original Boolean formula and is now valid for Boolean inputs and integers. The facts that $g$ equals $\phi$ when evaluated at Boolean inputs and that integers are valid inputs to $g$ are both leveraged by the sum-check protocol.

The sum-check protocol

This section introduces the sum-check protocol. We will follow the same steps and naming convention as used in Justin’s book. The protocol steps are summarized as:

  1. Prover $P$ calculates the total sum of $g$ evaluated at all Boolean inputs

  2. $P$ computes a partial sum of $g$, leaving the first variable $x_1$ free

  3. Verifier $V$ checks that the partial sum and total sum agree when the partial sum is evaluated at 0 and 1 and its outputs are added

  4. $V$ picks a random number $r_1$ for the free variable and sends it to $P$

  5. $P$ replaces the free variable $x_1$ with the random number $r_1$ and computes a partial sum leaving the next variable $x_2$ free

  6. $P$ and $V$ repeat steps similar to 3 through 5 for the rest of the variables: $x_2, \cdots, x_v$

  7. $V$ evaluates $g$ at one input using access to an “oracle”, i.e., $V$ must make a single trusted execution of $g(r_1, r_2, \cdots, r_v)$

If $V$ makes it through Step 7 without error, then $V$ accepts $P$'s proof. Otherwise, $V$ rejects.

We expand the steps of the algorithm in the following sections.

1. $P$ calculates the total sum of $g$ evaluated at all Boolean inputs

$H$ is the sum described in the first equation in this article. In Step 1, $P$ performs that summation, assigns the result to $g_0$, and returns it to $V$. Note that this sum is the #SAT solution, i.e., it gives the answer of how many unique inputs to the function result in 1 being output from $g$. The remainder of the sum-check protocol is designed to ensure that $P$ computed the #SAT solution correctly.

In our example, $g_0 = 3$ will be returned in Step 1. The following table shows $P$'s work, where they must iterate over all possible Boolean combinations of $x_1$, $x_2$, $x_3$, and $x_4$ and evaluate $g(x_1, x_2, x_3, x_4) = (1-x_1)x_2((x_3+x_4)-(x_3x_4))$:

\begin{aligned}g(0, 0, 0, 0) &= (1-0) \cdot 0 \cdot ((0 + 0)-(0 \cdot 0)) = 0\\ g(0, 0, 0, 1) &= (1-0) \cdot 0 \cdot ((0 + 1)-(0 \cdot 1)) = 0\\ g(0, 0, 1, 0) &= (1-0) \cdot 0 \cdot ((1 + 0)-(1 \cdot 0)) = 0\\ g(0, 0, 1, 1) &= (1-0) \cdot 0 \cdot ((1 + 1)-(1 \cdot 1)) = 0\\ g(0, 1, 0, 0) &= (1-0) \cdot 1 \cdot ((0 + 0)-(0 \cdot 0)) = 0\\ g(0, 1, 0, 1) &= (1-0) \cdot 1 \cdot ((0 + 1)-(0 \cdot 1)) = 1\\ g(0, 1, 1, 0) &= (1-0) \cdot 1 \cdot ((1 + 0)-(1 \cdot 0)) = 1\\ g(0, 1, 1, 1) &= (1-0) \cdot 1 \cdot ((1 + 1)-(1 \cdot 1)) = 1\\ g(1, 0, 0, 0) &= (1-1) \cdot 0 \cdot ((0 + 0)-(0 \cdot 0)) = 0\\ g(1, 0, 0, 1) &= (1-1) \cdot 0 \cdot ((0 + 1)-(0 \cdot 1)) = 0\\ g(1, 0, 1, 0) &= (1-1) \cdot 0 \cdot ((1 + 0)-(1 \cdot 0)) = 0\\ g(1, 0, 1, 1) &= (1-1) \cdot 0 \cdot ((1 + 1)-(1 \cdot 1)) = 0\\ g(1, 1, 0, 0) &= (1-1) \cdot 1 \cdot ((0 + 0)-(0 \cdot 0)) = 0\\ g(1, 1, 0, 1) &= (1-1) \cdot 1 \cdot ((0 + 1)-(0 \cdot 1)) = 0\\ g(1, 1, 1, 0) &= (1-1) \cdot 1 \cdot ((1 + 0)-(1 \cdot 0)) = 0\\ g(1, 1, 1, 1) &= (1-1) \cdot 1 \cdot ((1 + 1)-(1 \cdot 1)) = 0\end{aligned}

The outputs of the above evaluations are added to arrive at $g_0 = 3$.
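Step 1 is a brute-force enumeration, which is easy to mirror in Python (a sketch of our own, not the repo's code):

```python
from itertools import product

def g(x1, x2, x3, x4):
    return (1 - x1) * x2 * ((x3 + x4) - (x3 * x4))

# Step 1: P sums g over all 2^4 Boolean inputs
g0 = sum(g(*bits) for bits in product([0, 1], repeat=4))
assert g0 == 3  # the #SAT answer, matching the table above
```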

2. $P$ computes a partial sum of $g$, leaving the first variable $x_1$ free

In Step 2, $P$ computes a similar sum as in Step 1, except the first variable is left free/unassigned. I.e.,

\begin{equation*} g_1(X_1) = \sum_{x_2 \in \{0,1\}} \sum_{x_3 \in \{0,1\}} \cdots \sum_{x_v \in \{0,1\}} g(X_1, x_2, \cdots, x_v) \end{equation*}

Note that $g_1$ is a univariate polynomial in $X_1$; all other variables are assigned values in each iteration of the sum. We refer to this polynomial as the “partial sum” in the next step. Also observe that the only difference between the definition of $g_1$ and the definition of $H$ is that the $g_1$ summation is missing $\sum_{x_1 \in \{0,1\}}$.

For our example, $g(x_1, x_2, x_3, x_4) = (1-x_1)x_2((x_3+x_4)-(x_3x_4))$. To calculate $g_1(X_1)$, we leave $x_1$ free and then calculate the sum of $g$ evaluated at all possible Boolean combinations assigned to $x_2$, $x_3$, and $x_4$. The following table contains the intermediate calculations:

\begin{aligned}g(X_1, 0, 0, 0) &= (1-X_1) \cdot 0 \cdot ((0 + 0)-(0 \cdot 0)) = 0\\ g(X_1, 0, 0, 1) &= (1-X_1) \cdot 0 \cdot ((0 + 1)-(0 \cdot 1)) = 0\\ g(X_1, 0, 1, 0) &= (1-X_1) \cdot 0 \cdot ((1 + 0)-(1 \cdot 0)) = 0\\ g(X_1, 0, 1, 1) &= (1-X_1) \cdot 0 \cdot ((1 + 1)-(1 \cdot 1)) = 0\\ g(X_1, 1, 0, 0) &= (1-X_1) \cdot 1 \cdot ((0 + 0)-(0 \cdot 0)) = 0\\ g(X_1, 1, 0, 1) &= (1-X_1) \cdot 1 \cdot ((0 + 1)-(0 \cdot 1)) = (1-X_1)\\ g(X_1, 1, 1, 0) &= (1-X_1) \cdot 1 \cdot ((1 + 0)-(1 \cdot 0)) = (1-X_1)\\ g(X_1, 1, 1, 1) &= (1-X_1) \cdot 1 \cdot ((1 + 1)-(1 \cdot 1)) = (1-X_1)\end{aligned}

After summing all the rows in the table, we get $g_1(X_1) = -3X_1 + 3$.
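Because every variable appears in $g$ with degree at most 1, $g_1$ is linear in $X_1$, so two evaluations are enough to recover its coefficients. A small Python sketch of our own (not the repo's code):

```python
from itertools import product

def g(x1, x2, x3, x4):
    return (1 - x1) * x2 * ((x3 + x4) - (x3 * x4))

def g1(X1):
    # Step 2: X1 free, sum over Boolean assignments to x2, x3, x4
    return sum(g(X1, b2, b3, b4) for b2, b3, b4 in product([0, 1], repeat=3))

# g1 is linear in X1, so two points determine it: g1(X1) = slope*X1 + intercept
intercept = g1(0)
slope = g1(1) - g1(0)
assert (slope, intercept) == (-3, 3)  # i.e., g1(X1) = -3*X1 + 3
```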

3. $V$ checks that the partial sum and total sum agree when the partial sum is evaluated at 0 and 1 and its outputs are added

In Step 3, $V$ checks that $P$ has been consistent with what they committed to in Steps 1 and 2. Specifically, $V$ checks the following:

\begin{equation*} g_0 \stackrel{?}{=} g_1(0) + g_1(1) \end{equation*}

In our example, $P$ sent $g_0 = 3$ and $g_1(X_1) = -3X_1 + 3$ to $V$ during Steps 1 and 2, respectively. So $V$ must now check that $3 \stackrel{?}{=} (-3 \cdot 0 + 3) + (-3 \cdot 1 + 3)$, which holds. If this check failed, then $V$ would end the protocol, and $P$'s claim for $g_0$ (which is the supposed answer to the #SAT problem) would be rejected.

Note here what is happening: in Step 1, $P$ made a commitment to the #SAT answer. In Step 2, $P$ also did almost all of the work for $V$; $V$ only needed to evaluate the polynomial $g_1(X_1)$ at two points, $X_1=0$ and $X_1=1$. This is much less work than what $P$ must do.
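The round check really is just two evaluations and one addition for $V$, as this toy sketch of ours shows:

```python
g0 = 3                       # P's claimed total from Step 1
g1 = lambda X1: -3 * X1 + 3  # P's partial sum from Step 2

# Step 3: V checks the partial sum against the claimed total
assert g0 == g1(0) + g1(1)
```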

Observe that at this point, $V$ doesn't know whether the polynomial being evaluated by $P$ is actually the polynomial of interest, $g$. So far, $V$ only knows that $P$ has been consistent with whatever polynomial they have used. In fact, everything until Step 7 only verifies the consistency of $P$'s answers. Step 7 resolves this uncertainty and proves that $P$ has been both consistent and correct. This observation is one of the “slippery” aspects of understanding the sum-check protocol.

4. $V$ picks a random number $r_1$ for the free variable and sends it to $P$

Before beginning this step, it's time to introduce a security aspect of the protocol. When the protocol starts, $V$ tells $P$ what finite field $\mathbb{F}_p$ they will be working with. This means that all numbers will be restricted (by using the modulus operator) to be between 0 and $p-1$. For example, if $P$ is told that $p=16$, then $P$ will reduce everything mod 16, e.g., $74 \text{ mod } 16 = 10$ and $31x + 17 \text{ mod } 16 = 15x + 1$.
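In Python, the reduction is just the `%` operator, applied to plain numbers and to polynomial coefficients alike (a sketch reproducing the examples above):

```python
p = 16

# 74 mod 16 = 10
assert 74 % p == 10

# Reduce the coefficients of 31x + 17: [31, 17] -> [15, 1], i.e. 15x + 1
coeffs = [31, 17]
assert [c % p for c in coeffs] == [15, 1]
```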

Next, $V$ picks a random number $r_1$ for $P$ to assign to $X_1$. $r_1$ is selected from the finite field $\mathbb{F}_p$. For the running example in this article, we will pick $p = 16$. From here on, all example calculations will be performed mod 16.

For the remainder of the protocol, $P$ will always replace $X_1$ with $r_1$. Note that something interesting is now happening. Up until this point, $P$ only evaluated $g$ with Boolean inputs, but now the protocol is leveraging the fact that because $g$ is a polynomial, it can be evaluated using arbitrary integers instead of only 0s and 1s. This is where the security of the protocol is derived: if $p$ is very large, it will be difficult for $P$ to guess the random challenges before receiving them.

For our running example, we will assume that $V$ selects $r_1 = 2$.

5. $P$ replaces the free variable $x_1$ with the random number $r_1$ and computes a partial sum leaving the next variable $x_2$ free

$P$ calculates the sum again, except now they replace $X_1$ with the random number $r_1$:

\begin{equation*} g_2(X_2) = \sum_{x_3 \in \{0,1\}} \sum_{x_4 \in \{0,1\}} \cdots \sum_{x_v \in \{0,1\}} g(r_1, X_2, x_3, \cdots, x_v) \end{equation*}

For our example, working the partial sums as before gives $g_2(X_1, X_2) = -3X_1X_2 + 3X_2$. At the end of Step 4 above, we mentioned that $V$ selected $r_1 = 2$. So $P$ replaces $X_1$ with $r_1$, giving $g_2(X_2) = -3X_2$. $P$ returns $g_2(X_2) = -3X_2 \text{ mod } 16 = 13X_2$ to $V$.
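Computing $g_2$ can be sketched the same way as $g_1$, with $x_1$ pinned to the challenge $r_1 = 2$ and everything reduced mod 16 (names are ours, not the repo's):

```python
from itertools import product

p, r1 = 16, 2

def g(x1, x2, x3, x4):
    return (1 - x1) * x2 * ((x3 + x4) - (x3 * x4))

def g2(X2):
    # Step 5: x1 fixed to r1, X2 free, sum over x3 and x4
    return sum(g(r1, X2, b3, b4) for b3, b4 in product([0, 1], repeat=2)) % p

# g2(X2) = -3*X2 mod 16 = 13*X2
assert g2(0) == 0 and g2(1) == 13
```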

6. $P$ and $V$ repeat steps similar to 3 through 5 for the rest of the variables: $x_2, \cdots, x_v$

Steps 3 through 5 above constitute one “round” of the protocol. In total, if $v$ is the number of variables in $g$, then $v$ rounds are required to complete the protocol. Each round includes the same operations as given in Steps 3 through 5 above, except that a different variable is left free each round.

For example, at the end of Step 5, $P$ returns $g_2(X_2)$ to $V$. If we apply the pattern in Step 3, noting that $g_1(X_1) = -3X_1 + 3 \text{ mod } 16 = 13X_1 + 3$, we see that $V$ must now check that:

\begin{aligned}g_1(r_1) \text{ mod } 16 &\stackrel{?}{=} (g_2(0) + g_2(1)) \text{ mod } 16 \\ (13 \cdot 2 + 3) \text{ mod } 16 &\stackrel{?}{=} (13 \cdot 0 + 13 \cdot 1) \text{ mod } 16 \\ 13 \text{ mod } 16 &\stackrel{\checkmark}{=} 13 \text{ mod } 16\end{aligned}

So round 2 passes.
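The same check, sketched in Python with the mod-16 coefficients worked out above (our own illustration):

```python
p, r1 = 16, 2
g1 = lambda X1: (13 * X1 + 3) % p  # -3*X1 + 3 reduced mod 16
g2 = lambda X2: (13 * X2) % p      # -3*X2 reduced mod 16

# Round-2 consistency check: g1(r1) =? g2(0) + g2(1)  (mod p)
assert g1(r1) == (g2(0) + g2(1)) % p == 13
```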

In our example, $g$ has four variables, so we will cycle through Steps 3 through 5 four times, then we move to Step 7.

7. $V$ evaluates $g$ at one input using access to an “oracle”, i.e., $V$ must make a single trusted execution of $g(r_1, r_2, \cdots, r_v)$

After round $v$ has been successfully checked, $g_v(X_v)$ has incorporated $r_1, r_2, \cdots, r_{v-1}$. The last step is for $V$ to select $r_v$ and use an oracle to evaluate $g(r_1, r_2, \dots, r_v)$. The oracle is a trusted source that is guaranteed to evaluate $g$ correctly at a single input. Alternatively, if no oracle is available, then $V$ must compute $g$ once by itself. Without an oracle, $V$ would be required to store and evaluate $g$, which could be costly.

After $g(r_1, r_2, \dots, r_v)$ is provided by the oracle, or computed by $V$ itself, $V$ checks:

\begin{equation*} g(r_1, r_2, \dots, r_v) \stackrel{?}{=} g_v(r_v) \end{equation*}

If this check passes, then $V$ accepts $P$'s claim that $g_0$ is equal to $H$, or, in our example, that $g_0$ is the #SAT of $g$.
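Putting Steps 1 through 7 together, the whole interaction can be sketched end to end. This is our own illustrative toy, not the code from the linked repo, and it relies on the fact that $g$ has degree at most 1 in each variable, so each round polynomial is determined by its values at 0 and 1:

```python
from itertools import product
import random

p = 16  # toy modulus from the running example
v = 4   # number of variables in g

def g(x1, x2, x3, x4):
    return (1 - x1) * x2 * ((x3 + x4) - (x3 * x4))

def partial_sum(challenges, X):
    """g with earlier challenges fixed, X in the next slot, and all
    remaining variables summed over {0, 1}."""
    free = v - len(challenges) - 1
    return sum(g(*challenges, X, *rest)
               for rest in product([0, 1], repeat=free)) % p

# Step 1: P's claimed total (also the #SAT answer)
g0 = sum(g(*bits) for bits in product([0, 1], repeat=v)) % p

random.seed(0)  # deterministic challenges, for the sketch only
claim, challenges = g0, []
for _ in range(v):
    # P sends the round polynomial g_j as (slope, intercept)
    b = partial_sum(challenges, 0)            # g_j(0)
    a = (partial_sum(challenges, 1) - b) % p  # slope (degree <= 1 per variable)
    # Step 3: V checks g_j(0) + g_j(1) against the running claim
    assert claim == (b + (a + b) % p) % p, "round check failed"
    # Steps 4 and 5: V's random challenge; the claim becomes g_j(r)
    r = random.randrange(p)
    claim = (a * r + b) % p
    challenges.append(r)

# Step 7: one trusted ("oracle") evaluation of g at the random point
assert g(*challenges) % p == claim
assert g0 == 3
```

Note that $V$'s work in the loop is constant per round; only `partial_sum` (run by $P$) and the final oracle evaluation touch $g$ itself.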

Conclusions and further reading

Python code implementing sum-check can be found in this repo. We omitted security checks in this tutorial for brevity, but those checks are included in the linked repo. Also, this article did not explain why sum-check is secure, nor did it cover the practical costs of the algorithm. Briefly, the security of sum-check can be analyzed using the Schwartz–Zippel lemma; it is conceptually similar to the random sampling used for quality control. Regarding the costs: to solve #SAT without the sum-check protocol, $V$ would have to evaluate $g$ at $2^v$ inputs, but with sum-check, the cost for $V$ drops to $v$ rounds of the protocol plus a single evaluation of $g$. If you want more depth, see Justin's book, which is linked above, and these resources:

  1. Sum-check article by Justin Thaler: The Unreasonable Power of the Sum-Check Protocol

  2. Sum-check notes by Edge & Node cryptographer, Gabriel Soule:

If you have made it this far, you now hopefully have a feel for how interactive arguments of knowledge work. This is the first time we have mentioned “interactive”: note that $P$ and $V$ had to communicate in each round. This is not ideal in scenarios where computers can go offline, i.e., the real world. The Fiat–Shamir transform makes sum-check non-interactive. As you may know, SNARKs also eliminate interaction, as implied by the “N” in the name, which stands for “Non-interactive”. SNARKs also use Fiat–Shamir. Perhaps we will make a part 2 of this article where we implement a non-interactive version of sum-check.