1 Introduction

The notion of zero-knowledge interactive proof was introduced in the seminal work of Goldwasser, Micali, and Rackoff [18]. Since their introduction, zero-knowledge proofs have played a central role in the development of the foundations of cryptography. Informally, a zero-knowledge proof is a protocol between two parties, a prover and a verifier, in which the prover wants to establish possession of certain knowledge without revealing the knowledge itself. Goldwasser, Micali, and Rackoff formalized this intuitive notion using the language of computational complexity theory, as follows. A language or a promise problem L admits an interactive proof if there is a computationally unbounded prover P and a randomized polynomial-time verifier V such that for a positive instance x, V, after interacting with P, accepts x with high probability. On the other hand, for a negative instance x and any prover \(P^*\), the verifier V, after interacting with \(P^*\), accepts x with low probability. The protocol is a statistical zero-knowledge protocol if, for positive instances, the interaction between P and V can be simulated by a randomized polynomial-time simulator S so that the output distribution of the simulator is statistically close to the distribution of the interaction. The intuition is that the interaction itself can be simulated efficiently (in randomized polynomial time) and hence the verifier gains no knowledge beyond what she can simulate by herself. The class of problems that admit statistical zero-knowledge interactive proofs is denoted by \(\mathrm{ SZK}\). An important restriction arises when the output distribution of the simulator is identical to the distribution of the interaction. Such a protocol is called a perfect zero-knowledge protocol, and the corresponding class of languages is denoted by \(\mathrm{ PZK}\) [17]. It is also possible to envision a non-interactive setting where the only communication in the protocol is from the prover to the verifier. 
Indeed, Blum, Feldman and Micali [6] and De Santis et al. [7] investigated such non-interactive zero-knowledge proofs and introduced the class \(\mathrm{ NISZK}\). The corresponding perfect zero-knowledge class \(\mathrm{ NIPZK}\) was first investigated by Malka [21].

Zero-knowledge proofs and the corresponding classes have played a key role in bridging computational complexity theory and cryptography. Several computationally hard problems that are not known to be \(\mathrm{ NP}\)-complete, including Graph Isomorphism, Quadratic Residuosity, and certain lattice problems, admit zero-knowledge proofs (some non-interactive and some perfect zero-knowledge) [12,13,14, 23]. Several cryptosystems are also based on the computational hardness of some of these problems. While these problems are believed to be computationally hard in the sense that no efficient algorithms for them are known, it is interesting that they are unlikely to be \(\mathrm{ NP}\)-complete. Establishing the relationships among zero-knowledge classes, and proving upper bounds on them in terms of traditional complexity classes (such as \(\mathrm{ PP}\) and subclasses of the Polynomial Hierarchy), continues to be a research focus in complexity theory and cryptography [3, 11, 22, 24]. While there are a few unconditional upper bound results, most results establish the hardness of proving upper bounds in the form of oracle results. We briefly discuss them below.

Unconditional Upper Bounds: Two main early upper bound results are that \(\mathrm{ SZK}\) is closed under complement [22, 24] and that \(\mathrm{ SZK}\) is upper bounded by \(\mathrm{ AM}\cap \mathrm{ coAM}\) [3, 11, 22, 24]. The latter result implies that \(\mathrm{ NP}\)-complete problems cannot have statistical zero-knowledge proofs unless the Polynomial Hierarchy collapses, contradicting the widely held belief that it is infinite [9]. The relationship between zero-knowledge classes and traditional probabilistic complexity classes has also been explored recently. In particular, Bouland et al. show that all problems with perfect zero-knowledge proofs admit unbounded-error probabilistic polynomial-time algorithms (that is, \(\mathrm{ PZK}\subseteq \mathrm{ PP}\)) [10].

Relativized Separations: Several significant upper bound questions, including whether various zero-knowledge classes are closed under complement and whether statistical zero-knowledge classes equal the corresponding perfect zero-knowledge classes, turned out to be difficult to resolve and led to oracle separation results. Lovett and Zhang showed that the class \(\mathrm{ NISZK}\) is not closed under complement in a relativized world [20]. Bouland et al., in the same paper where they show \(\mathrm{ PZK}\subseteq \mathrm{ PP}\), established a comprehensive set of oracle separation results. In particular, they showed that there are relativized worlds where \(\mathrm{ NISZK}\) (and thus \(\mathrm{ SZK}\)) is not in \(\mathrm{ PP}\), \(\mathrm{ PZK}\) does not equal \(\mathrm{ SZK}\), and \(\mathrm{ NIPZK}\) and \(\mathrm{ PZK}\) are not closed under complement.

Our Contributions

One of our main contributions is a new unconditional upper bound on the complexity class \(\mathrm{ NIPZK}\).

Theorem 1

\(\mathrm{ NIPZK}\subseteq \mathrm{ CoSBP}\).

The class \(\mathrm{ SBP}\) (Small Bounded-error Probability) was introduced by Böhler, Glaßer, and Meister [8] and is a bounded-error version of \(\mathrm{ PP}\). Informally, a language is in \(\mathrm{ PP}\) if there is a probabilistic polynomial-time machine for which the ratio between the acceptance and the rejection probabilities is more than 1 for all the positive instances. We obtain the class \(\mathrm{ SBP}\) when we stipulate that this ratio is bounded away from 1, i.e., at least \(1+ \epsilon \) for a fixed constant \(\epsilon > 0\). This restriction greatly reduces the power of the class. In particular, it is known that \(\mathrm{ SBP}\) is a subset of \(\mathrm{ AM}\) and \(\mathrm{ PP}\), and contains \(\mathrm{ MA}\). This class has been studied in other contexts as well, such as in circuit complexity and quantum computation [1, 2, 5, 19]. Even though the relationship between \(\mathrm{ SBP}\) and zero-knowledge classes has not been studied earlier, a curious connection exists between them. Watson showed that a certain promise problem regarding the min-entropy of samplable distributions is a complete problem for \(\mathrm{ SBP}\) [25]. Interestingly, the analogous problem where entropy instead of min-entropy is considered was shown to be complete for the class \(\mathrm{ NISZK}\) [16]. Our upper bound result improves the known containments \(\mathrm{ NIPZK}\subseteq \mathrm{ AM}\cap \mathrm{ coAM}\) to \(\mathrm{ NIPZK}\subseteq \mathrm{ AM}\cap \mathrm{ CoSBP}\) and \(\mathrm{ NIPZK}\subseteq \mathrm{ PP}\) to \(\mathrm{ NIPZK}\subseteq \mathrm{ CoSBP}\).

We consider the possibility of establishing other upper bounds for perfect zero-knowledge classes. Since \(\mathrm{ NIPZK}\) is not known to be closed under complement, is it possible to show that \(\mathrm{ NIPZK}\subseteq \mathrm{ SBP}\cap \mathrm{ CoSBP}\)? We also consider whether we can show that \(\mathrm{ PZK}\) itself lies in \(\mathrm{ CoSBP}\). For these two questions, we prove the following relativized lower bound results.

Theorem 2

There is an oracle O such that \(\mathrm{ NIPZK}^O\) (and thus \(\mathrm{ PZK}^O\)) is not in \(\mathrm{ SBP}^O\).

This result along with Theorem 1 implies that \(\mathrm{ NIPZK}\) is not closed under complement in a relativized world, a result that was recently established by Bouland et al. [10].

Theorem 3

There is an oracle O such that \(\mathrm{ PZK}^O\) is not in \(\mathrm{ CoSBP}^O\).

As Theorem 1 relativizes with respect to any oracle, Theorems 1 and 3 together imply that there is an oracle that separates \(\mathrm{ PZK}\) from \(\mathrm{ NIPZK}\).

Corollary 1

There is an oracle O such that \(\mathrm{ NIPZK}^O \subsetneq \mathrm{ PZK}^O\).

Figure 1 summarizes the known relationships among perfect zero-knowledge classes and other complexity classes, along with the results established in this work.

Fig. 1.

\(A \rightarrow B\) indicates that A is a subset of B, and \(A \dashrightarrow B\) indicates that there is a relativized world where A is not a subset of B; colored arrows indicate new results. (Color figure online)

Complexity of Distribution Testing Problems. We establish our results by investigating certain distribution testing problems: computational problems over high-dimensional distributions represented by succinct Boolean circuits. Interestingly, it turns out that versions of distribution testing problems characterize various zero-knowledge classes. The distribution testing problems are best formalized as promise problems. A promise problem is a pair of sets \(\varPi = (\varPi _{Yes}, \varPi _{No})\) such that \(\varPi _{Yes} \cap \varPi _{No} = \emptyset \). \(\varPi _{Yes} \) is called the set of ‘yes’ instances, and \(\varPi _{No}\) is called the set of ‘no’ instances. Given a Boolean circuit C mapping from m bits to n bits, the distribution sampled by C is obtained by uniformly choosing \(x \in \{0,1\}^m\) and evaluating C on x. We often use C itself to denote the distribution sampled by the circuit C.
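For concreteness, a sampling circuit can be modeled as a function on m-bit inputs and its distribution tabulated by brute force; a minimal Python sketch (the toy circuit below is our own illustration, not from the paper):

```python
import itertools

def distribution_of(circuit, m):
    """The distribution sampled by `circuit`: evaluate it on every
    m-bit input and weight each output by 1/2^m."""
    dist = {}
    for x in itertools.product((0, 1), repeat=m):
        y = tuple(circuit(x))
        dist[y] = dist.get(y, 0.0) + 1.0 / 2 ** m
    return dist

# Toy "circuit" mapping 3 bits to 2 bits.
toy = lambda x: (x[0] & x[1], x[1] ^ x[2])
D = distribution_of(toy, 3)
```

Brute-force tabulation is of course exponential in m; it is used here only to make the objects concrete.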

Statistical Difference (SD): Given two distributions sampled by Boolean circuits C and D, \(\varPi _{Yes} = \{\langle C, D\rangle ~|~{ dist}(C, D) \le {1/n}\}\) and \(\varPi _{No} = \{\langle C, D\rangle ~|~{ dist}(C, D) \ge 1-1/n\}\).

Here \({ dist}\) denotes the statistical distance between the distributions. When one of the distributions is the uniform distribution, the above problem is called Statistical Difference to Uniform (SDU). The seminal work of Sahai and Vadhan showed that SD is complete for the class \(\mathrm{ SZK}\) [24] and Goldreich, Sahai and Vadhan showed that SDU is complete for \(\mathrm{ NISZK}\) [16].

Entropy Approximation (EA): Given a samplable distribution C and an integer k, \(\varPi _{Yes} = \{\langle C, k\rangle ~|~\mathcal{H}(C) \ge k+1 \}\) and \(\varPi _{No} = \{\langle C, k\rangle ~|~\mathcal{H}(C) \le k-1\}\), where \(\mathcal{H}\) is the entropy function.

Goldreich, Sahai and Vadhan showed that Entropy Approximation is complete for \(\mathrm{ NISZK}\). In the above problem, if the entropy function \(\mathcal{H}\) is replaced with the min-entropy function \(\mathcal{H}_{\infty }\), the corresponding problem is known as Min-entropy Approximation (MEA). Watson showed that MEA is complete for \(\mathrm{ SBP}\) [25]. It is interesting to note that while Entropy Approximation is \(\mathrm{ NISZK}\)-complete, the analogous Min-entropy Approximation problem is complete for \(\mathrm{ SBP}\).
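The gap between the two problems mirrors the gap between the two entropy measures; both measures are easy to compute for an explicitly given distribution. A small sketch (the example distribution is our own):

```python
import math

def shannon_entropy(dist):
    """H(D) = sum_x D(x) * log2(1/D(x))."""
    return sum(p * math.log2(1.0 / p) for p in dist.values() if p > 0)

def min_entropy(dist):
    """H_inf(D) = log2(1/max_x D(x)); always at most H(D)."""
    return math.log2(1.0 / max(dist.values()))

# One heavy atom keeps min-entropy low while Shannon entropy stays higher.
D = {'00': 0.5, '01': 1 / 6, '10': 1 / 6, '11': 1 / 6}
```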

To establish our results, we study variants of the above distribution testing problems. The following problem, known as Uniform, was defined by Malka and shown to be complete for \(\mathrm{ NIPZK}\) [21].

Uniform: Given a circuit \(D:\{0,1\}^{m}\rightarrow \{0,1\}^{n+1}\), let \(D[1\dots n]\) denote the distribution of the first n bits of D and let \(D[n+1]\) denote the distribution of the last bit of D. \(\varPi _{Yes}= \{\langle D \rangle \mid D[1\dots n]=U_n, \Pr [D[n+1]= 1] \ge 2/3\}\) and \(\varPi _{No}= \{\langle D \rangle \mid |{sup}(D) \cap \{0,1\}^n1| \le 2^n / 3\}\).

Here \(U_n\) denotes the uniform distribution over n-bit strings and \({sup}(D)\) is the support of the distribution D. We obtain Theorem 1 by showing that Uniform is in \(\mathrm{ CoSBP}\).
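For intuition, the kind of public-coin protocol behind this completeness result can be sketched as follows (a reconstruction for illustration only; the formal protocol appears in [21]): the shared random string r plays the role of a uniform sample from \(D[1\dots n]\), and the prover exhibits a preimage whose last output bit is 1. A toy simulation over all shared random strings:

```python
from itertools import product

def verifier_accepts(outputs, r):
    """Honest-prover strategy: exhibit some output of D whose first n
    bits equal the shared random string r and whose last bit is 1;
    the verifier accepts iff such an output exists."""
    return any(y[:-1] == r and y[-1] == 1 for y in outputs)

n = 4
# Yes instance: first n bits uniform, last bit always 1.
yes_outputs = [x + (1,) for x in product((0, 1), repeat=n)]
# No instance: strings ending in 1 cover only 1/4 of the prefixes.
no_outputs = [x + (1 if x[0] == x[1] == 0 else 0,)
              for x in product((0, 1), repeat=n)]

def accept_rate(outputs):
    rs = list(product((0, 1), repeat=n))
    return sum(verifier_accepts(outputs, r) for r in rs) / len(rs)
```

On the yes instance every shared string has a valid preimage, while on the no instance at most a 1/3 fraction do, matching the completeness and soundness thresholds.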

Note that we can obtain relativized versions of the distribution testing problems by providing oracle access to the circuits involved. To obtain Theorem 2, we consider a promise problem that is a variant of Uniform.

Uniform-Or-Small: Given a distribution D over n-bit strings, \(\varPi _{Yes}=\{\langle D \rangle \mid D = U_n\}\) and \(\varPi _{No}=\{\langle D \rangle \mid |{sup}(D)| \le 2^{n/2}\}\).

We show that a relativized version of this problem is not in \(\mathrm{ SBP}\). For Theorem 3, we consider a variant of SD called Disjoint-Or-Identical.

Disjoint-Or-Identical: Given two samplable distributions C and D, \(\varPi _{Yes}=\{\langle C, D\rangle ~\mid ~{sup}(C)\cap {sup}(D)=\emptyset \}\) and \(\varPi _{No}=\{\langle C, D\rangle \mid C = D\}\) (i.e., the statistical distance between C and D is either 1 or 0).

This problem can be shown to be in \(\mathrm{ CoPZK}\). We construct an oracle relative to which this problem is not in \(\mathrm{ SBP}\). Theorems 2 and 3 show that there exist relativized worlds where \(\mathrm{ PZK}\) is neither in \(\mathrm{ SBP}\) nor in \(\mathrm{ CoSBP}\). This suggests that we cannot hope to improve the containment \(\mathrm{ PZK}\subseteq \mathrm{ PP}\) to either \(\mathrm{ SBP}\) or \(\mathrm{ CoSBP}\) using relativizable techniques.

2 Notation and Definitions

Distributions. All the distributions considered in this paper are over a sample space of the form \(\{0,1\}^n\) for some integer n. Given a distribution D, we use D(x) to denote the probability of x with respect to D. We use \(U_n\) to denote the uniform distribution over \(\{0, 1\}^n\). We consider distributions sampled by circuits. Given a circuit C mapping m-bit strings to n-bit strings, the distribution encoded/sampled by the circuit C is the distribution \(C(U_m)\). We often use C to denote both the circuit and the distribution sampled by it. Note that given access to the circuit, we can efficiently generate a sample of the distribution by evaluating C on a uniformly chosen m-bit string. For this reason, we call such distributions efficiently samplable distributions or just samplable distributions. We use \({sup}(D)\) to denote the set of strings x for which \(D(x) \ne 0\).

Given two distributions C and D over the same sample space S, the statistical distance between them, denoted by \({ dist}(C, D)\), is defined as follows.

$$\begin{aligned} { dist}(C, D) = \max _{T\subseteq S} (C(T) - D(T)) = \sum _{C(x) > D(x)} (C(x) - D(x)) \end{aligned}$$
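The two formulations of the distance can be checked against each other by brute force on small supports (a sketch; distributions here are dictionaries from strings to probabilities):

```python
from itertools import combinations

def stat_dist(C, D):
    """dist(C, D) as the sum of the positive differences C(x) - D(x)."""
    keys = set(C) | set(D)
    return sum(max(C.get(x, 0.0) - D.get(x, 0.0), 0.0) for x in keys)

def stat_dist_max(C, D):
    """dist(C, D) as the maximum of C(T) - D(T) over subsets T."""
    keys = sorted(set(C) | set(D))
    best = 0.0
    for r in range(len(keys) + 1):
        for T in combinations(keys, r):
            best = max(best, sum(C.get(x, 0.0) - D.get(x, 0.0) for x in T))
    return best

C = {'00': 0.5, '01': 0.5}
U2 = {'00': 0.25, '01': 0.25, '10': 0.25, '11': 0.25}
```

The maximum is attained at the set T of points where C outweighs D, which is why the two formulations agree.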

Complexity Classes. We refer the reader to the textbook by Arora and Barak [4] for definitions of standard complexity classes. For a complexity class \(\mathcal{C}\), \(\mathrm{Co}\mathcal{C}\) denotes the class of complement languages/promise problems from \(\mathcal{C}\). The class \(\mathrm{ SBP}\) was introduced in [8] and is defined as follows.

Definition 1

A promise problem \((\varPi _{Yes}, \varPi _{No})\) is said to belong to the complexity class \(\mathrm{ SBP}\) if there exists a constant \(\epsilon > 0\), a polynomial \(p(\cdot )\), and a probabilistic polynomial-time Turing Machine M such that

  1. If \(x \in \varPi _{Yes}\) then \(\mathrm {Pr}[M\text { accepts}] \ge \frac{1+\epsilon }{2^{p(|x|)}}\);

  2. If \(x \in \varPi _{No}\) then \(\mathrm {Pr}[M\text { accepts}] \le \frac{1}{2^{p(|x|)}}\).

\(\mathrm{ SBP}\) is sandwiched between \(\mathrm{ MA}\) and \(\mathrm{ AM}\) and is the largest known subclass of \(\mathrm{ AM}\) that is in \(\mathrm{ PP}\). In fact, it is known that \(\mathrm{ SBP}\) is contained in the class \(\mathrm{ BPP}_\mathrm{ path}\) which is a subclass of \(\mathrm{ PP}\).

Theorem 4

([8]). \(\mathrm{ MA}\subseteq \mathrm{ SBP}\subseteq \mathrm{ AM}\) and \(\mathrm{ SBP}\subseteq \mathrm{ BPP}_\mathrm{ path}\subseteq \mathrm{ PP}\).

Although we will not be using explicit definitions of zero-knowledge classes, we give necessary definitions for completeness.

Definition 2

(Non-Interactive protocol). A non-interactive protocol is a pair of functions \(\langle P,V\rangle \), the prover and verifier. On input x and random strings \(r_I, r_P\), P sends a message \(\pi =P(x,r_P,r_I)\) to V, and V computes \(m=V(x,\pi ,r_I)\). V accepts x if \(m=1\), and rejects if \(m=0\). The transcript of the interaction is the tuple \(\langle x,r_I,\pi ,m\rangle \).

Note that the above definition implies that the random string \(r_I\) is shared between the prover and the verifier.

Definition 3

(NIPZK [16, 21]). A promise problem \(\langle \varPi _{Yes}, \varPi _{No}\rangle \) is in \(\mathrm{ NIPZK}\) (Non-Interactive Perfect Zero Knowledge) if there is a non-interactive protocol \(\langle P,V\rangle \) where V runs in polynomial time, and a randomized, polynomial-time computable simulator S, satisfying the following conditions:

  • (Soundness:) For any function \(P^*\) and any \(x\in \varPi _{No},~\Pr [V \text{ accepts } ] \le 1/3\)

  • (Completeness:) If \(x\in \varPi _{Yes}, \Pr [V \text{ accepts}] \ge 2/3\)

  • (Zero Knowledge:) For any \(x\in \varPi _{Yes}\), the distribution of S(x) is identical to the distribution of the transcript generated by \(\langle P,V\rangle \) on input x.

The class \(\mathrm{ NISZK}\) (Non-Interactive Statistical Zero Knowledge) is defined similarly [16], except that we only require that the statistical distance between the distribution of S(x) and the distribution of the transcript generated by \(\langle P,V\rangle (x)\) be less than 1/p(n) for every polynomial p(n). Malka [21] showed that the promise problem Uniform is complete for the class \(\mathrm{ NIPZK}\).

Theorem 5

([21]). The promise problem Uniform is complete for \(\mathrm{ NIPZK}\).

3 \(\mathrm{ NIPZK}\subseteq \mathrm{ CoSBP}\)

For a given distribution D, let \(\mathrm{CP}(D)\) denote its collision probability \(\mathrm{Pr}_{x,y \sim D}(x=y)\), where x and y are drawn independently from D. The following lemma is folklore; see [15] for a proof.

Lemma 1

For a given distribution D over \(\{0,1\}^n\), if \({ dist}(D, U_n) \ge \epsilon \), then \(\mathrm{CP}(D) \ge \frac{1+\epsilon ^2}{2^n}\).
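As a quick sanity check, both quantities in the lemma are directly computable for an explicit distribution (our own toy example):

```python
def stat_dist(C, D):
    """Statistical distance as the sum of positive differences."""
    keys = set(C) | set(D)
    return sum(max(C.get(x, 0.0) - D.get(x, 0.0), 0.0) for x in keys)

def collision_prob(D):
    """CP(D) = Pr_{x,y ~ D}[x = y] = sum_x D(x)^2."""
    return sum(p * p for p in D.values())

n = 2
U = {format(v, '02b'): 0.25 for v in range(4)}
D = {'00': 0.5, '01': 0.5}      # supported on half of {0,1}^2
eps = stat_dist(D, U)           # distance from uniform: 1/2
```

Here \(\mathrm{CP}(D) = 1/2 \ge (1+\epsilon ^2)/2^n = 5/16\), as the lemma guarantees.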

Theorem 1. \(\mathrm{ NIPZK}\subseteq \mathrm{ CoSBP}\)

We show the result by proving that the \(\mathrm{ NIPZK}\)-complete problem Uniform is in \(\mathrm{ CoSBP}\). We start with the following lemma.

Lemma 2

Let D be a distribution on \(n+1\) bits, and let \(T = \{ x\in \{0,1\}^n \mid x1 \in {sup}(D)\}\). Suppose that \(|T| \le 2^n / 3\) and \(\mathrm{Pr}(D[n+1]=1) =\frac{1}{3}+\epsilon \) for some \(\epsilon \ge 0\). Then \({ dist}(D[1\dots n], U_n)\) is at least \(\epsilon \).

Proof

Recall that \({ dist}(D[1\dots n],U_n) = \max _{S\subseteq \{0,1\}^n}\left| \Pr _{d\sim D[1\dots n]}[d\in S] -\Pr _{u\sim U_n}[u\in S]\right| \). Take \(S = T\). Whenever the last bit of a sample of D is 1, its first n bits lie in T, so \(\Pr _{d\sim D[1\dots n]}[d\in T] \ge \Pr [D[n+1]=1] = \frac{1}{3}+\epsilon \). On the other hand, \(\Pr _{u\sim U_n}[u\in T] = |T|/2^n \le \frac{1}{3}\). Hence \({ dist}(D[1\dots n],U_n) \ge \epsilon \).

Now we prove Theorem 1 by giving a \(\mathrm{ CoSBP}\) algorithm for Uniform.

Proof

Recall the definition of Uniform: Given a circuit \(D:\{0,1\}^{m}\rightarrow \{0,1\}^{n+1}\), \(\varPi _{Yes}= \{D : D[1\dots n]=U_n, \Pr [D[n+1]= 1] \ge 2/3\}\) and \(\varPi _{No}= \{D: |{sup}(D) \cap \{0,1\}^n1| \le 2^n / 3\}\).

Consider the following randomized algorithm: Given D as input, get two samples \(d_0\) and \(d_1\) from D. If the first n bits of both \(d_0\) and \(d_1\) are the same, then accept. Else, obtain k additional samples from D, and if the last bit of all these samples is 0, then accept, otherwise reject.

If D is a ‘yes’ instance of Uniform, then the probability of accepting at the first step is \(\frac{1}{2^n}\), and the probability of accepting at the second step is at most \(\frac{1}{3^k}\), so the overall acceptance probability is at most \(\frac{1}{2^n}+\frac{1}{3^k}\). Suppose that D is a ‘no’ instance of Uniform. By Lemma 2, either \(D[1\dots n]\) is at least \(\frac{1}{6}\) away from \(U_n\), or \(D[n+1]\) is 1 with probability at most \(\frac{1}{2}\). Suppose first that \(D[1\dots n]\) is at least 1/6 away from the uniform distribution. Then by Lemma 1, the probability that the first n bits of \(d_0\) and \(d_1\) are the same is at least \(\frac{37}{36}\frac{1}{2^n}\), so the algorithm accepts with probability at least \(\frac{37}{36}\frac{1}{2^n}\). Now suppose that \(D[1\dots n]\) is less than 1/6 away from the uniform distribution. Then the last bit of D is 1 with probability at most 1/2, so conditioned on reaching the second step, the algorithm accepts with probability at least \(\frac{1}{2^k}\); since it accepts outright on a prefix collision, its overall acceptance probability is also at least \(\frac{1}{2^k}\). Thus, a ‘no’ instance is accepted with probability at least \(\min \left\{ \frac{37}{36}\frac{1}{2^n}, \frac{1}{2^k}\right\} \). Choose \(k=n-\log (37/36)\), so that a ‘no’ instance is accepted with probability at least \(\frac{37}{36}\frac{1}{2^n}\) and a ‘yes’ instance is accepted with probability at most \(\frac{1}{2^n} + \frac{3^{\log (37/36)}}{3^n}\). For large enough n, \(\frac{37}{36}\frac{1}{2^n} \ge (1+\frac{1}{40}) (\frac{1}{2^n} + \frac{3^{\log (37/36)}}{3^n})\), so this is a \(\mathrm{ CoSBP}\) algorithm for Uniform.
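Since the tester draws independent samples, its acceptance probability has a closed form: Pr[prefix collision] + (1 − Pr[prefix collision]) · Pr[last bit 0]^k. The following sketch evaluates it exactly on small illustrative instances (our own examples, not from the paper):

```python
def accept_prob(D, k):
    """Exact acceptance probability of the tester on a distribution D,
    given as a dict from (n+1)-bit strings to probabilities: accept on a
    first-n-bit collision of two samples; otherwise accept iff k further
    samples all end in 0."""
    prefix = {}
    for y, p in D.items():
        prefix[y[:-1]] = prefix.get(y[:-1], 0.0) + p
    p_coll = sum(q * q for q in prefix.values())   # collision prob of D[1..n]
    p_zero = sum(p for y, p in D.items() if y[-1] == '0')
    return p_coll + (1 - p_coll) * p_zero ** k

n = 4
# Yes instance: uniform prefix, last bit always 1.
D_yes = {format(v, '04b') + '1': 1 / 16 for v in range(16)}
# No instance: only 4 of the 16 prefixes appear (all ending in 1).
D_no = {format(v, '04b') + '1': 1 / 4 for v in range(4)}
```

On D_yes the tester accepts with probability 1/16; on D_no the prefix support is small, collisions are frequent, and it accepts with probability 1/4, comfortably above the \((1+\frac{1}{40})\) threshold.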

4 Oracle Separations

In this section, we prove Theorems 2 and 3. We first establish a general method for constructing relativized worlds in which promise problems involving circuits are not in \(\mathrm{ SBP}\).

Lemma 3

Let \(\varPi =\langle \varPi _Y,\varPi _N\rangle \) be a promise problem whose instances are circuits. If there is an oracle circuit family \(\{C_n\}_{n\ge 0}\) and a constant \(c > 1\) with the following properties:

  • \(C_n\) is an oracle circuit that maps n bits to n bits and makes oracle queries only to strings of length cn.

  • There exist families of sets \(\{A_n\}_{n\ge 0},\{B_n\}_{n \ge 0} \subseteq \{0,1\}^{cn}\) such that for all n, \(C_n^{A_n}\in \varPi _Y\) and \(C_n^{B_n}\in \varPi _N\).

  • For every probabilistic polynomial-time Turing Machine M and infinitely many n, for every \(D_i \in \{A_i, B_i, \emptyset \}\), \(1 \le i < n\)

    $$\begin{aligned} \frac{\Pr [M^{(\cup _{i=1}^{n-1}D_i )\cup A_n}(C_n^{(\cup _{i=1}^{n-1}D_i )\cup A_n}) accepts]}{\Pr [M^{(\cup _{i=1}^{n-1} D_i) \cup B_n}(C_n^{(\cup _{i=1}^{n-1}D_i )\cup B_n}) accepts]} < 2, \end{aligned}$$

then there exists an oracle O such that \(\varPi ^O\not \in \mathrm{ SBP}^O\).

Proof

We first note that in the definition of \(\mathrm{ SBP}\) (Definition 1), we can take \(\epsilon \) to be 1 by using standard amplification techniques. Thus a promise problem is in \(\mathrm{ SBP}\) if there exists a polynomial \(p(\cdot )\) and a probabilistic polynomial-time machine M such that on positive instances M accepts with probability at least \(2/2^{p(n)}\) and on negative instances M accepts with probability at most \(1/2^{p(n)}\). We call \(p(\cdot )\) the threshold polynomial for M.
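The amplification step can be made explicit: run M independently t times and accept only if every run accepts; the yes/no ratio grows from \(1+\epsilon \) to \((1+\epsilon )^t\) while both thresholds scale as \(2^{-t\cdot p(n)}\). A numeric sketch with illustrative parameters:

```python
import math

def amplify(p_yes, p_no, t):
    """Acceptance probabilities after t independent runs,
    accepting iff all t runs accept."""
    return p_yes ** t, p_no ** t

eps = 0.1                                  # original SBP gap
beta = 2.0 ** -10                          # original threshold 1/2^{p(n)}
t = math.ceil(1.0 / math.log2(1 + eps))    # enough runs to double the ratio
ay, an = amplify((1 + eps) * beta, beta, t)
```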

Let \(\{M_{i}\}_{i > 0}\) be an enumeration of the probabilistic polynomial-time machines. We consider an enumeration of tuples \(\langle M_i, j\rangle _{i>0, j >0}\), where the tuple \(\langle M_i, j\rangle \) corresponds to the possibility that \(M_i\) is a \(\mathrm{ SBP}\) machine with threshold polynomial \(n^j\). We start with the empty oracle. Let \(O_n=O\cap \{0,1\}^{cn}\); for each n, \(O_n\) will be one of \(\emptyset \), \(A_n\), or \(B_n\). Consider the tuple \(\langle M_i, j\rangle \), and let n be a length for which \(O_n\) is not yet defined and for which the inequality from the lemma holds for the machine \(M_i\). Suppose that \(M_i\) makes queries of length at most m. At this point \(O_{n'}\) has been defined for all \(n' < n\), so \(O \subseteq \{0,1\}^{<cn}\) and for every \(n' < n\), \(O_{n'}\) is either \(\emptyset \), \(A_{n'}\), or \(B_{n'}\). We set the oracle to be \(\emptyset \) at all lengths from \(cn+1\) to m. Suppose first that the acceptance probability of \(M_i^{O\cup A_n}(C_n^{A_n})\) is less than \(2/2^{n^j}\). Then \(C_n^{A_n}\) is a positive instance witnessing that \(M_i\) cannot be a \(\mathrm{ SBP}\) machine with threshold polynomial \(n^j\); we set \(O_n = A_n\) and move to the next tuple in the enumeration. Otherwise, \(M^{O \cup A_n}_{i}(C_n^{A_n})\) accepts with probability at least \(2/2^{n^j}\), and by the inequality from the statement of the lemma, the acceptance probability of \(M^{O \cup B_n}_{i}(C_n^{B_n})\) is more than \(1/2^{n^j}\). Now \(C_n^{B_n}\) is a negative instance witnessing that \(M_i\) is not a \(\mathrm{ SBP}\) machine with threshold polynomial \(n^j\); we set \(O_n = B_n\). It is easy to see that \(\varPi ^O\) is not in \(\mathrm{ SBP}^O\): if it were, there would be a probabilistic polynomial-time machine \(M_i\) with some threshold polynomial \(n^j\), but when we considered the tuple \(\langle M_i, j\rangle \), we ensured that \(M_i\) does not have threshold polynomial \(n^j\) on the instance constructed at that stage.

4.1 Oracle Separation of \(\mathrm{ NIPZK}\) from \(\mathrm{ SBP}\)

In this section we show that Theorem 1 cannot be improved, using relativizable techniques, to show that \(\mathrm{ NIPZK}\) is a subset of \(\mathrm{ SBP}\). For this we show that the oracle version of Uniform-Or-Small is not in \(\mathrm{ SBP}\).

Theorem 6

There exists an oracle O relative to which Uniform-Or-Small is not in \(\mathrm{ SBP}^O\).

Malka [21] showed that Uniform-Or-Small is in \(\mathrm{ NIPZK}\), and this proof relativizes. Combining this with Theorem 6, we obtain Theorem 2. To prove Theorem 6, it suffices to exhibit sets \(A_n\) and \(B_n\) that satisfy the conditions of Lemma 3. We construct these sets via a probabilistic argument. We first provide a brief overview of this construction.

Remark: There is an alternative proof of the oracle separation of \(\mathrm{ NIPZK}\) from \(\mathrm{ SBP}\), which we briefly describe here; it was pointed out to us by one of the reviewers of TCC 2020. The proof uses known facts about the well-studied Permutation Testing Problem (PTP). PTP takes as input the truth table of a function \(f:[N] \rightarrow [N]\) promised to be either a permutation on [N] or N/3 away in Hamming distance from any permutation on [N]; the computational goal is to distinguish these two cases. It is known that in the query-complexity setting, there is a \(\mathrm{ NIPZK}\) protocol where the verifier uses public randomness to pick a uniformly random element x from [N], viewed as an element of the range of the function, and the prover is required to present a preimage of x. Aaronson [1] (Theorem 13) constructed an oracle separating \(\mathrm{ SZK}\) from the quantum version of \(\mathrm{ SBP}\) using degree arguments. The oracle is derived from the PTP problem, for which the author uses a \(\mathrm{ SZK}\) upper bound. However, as noted above, the \(\mathrm{ NIPZK}\) upper bound also holds for PTP, and hence the same construction gives an oracle separation of \(\mathrm{ NIPZK}\) from \(\mathrm{ SBP}\). Here we provide an oracle separation using elementary arguments.

Overview of the Proof: Consider a non-relativized world with the following restriction on how a probabilistic polynomial-time machine M can access the input circuit C: At the beginning the machine gets to see a sequence S of k independent samples from C. After this the machine ignores C. Note that in this model the underlying machine cannot perform adaptive sampling from C, nor can the machine generate samples that might be correlated. In this model it is easy to see that if C encodes the uniform distribution, the probability that M is presented with a specific sequence S of k samples is precisely \(1/2^{nk}\). Thus the probability that the machine M accepts is \(\sum _{S}\left( \frac{1}{2^{nk}} \Pr [M \text{ accepts } S]\right) \), summed over all sequences of size k.

Now given a subset D of \(\{0, 1\}^n\) of size \(2^{n/2}\), let \(U_D\) be the uniform distribution over D. Consider the following experiment. Randomly pick D and let \(C_D\) be a circuit that samples \(U_D\). Independently draw a sequence of k samples S from \(U_D\) and present them as input to M. (In a non-relativized setting, there may not be a small circuit that uniformly samples D, but in the relativized worlds we consider, this is not an issue.) We consider the acceptance probability of M over the random choices of D and S and the internal coin tosses of M. By a careful analysis we can show that this probability is very close to \(\sum _{S}\left( \frac{1}{2^{nk}} \Pr [M \text{ accepts } S]\right) \). Thus the ratio between the acceptance probability of M when given samples from the uniform distribution and its acceptance probability when given samples drawn from \(U_D\) (over a random choice of D) is less than \(1+ \epsilon \) for any constant \(\epsilon > 0\). By a probabilistic argument, there exists a subset D for which the acceptance probabilities of M on the positive instance (U) and the negative instance (\(U_D\)) are nearly equal. Thus M is not a \(\mathrm{ SBP}\) machine.

The crux of the above idea is that when the samples are generated independently and nonadaptively, it is possible to argue that a \(\mathrm{ SBP}\) machine cannot distinguish whether they came from the uniform distribution or from a distribution with small support. Now we need to argue in the more general model, where a probabilistic machine can perform adaptive sampling and generate samples that are correlated with each other. A first approach to constructing the sets \(A_n\) and \(B_n\) is to encode the uniform distribution in \(A_n\) and the distribution \(U_D\) in \(B_n\). The set \(A_n\) can be defined as \(\{\langle \ell , j\rangle \mid \text{the } \ell ^{th} \text{ bit of the } j^{th} \text{ string of } \{0,1\}^n \text{ is } 1\}\) (in the standard lexicographic ordering). To define \(B_n\) given D, first consider the multiset \(\mathbb {D}\) that contains \(2^{n/2}\) copies of each element of D; thus the cardinality of \(\mathbb {D}\) is \(2^n\). The set \(B_n\) can then be defined as the set of tuples \(\langle \ell , j\rangle \) such that the \(\ell ^{th}\) bit of the \(j^{th}\) string of \(\mathbb {D}\) is 1. Consider the oracle circuit C defined as follows:

Definition 4

(Oracle Circuit). Let \(C^O\) be a fixed linear-size oracle circuit, with n inputs and n outputs, defined as follows: On input \(j\in \{1\dots 2^n\}\), \(C^O(j)\) outputs \(O(\langle \ell ,j\rangle )\) for all \(\ell \) between 1 and n. In other words, \(C^O(j)\) outputs the \(j^{th}\) string of O.

Notice that \(C^{A_n}\) is the uniform distribution and \(C^{B_n}\) is uniform on D, and the goal of the probabilistic machine is to distinguish between the distributions \(C^{A_n}\) and \(C^{B_n}\). However, with correlated sampling, a probabilistic machine can easily distinguish \(C^{A_n}\) from \(C^{B_n}\) by computing \(C^O(j)\) and \(C^O(j+1)\) for appropriate inputs j and \(j+1\) and checking whether they are equal. To guard against such behavior, we apply one more level of randomization: we randomize the underlying order of the strings. Thus the tuple \(\langle \ell , j\rangle \) will encode the \(\ell ^{th}\) bit of the \(j^{th}\) string in an order that is not necessarily the standard lexicographic order. We argue that when \(\{0,1\}^n\) is randomly ordered, adaptive and correlated sampling does not give significantly more information than independently generated samples. We now proceed to give a formal proof.

Detailed Proof: From now on, we fix a length n. We use a probabilistic argument to construct \(A_n\) and \(B_n\). For \(A_n\) we consider \(2^n!\) sets \(Y_i\) and define \(A_n\) to be one of them (using a probabilistic argument), and similarly for \(B_n\) we consider many sets \(N_{D_i}\) and define \(B_n\) to be one of them.

Definition 5

(Oracle families). Let \(1 \le i \le 2^n!\) index the set of all \(2^n!\) permutations of \(\{0,1\}^n\).

Oracles for Yes instances: \(Y_i = \{\langle \ell ,j\rangle \): the \(\ell ^{th}\) bit of the \(j^{th}\) string of the \(i^{th}\) permutation of \(\{0,1\}^n\) is \(1\}\).

Oracles for No instances: For each set D of size \(d=2^m\) (where \(m=n/2\) ) let \(\mathbb {D}\) be the multiset that contains \(2^{n-m}\) copies of each element of D. Thus \(|\mathbb {D}|=2^n\), and we define \(N_{D_i}\) as: \(N_{D_i} = \{\langle \ell ,j\rangle \): the \(\ell ^{th}\) bit of the \(j^{th}\) string of the \(i^{th}\) permutation of \(\mathbb {D}\) is \(1\}\).

For the rest of this section, we will use Y to represent an arbitrary \(Y_i\) oracle, N to represent an arbitrary \(N_{D_i}\) oracle, and O to represent an arbitrary \(Y_i\) or \(N_{D_i}\). Note that for every i, \(C^{Y_i}\) is the uniform distribution and \(C^{N_{D_i}}\) is the uniform distribution on D and thus has small support.
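The two oracle families can be sketched concretely: an oracle is a set of pairs \(\langle \ell , j\rangle \), and the circuit of Definition 4 recovers the \(j^{th}\) string bit by bit. Small illustrative parameters (n = 4) and a fixed seed stand in for the probabilistic choice:

```python
import random

def make_oracle(strings):
    """Encode a sequence of n-bit strings as {(l, j) : bit l of string j is '1'}."""
    return {(l, j)
            for j, s in enumerate(strings)
            for l, ch in enumerate(s) if ch == '1'}

def C(oracle, j, n):
    """The oracle circuit of Definition 4: output the j-th string."""
    return ''.join('1' if (l, j) in oracle else '0' for l in range(n))

n, m = 4, 2
rng = random.Random(0)
strings = [format(v, '04b') for v in range(2 ** n)]

perm = strings[:]
rng.shuffle(perm)
Y = make_oracle(perm)                 # yes oracle: C^Y samples U_n

D = rng.sample(strings, 2 ** m)       # hidden support of size 2^{n/2}
multiset = [s for s in D for _ in range(2 ** (n - m))]
rng.shuffle(multiset)
N = make_oracle(multiset)             # no oracle: support of C^N is D
```

Evaluating C on all \(2^n\) inputs recovers exactly \(\{0,1\}^n\) under Y and exactly D under N; distinguishing the two with polynomially many queries is what Lemma 4 rules out.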

We first prove the following lemma and show later how to build on it to arrive at the conditions specified in Lemma 3.

Lemma 4

If i is uniformly chosen from \(\{1,\dots , 2^{n}!\}\) and D is uniformly chosen from all size \(2^m\) subsets of \(\{0,1\}^n\), then for any constant \(c>1\) and every probabilistic polynomial-time algorithm A, for large enough n,

$$\begin{aligned} \frac{{\Pr }_{i,r}[A^{Y_i}\text { accepts }C^{Y_i}]}{{\Pr }_{i,r,D}[A^{N_{D_i}}\text { accepts }C^{N_{D_i}}]}\le c \end{aligned}$$

where r is the random choice of A.

Without loss of generality, we can assume that any oracle query that \(A^O\) makes is replaced by an evaluation of the circuit \(C^O\), by modifying A as follows: whenever A queries the oracle O for the \(\ell ^{th}\) bit of the \(j^{th}\) string, it evaluates \(C^O(j)\) and extracts the \(\ell ^{th}\) bit. We refer to this as a circuit query. Let k be the number of circuit queries made by A, where k is bounded by a polynomial. We will use \(q_1,\dots q_k\) to denote the circuit queries, and denote the output \(C^O(q_i)\) by \(u_i\). We can assume without loss of generality that all \(q_i\) are distinct. We use S to denote a typical tuple of answers \(\langle u_1,\cdots , u_k\rangle \), and \(A_S\) to denote the computation of algorithm A when the answers to the circuit queries are exactly S, in that order. Notice that \(A_S\) does not involve any oracle queries: once A has received S, it can complete the computation without any further circuit queries, so the output of \(A_S\) is a random variable that depends only on the internal randomness r of A.

Claim

Without loss of generality we can assume that along any random path, A rejects whenever any \(u_i=u_j\), \(i \ne j\).

Proof

In a Yes instance, \(C^Y\) is uniform. Since C has n inputs and n outputs, \(C^{Y}\) is a 1-1 function, and since all queries \(q_i\) are distinct, no \(u_i\) can match any other \(u_j\). In a No instance, \(C^N\) maps \(2^{n-m}\) inputs to each output. Rejecting whenever \(u_i=u_j\) therefore does not affect \(\Pr [A\text { accepts a Yes instance}]\), and it can only reduce \(\Pr [A\text { accepts a No instance}]\). Thus the ratio of the probability of accepting a Yes instance to the probability of accepting a No instance only increases. We will show that even this higher ratio is \(<c\).

We will use the following notation.

  • “\(A^O\text { asks }\langle q,i\rangle \)” is the event that “the \(i^{th}\) circuit query made by A is \(C^O(q)\).” For simplicity, we write this event as “\(A^O\text { asks } q_i \).”

  • “\(A^O\text { gets } \langle u,i \rangle \)” is the event that “\(C^O(q) = u\), where q is the \(i^{th}\) query.” Again, for simplicity, we write this event as “\(A^O\text { gets } u_i \).”

  • For \(S= \langle u_1,\dots u_k \rangle \), “\(A^O\text { gets }S\)” is the event that “\(A^O\text { gets }u_1\text { and }A^O\text { gets }u_2\text { and }\dots A^O\text { gets }u_k\) (in that order)”.

Lemma 5

For any probabilistic algorithm A and for any fixed \(S=\langle u_1,\dots u_k\rangle \) where all \(u_i\) are distinct,

$$\begin{aligned} \mathop {\Pr }\limits _{i,r}[A^{Y_i}\text { gets }S\text { and accepts}] = \mathop {\Pr }\limits _r[A_S\text { accepts}] \prod _{j=0}^{k-1} {\frac{1}{(2^n-j)}} \end{aligned}$$

Proof

$$\begin{aligned} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ gets } S \text{ and } \text{ accepts}]= & {} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ gets } S]\times \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ accepts } | A^{Y_i} \text{ gets } S] \\= & {} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ gets } S]\times \mathop {\Pr }\limits _r[A_S \text{ accepts}] \end{aligned}$$

The last equality is because \(A_S\) is independent of i as discussed before. We will show that \(\mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ gets } S] = \prod _{j=0}^{k-1} {\frac{1}{(2^n-j)}}\) which will prove the lemma.

$$ \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ gets } S] = \prod _{j=0}^{k-1} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ gets } u_{j+1} | A^{Y_i} \text{ gets } \langle u_1, u_2,\ldots , u_j\rangle ] $$

For any fixed j let \(E_j\) denote the event “\(A^{Y_i} \text{ gets } \langle u_1, u_2, \ldots , u_j\rangle \)”. Then,

$$\begin{aligned} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ gets } u_{j+1} | A^{Y_i} \text{ gets } \langle u_1, u_2,\ldots , u_j\rangle ]= & {} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ gets } u_{j+1} | E_j] \\= & {} \sum _{q_{j+1}} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ asks } q_{j+1}|E_j]\times \mathop {\Pr }\limits _{i,r}[C^{Y_i}(q_{j+1})=u_{j+1}|E_j] \\= & {} \sum _{q_{j+1}} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ asks } q_{j+1}|E_j]\times \mathop {\Pr }\limits _{i}[C^{Y_i}(q_{j+1})=u_{j+1}|E_j] \\= & {} \sum _{q_{j+1}} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ asks } q_{j+1}|E_j]\times {\frac{1}{(2^n-j)}}\\= & {} {\frac{1}{(2^n-j)}} \times \sum _{q_{j+1}} \mathop {\Pr }\limits _{i,r}[A^{Y_i} \text{ asks } q_{j+1}|E_j] \\= & {} {\frac{1}{(2^n-j)}} \end{aligned}$$

The third equality is because the output of C is independent of r and the fourth equality follows from the fact that for a random permutation of \(\{0,1\}^n\), once j elements are fixed, there are \(2^n-j\) equally likely possibilities for \(u_{j+1}\). The lemma follows.
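The counting in this proof can be verified exhaustively for toy parameters. The sketch below is an informal check (not part of the argument): it fixes a non-adaptive pair of distinct queries and distinct target answers, 0-indexed, enumerates all permutations of \(\{0,1\}^2\), and confirms that the probability of getting a fixed tuple S equals \(\prod _{j=0}^{k-1}\frac{1}{2^n-j}\).

```python
from itertools import permutations
from fractions import Fraction

n = 2
N = 2 ** n                 # size of {0,1}^n
# A fixed non-adaptive strategy (the lemma also covers adaptive queries):
queries = [0, 3]           # q_1, q_2 -- distinct positions, 0-indexed
S = [2, 0]                 # u_1, u_2 -- distinct target answers

# Count permutations of {0,1}^n under which both queries return S.
hits = sum(1 for p in permutations(range(N))
           if all(p[q] == u for q, u in zip(queries, S)))
total = sum(1 for _ in permutations(range(N)))
prob = Fraction(hits, total)

# prod_{j=0}^{k-1} 1/(2^n - j) with k = 2 queries
expected = Fraction(1, N) * Fraction(1, N - 1)
print(prob, expected, prob == expected)  # → 1/12 1/12 True
```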

Lemma 6

For any algorithm A and any fixed \(S=\langle u_1,\dots u_k\rangle \) where \(u_i\)s are distinct,

$$\mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}}\text { gets }S\text { and accepts}] = \mathop {\Pr }\limits _r[A_S\text { accepts}] \times \prod _{j=0}^{k-1}\frac{(2^m-j)2^{n-m}}{(2^{n}-j)^2} $$

Proof

The argument is identical to the proof of Lemma 5 except for the probability calculations.

$$\begin{aligned} \mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ gets } S \text{ and } \text{ accepts}]= & {} \mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ gets } S]\times \mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ accepts } | A^{N_{D_i}} \text{ gets } S] \\= & {} \mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ gets } S]\times \mathop {\Pr }\limits _r[A_S \text{ accepts}] \end{aligned}$$

The last equality is because \(A_S\) is independent of i and D. We will show that \(\mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ gets } S] = \prod _{j=0}^{k-1} \frac{(2^m-j)2^{n-m}}{(2^{n}-j)^2}\) which will prove the lemma.

We will reuse the notation \(E_j\) for convenience: for any fixed j, let \(E_j\) denote the event “\(A^{N_{D_i}} \text{ gets } \langle u_1, u_2, \ldots , u_j\rangle \)”. Then, exactly as in the proof of Lemma 5,

$$\begin{aligned} \mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ gets } u_{j+1} | E_j] = \sum _{q_{j+1}} \mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ asks } q_{j+1}|E_j]\times \mathop {\Pr }\limits _{i,D}[C^{N_{D_i}}(q_{j+1})=u_{j+1}|E_j] \end{aligned}$$

We will show that for any query q, \(\mathop {\Pr }\limits _{i,D}[C^{N_{D_i}}(q)=u_{j+1}|E_j] = \frac{(2^m-j)2^{n-m}}{(2^{n}-j)^2}\); since \(\sum _{q_{j+1}} \mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ asks } q_{j+1}|E_j] = 1\), this gives \(\mathop {\Pr }\limits _{i,r,D}[A^{N_{D_i}} \text{ gets } u_{j+1} | E_j] = \frac{(2^m-j)2^{n-m}}{(2^{n}-j)^2}\), as required.

Conditioned on \(E_j\), the strings \(u_1,\dots u_j\) are all in D. There are \(\left( {\begin{array}{c}2^n-j\\ 2^m-j\end{array}}\right) \) choices of D that include \(u_1\dots u_j\), and \(\left( {\begin{array}{c}2^n-j-1\\ 2^m-j-1\end{array}}\right) \) of them include \(u_{j+1}\) as well; hence \(\Pr [u_{j+1}\in D\,|\,E_j] = \frac{2^m-j}{2^n-j}\). Given that \(u_1,\dots u_{j+1}\in D\), the probability that \(C^{N_{D_i}}(q_{j+1})=u_{j+1}\) is \(\frac{2^{n-m}}{2^n-j}\) (since all \(2^{n-m}\) copies of \(u_{j+1}\) remain among the \(2^n-j\) unrevealed positions of the permutation of \(\mathbb {D}\)). Multiplying the two probabilities gives \(\frac{(2^m-j)2^{n-m}}{(2^{n}-j)^2}\), which proves the lemma.
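The closed form for the No case can likewise be checked exhaustively for toy parameters. The sketch below is an informal check (0-indexed, with copies of multiset elements treated as distinguishable, matching the \(2^n!\) permutations in Definition 5): it enumerates every set D and every ordering of \(\mathbb {D}\) for \(n=2\), \(m=1\), and confirms the product \(\prod _{j}\frac{(2^m-j)2^{n-m}}{(2^n-j)^2}\) for two fixed queries.

```python
from itertools import combinations, permutations
from fractions import Fraction

n, m = 2, 1
N, d = 2 ** n, 2 ** m      # |{0,1}^n| and |D|
q1, q2 = 0, 3              # two fixed distinct queries (non-adaptive)
u1, u2 = 2, 0              # fixed distinct target answers

hits = total = 0
for D in combinations(range(N), d):
    # The multiset with 2^(n-m) copies of each element of D.
    multiset = [x for x in D for _ in range(N // d)]
    # permutations() permutes positions, so copies are distinguishable,
    # matching the labeled-permutation count 2^n! in the definition.
    for order in permutations(multiset):
        total += 1
        if order[q1] == u1 and order[q2] == u2:
            hits += 1

prob = Fraction(hits, total)
expected = Fraction(1, 1)
for j in range(2):         # prod_{j=0}^{k-1} (2^m - j) 2^(n-m) / (2^n - j)^2
    expected *= Fraction((d - j) * (N // d), (N - j) ** 2)
print(prob, expected, prob == expected)  # → 1/18 1/18 True
```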

We need the following claim.

Claim

For any polynomial \(k=k(n)\) and any constant \(c>1\), for large enough n,

$$\begin{aligned} \prod _{j=0}^{k-1} \frac{2^n-j}{2^n-2^{n/2}j} < c \end{aligned}$$

Proof

$$\begin{aligned} \prod _{j=0}^{k-1} \frac{2^n-j}{2^n-2^{n/2}j}\le & {} \prod _{j=0}^{k-1} \frac{2^n}{2^n-2^{n/2}j} \\= & {} \prod _{j=0}^{k-1} \frac{2^{n/2}}{2^{n/2}-j} \\\le & {} \left( \frac{2^{n/2}}{2^{n/2}-k} \right) ^k \\= & {} \left( 1+\frac{k}{2^{n/2}-k} \right) ^k \end{aligned}$$

For any polynomial \(k=k(n)\), \(\lim _{n\rightarrow \infty } (1+\frac{k(n)}{2^{n/2}-k(n)} )^{k(n)} = 1\). Hence the claim.
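Numerically, the product approaches 1 once \(2^{n/2}\) dwarfs \(k^2\). The sketch below is an informal illustration, not part of the proof; the polynomial \(k=n^2\) is an arbitrary choice.

```python
def ratio_bound(n, k):
    """prod_{j=0}^{k-1} (2^n - j) / (2^n - 2^(n/2) j): the Yes/No
    acceptance-probability ratio from the claim (n even)."""
    r = 1.0
    for j in range(k):
        r *= (2 ** n - j) / (2 ** n - 2 ** (n // 2) * j)
    return r

# With k = n^2 (a polynomial in n), the product tends to 1 as n grows.
for n in (40, 60, 80):
    print(n, ratio_bound(n, n * n))
```

For small n the bound is still far from 1 (the asymptotics have not kicked in), but it drops below any fixed \(c>1\) once n is large enough.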

We can now prove Lemma 4.

Proof (Proof of Lemma 4)

From Lemmas 5 and 6, we have

$$\begin{aligned} \frac{{\Pr }_{i,r}[A^{Y_i}\text { accepts }C^{Y_i}]}{{\Pr }_{i,r,D}[A^{N_{D_i}}\text { accepts }C^{N_{D_i}}]}= & {} \frac{\sum _{S} {\Pr }_{i,r} [A^{Y_i}\text { gets } S \text { and accepts }]}{\sum _S {\Pr }_{i,r,D}[A^{N_{D_i}}\text { gets } S \text { and accepts }]}\\\le & {} \frac{\sum \limits _{\begin{array}{c} S \text { distinct} \end{array} } {\Pr }_{i,r} [A^{Y_i}\text { gets } S \text { and accepts }] }{\sum \limits _{\begin{array}{c} S \text { distinct} \end{array}} {\Pr }_{i,r,D}[A^{N_{D_i}}\text { gets } S \text { and accepts }]}\\= & {} \frac{\sum _{S\text { distinct}} {\Pr }_r[A_S\text { accepts}]\times \prod _{j=0}^{k-1} {\frac{1}{(2^n-j)}}}{\sum _{S \text { distinct}}{\Pr }_r[A_S\text { accepts}] \times \prod _{j=0}^{k-1}\frac{(2^m-j)2^{n-m}}{(2^{n}-j)^2}} \text{(by } \text{ lemmas } ~5 \text{ and } 6)\\= & {} \prod _{j=0}^{k-1} \frac{2^n-j}{2^n-2^{n/2}j} \text{( } \text{ substituting } m=n/2 )\\&\quad < c \text{(by } \text{ Claim } ~4.1) \end{aligned}$$

The inequality follows because when the oracle is \(Y_i\), the entries of S are always distinct (as \(C^{Y_i}\) is 1-1 and we never ask the same query twice), and when the oracle is \(N_{D_i}\) we assume that the algorithm rejects whenever the entries of S are not distinct.

(Completing the proof of Theorem 6): We will construct an oracle so that the conditions of Lemma 3 are met. Fix any constant \(c>1\). By a probabilistic argument (averaging over i and D in Lemma 4), for large enough n there exist \(i^*\) and \(D^*\) such that

$$\begin{aligned} \frac{\Pr [A^{Y_{i^*}} \text{ accepts } C^{Y_{i^*}}]}{\Pr [A^{N_{D^*_{i^*}}} \text{ accepts } C^{N_{D^*_{i^*}}}]} < c \end{aligned}$$

Now define \(A_n\) as \(Y_{i^*}\) and \(B_n\) as \(N_{D^*_{i^*}}\). This is very close to the conditions of Lemma 3, except that we restricted the oracles to \(A_n\) and \(B_n\). However, for Lemma 3, we require oracles of the form \((\cup _{i=1}^{n-1}D_i \cup A_n)\) and \((\cup _{i=1}^{n-1}D_i \cup B_n)\). To establish this, we resort to the standard techniques used in oracle constructions. Observe that the sets \(A_n\) and \(B_n\) can be constructed in double exponential time. Let \(n_1 = 2\) and \(n_j = 2^{2^{n_{j-1}}}\). We will satisfy the conditions of Lemma 3 at lengths of the form \(n_j\). For every i that is not of the form \(n_j\), we set both \(A_i\) and \(B_i\) to be empty. Now \(M^{\cup _{i=1}^{n_{j-1}} D_i \cup A_{n_j}}(C_{n_j}^{\cup _{i=1}^{n_{j-1}}D_i \cup A_{n_j}})\) can be simulated by \(M^{A_{n_j}}(C^{A_{n_j}})\): for queries whose length does not equal \(c\cdot n_j\), the machine can find the answers to oracle queries without actually making the query.

4.2 Oracle Separation of \(\mathrm{ PZK}\) from \(\mathrm{ CoSBP}\)

In this section we construct an oracle that separates \(\mathrm{ PZK}\) from \(\mathrm{ CoSBP}\), thus proving Theorem 3. For this we exhibit an oracle where the promise problem Disjoint-Or-Identical is not in \(\mathrm{ SBP}\). This problem is a generalization of the graph non-isomorphism (GNI) problem, in the sense that GNI reduces to it. Let \(G_1\) and \(G_2\) be two graphs, and let \(C_i\) be the distribution obtained by randomly picking a permutation \(\pi \) and outputting \(\pi (G_i)\). Observe that if \(G_1\) and \(G_2\) are not isomorphic then the supports of \(C_1\) and \(C_2\) are disjoint, and if \(G_1\) is isomorphic to \(G_2\), then \(C_1 = C_2\). Moreover, the distributions \(C_1\) and \(C_2\) can be sampled by polynomial-size circuits. The \(\mathrm{ PZK}\) protocol for graph isomorphism can be adapted to show that Disjoint-Or-Identical is in \(\mathrm{ CoPZK}\).

Theorem 7

Disjoint-Or-Identical is in \(\mathrm{ CoPZK}\)

Theorem 3 follows from the following theorem.

Theorem 8

There exists an oracle O relative to which Disjoint-or-Identical is not in \(\mathrm{ SBP}^O\).

Input presentation: In the definition of Disjoint-Or-Identical, the input instances are tuples consisting of two circuits. However, we will represent them as just one circuit C in the following manner. Given a circuit C, let \(C_0\) denote the circuit obtained by fixing the first input bit of C to 0, and let \(C_1\) denote the circuit obtained by fixing the first input bit of C to 1. An input to Disjoint-Or-Identical is then a circuit C, and the goal is to distinguish between the cases “the supports of the distributions \(C_0\) and \(C_1\) are disjoint” and “\(C_0\) and \(C_1\) are identical distributions”.
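This convention is easy to picture in code. The sketch below is a toy illustration only: the concrete sampler and the helper `fix_first_bit` are hypothetical names we introduce here, and the two distributions are chosen to give a disjoint-support (“Yes”) instance.

```python
def C(bits):
    """A toy sampler: the first input bit selects a branch, the
    remaining bit is the random input (hypothetical distributions,
    for illustration only)."""
    b, r = bits[0], bits[1]
    return 2 * r if b == 0 else 2 * r + 1   # supports {0,2} vs {1,3}

def fix_first_bit(circuit, b):
    """C_b: the circuit with its first input bit hard-wired to b."""
    return lambda rest: circuit([b] + list(rest))

C0, C1 = fix_first_bit(C, 0), fix_first_bit(C, 1)
support0 = {C0([r]) for r in (0, 1)}
support1 = {C1([r]) for r in (0, 1)}
# Disjoint supports, so this C encodes a "Yes" instance.
print(support0, support1, support0.isdisjoint(support1))
```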

The proof structure of this result is similar to that of Theorem 6 and as in that case, the goal is to construct a circuit family \(C_n\) and families of sets \(A_n\) and \(B_n\) that satisfy the conditions of Lemma 3.

Definition 6

(Oracle families). Let \(i\in \{1\dots \left( {\begin{array}{c}2^n\\ 2^{n-1}\end{array}}\right) \}\) index the partitions of \(\{0,1\}^n\) into two sets \(S_i^0\) and \(S_i^1\) each of size \(2^{n-1}\). Let \(j,k\in \{1\dots 2^{n-1}!\}\) index the possible permutations of \(S_i^0\) and \(S_i^1\), respectively.

Oracles for Yes instances: \(Y_{ijk}\) is an oracle for the set \(\{\langle 0,\ell ,m\rangle :\) the \(\ell ^{th}\) bit of the \(m^{th}\) string in the \(j^{th}\) permutation of \(S_i^0\) is \(1\}\cup \{\langle 1,\ell ,m\rangle :\) the \(\ell ^{th}\) bit of the \(m^{th}\) string in the \(k^{th}\) permutation of \(S_i^1\) is \(1\}\).

Oracles for No instances: We construct the No instances similarly, except that both the 0 and 1 cases query \(S_i^0\). That is, \(N_{ijk}\) is an oracle for the set \(\{\langle 0,\ell ,m\rangle :\) the \(\ell ^{th}\) bit of the \(m^{th}\) string in the \(j^{th}\) permutation of \(S_i^0\) is \(1\}\cup \{\langle 1,\ell ,m\rangle :\) the \(\ell ^{th}\) bit of the \(m^{th}\) string in the \(k^{th}\) permutation of \(S_i^0\) is \(1\}\).

An oracle of the above form will be denoted by O which is a disjoint union of sets denoted by \(O^0\) and \(O^1\). Now we define the input circuits that sample the two distributions.

Definition 7

(Oracle circuits). Let \(C^O\) be a fixed linear-size oracle circuit, with \(n+1\) inputs and n outputs, defined as follows: on input \(\langle 0,j\rangle \) where \(j \in \{1\dots 2^n\}\), \(C^O(\langle 0,j\rangle )\) outputs \(O^0(\langle \ell ,j\rangle )\) for all \(\ell \) between 1 and n, and on input \(\langle 1,j\rangle \) where \(j \in \{1\dots 2^n\}\), \(C^O(\langle 1,j\rangle )\) outputs \(O^1(\langle \ell ,j\rangle )\) for all \(\ell \) between 1 and n. In other words, \(C^O(\langle 0,j \rangle )\) outputs the \(j^{th}\) string of \(O^0\) and \(C^O(\langle 1,j \rangle )\) outputs the \(j^{th}\) string of \(O^1\).

We will establish the following lemma. Then the proof of Theorem 8 follows by arguments identical to that of the previous oracle construction.

Lemma 7

If ijk are uniformly and independently chosen from \(\{1\dots \left( {\begin{array}{c}2^n\\ 2^{n-1}\end{array}}\right) \}\), \(\{1\dots 2^{n-1}!\}, \{1\dots 2^{n-1}!\}\) respectively, then for any probabilistic polynomial-time algorithm A, for any constant \(c>1\), for large enough n,

$$\begin{aligned} \frac{{\Pr }_{i,j,k,r} [A^{Y_{ijk}}\text { accepts }C^{Y_{ijk}}]}{{\Pr }_{i,j,k,r}[A^{N_{ijk}} \text { accepts }C^{N_{ijk}}]} \le c \end{aligned}$$

We use the same notation and carry over most of the same simplifications from the previous construction, with the following differences. The first difference: let h be the (polynomial) maximum number of queries made by the algorithm A over all random choices of i, j, k, r. We allow A to make 2h queries, two at a time, with the restriction that one must begin with 0 and the other must begin with 1. Notationally, \(p_1 \dots p_h\) are the queries that begin with 0 and \(u_i\) is the result of query \(p_i\); \(q_1\dots q_h\) are the queries that begin with 1 and \(v_i\) is the result of query \(q_i\). S is the ordered tuple \(\langle u_1, v_1, \dots u_h, v_h\rangle \). Notice that this is without loss of generality, as A can simulate the original algorithm by ignoring either \(q_i\) or \(p_i\) as appropriate. The second difference is that, instead of assuming A rejects if any \(u_i\) matches any \(u_j\), we assume A rejects if any \(u_i\) matches any \(v_j\).

Lemma 8

For any probabilistic algorithm A and for any fixed \(S=\langle u_1,v_1,\dots u_h,v_h\rangle \) where all elements of S are distinct,

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { gets }S\text { and accepts }] =\mathop {\Pr }\limits _r[A_S\text { accepts}] \times \prod _{\ell =0}^{h-1}\frac{1}{(2^n- 2\ell ) (2^n - 2\ell -1)} \end{aligned}$$

Proof

Note that

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { gets }S\text { and accepts }]= & {} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { gets }S] \times \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { accepts}~|~A^{Y_{ijk}} \text{ gets } S]\\= & {} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { gets }S] \times \mathop {\Pr }\limits _{r}[A_S\text { accepts}] \end{aligned}$$

Thus we need to prove that

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { gets }S] = \prod _{\ell =0}^{h-1}\frac{1}{(2^n- 2\ell ) (2^n - 2\ell -1)} \end{aligned}$$

We use \(E_\ell \) to denote the event \(A^{Y_{ijk}}\) gets \(\langle u_1, v_1, \cdots u_\ell , v_\ell \rangle \). Note that

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { gets }S] = \prod _{\ell = 0}^{h-1} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}} \text{ gets } u_{\ell +1}, v_{\ell +1}~|~E_\ell ] \end{aligned}$$

and

$$\begin{aligned}&\mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}} \text{ gets } u_{\ell +1}, v_{\ell +1}~|~E_\ell ] = \sum _{p, q}\mathop {\Pr }\limits _{i,j,k,r} [A^{Y_{ijk}} \text{ asks } p_{\ell +1} \text{ and } q_{\ell +1}|E_\ell ]\\&\quad \times \mathop {\Pr }\limits _{i,j,k,r}[C^{Y_{ijk}} (p) = u_{\ell +1} \text{ and } C^{Y_{ijk}} (q) = v_{\ell +1}~|~A^{Y_{ijk}} \text{ asks } p_{\ell +1} \text{ and } q_{\ell +1},E_\ell ] \\&\quad = \frac{1}{(2^n-2\ell )(2^n-2\ell -1)} \sum _{p, q}\mathop {\Pr }\limits _{i,j,k,r} [A^{Y_{ijk}} \text{ asks } p_{\ell +1} \text{ and } q_{\ell +1}|E_\ell ]\\&\quad = \frac{1}{(2^n-2\ell )(2^n-2\ell -1)} \end{aligned}$$

Here the second equality holds because, for a uniformly random partition with uniformly random orderings, once \(2\ell \) strings have been revealed, the pair \((u_{\ell +1}, v_{\ell +1})\) is equally likely to be any ordered pair of the remaining \(2^n-2\ell \) strings.

Since \(\mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { gets }S] = \prod _{\ell = 0}^{h-1} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}} \text{ gets } u_{\ell +1}, v_{\ell +1}~|~E_\ell ]\), using this with the above derived equality we obtain that

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{Y_{ijk}}\text { gets }S] = \prod _{\ell =0}^{h-1}\frac{1}{(2^n- 2\ell ) (2^n - 2\ell -1)} \end{aligned}$$

This completes the proof of the lemma.

Now we turn to the No instances.

Lemma 9

For any algorithm A and for any fixed \(S=\langle u_1,v_1,\dots u_h, v_h\rangle \) whose entries are all distinct,

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{N_{ijk}}\text { gets }S\text { and accepts}] = \mathop {\Pr }\limits _r[A_S \text { accepts } ] \times \prod _{\ell = 0}^{h-1} \frac{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)}{(2^{n}-2\ell )(2^{n}-2\ell -1)} \frac{1}{(2^{n-1}-\ell )^2} \end{aligned}$$

Proof

As before,

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{N_{ijk}}\text { gets }S\text { and accepts }]= & {} \mathop {\Pr }\limits _{i,j,k,r}[A^{N_{ijk}}\text { gets }S] \times \mathop {\Pr }\limits _{i,j,k,r}[A^{N_{ijk}}\text { accepts}~|~A^{N_{ijk}} \text{ gets } S]\\= & {} \mathop {\Pr }\limits _{i,j,k,r}[A^{N_{ijk}}\text { gets }S] \times \mathop {\Pr }\limits _{r}[A_S\text { accepts}] \end{aligned}$$

It suffices to show that

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{N_{ijk}}\text { gets }S] = \prod _{\ell = 0}^{h-1} \frac{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)}{(2^{n}-2\ell )(2^{n}-2\ell -1)} \frac{1}{(2^{n-1}-\ell )^2} \end{aligned}$$

If \(E_\ell \) denotes the event “\(A^{N_{ijk}}\text { gets }\langle u_1, v_1, \ldots , u_\ell , v_\ell \rangle \)”, then

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{N_{ijk}}\text { gets }S] = \prod _{\ell =0}^{h-1} \mathop {\Pr }\limits _{ijkr}[A^{N_{ijk} }\text { gets }u_{\ell +1}, v_{\ell +1}~|~E_\ell ] \end{aligned}$$

Now,

$$\begin{aligned}&\mathop {\Pr }\limits _{ijkr}[A^{N_{ijk}} \text { gets } u_{\ell +1}, v_{\ell +1}~|~E_\ell ] = \sum _{p, q} \mathop {\Pr }\limits _{ijkr}[A^{N_{ijk}} \text{ asks } p_{\ell +1} \text{ and } q_{\ell +1}~|~E_\ell ]\\&\quad \times \mathop {\Pr }\limits _{ijkr}[C^{N_{ijk}}(p) = u_{\ell +1} \text { and } C^{N_{ijk}}(q) = v_{\ell +1}~|~E_\ell , A^{N_{ijk}} \text{ asks } p_{\ell +1} \text{ and } q_{\ell +1}] \end{aligned}$$

Consider the event “\(C^{N_{ijk}}(p) = u_{\ell +1} \text { and } C^{N_{ijk}}(q) = v_{\ell +1}\)”, conditioned on \(E_\ell \) and “\(A^{N_{ijk}}\) asks \(p_{\ell +1}\) and \(q_{\ell +1}\)”. For this event to happen, it must be the case that both \(u_{\ell +1}\) and \(v_{\ell +1}\) are in \(S_i^0\), and \(u_{\ell +1}\) is the \(p_{\ell +1}^{th}\) element of \(S^0_i\), and \(v_{\ell +1}\) is the \(q_{\ell +1}^{th}\) element of \(S^0_i\). The probability that both \(u_{\ell +1}\) and \(v_{\ell +1}\) are in \(S^0_i\) given that \(E_\ell \) and A asks \(p_{\ell +1}\) and \(q_{\ell +1}\) is

$$\begin{aligned} \left( {\begin{array}{c}2^n-2\ell -2\\ 2^{n-1}-2\ell -2\end{array}}\right) \bigg / \left( {\begin{array}{c}2^n-2\ell \\ 2^{n-1}-2\ell \end{array}}\right) = \frac{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)}{(2^{n}-2\ell )(2^{n}-2\ell -1)} \end{aligned}$$

The probability that \(u_{\ell +1}\) is the \(p_{\ell +1}^{th}\) element given \(E_\ell \) is \(1/(2^{n-1}-\ell )\) and, similarly, the probability that \(v_{\ell +1}\) is the \(q_{\ell +1}^{th}\) element given \(E_\ell \) is \(1/(2^{n-1}-\ell )\). Thus

$$\begin{aligned}&\mathop {\Pr }\limits _{ijkr}[A^{N_{ijk}} \text{ gets } u_{\ell +1}, v_{\ell +1}~|~E_\ell ]= \frac{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)}{(2^{n}-2\ell )(2^{n}-2\ell -1)} \frac{1}{(2^{n-1}-\ell )^2}\\&\qquad \times \sum _{p, q}\mathop {\Pr }\limits _{ijkr}[A^{N_{ijk}} \text{ asks } p_{\ell +1} \text{ and } q_{\ell +1}~|~E_\ell ]\\&\qquad = \frac{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)}{(2^{n}-2\ell )(2^{n}-2\ell -1)} \frac{1}{(2^{n-1}-\ell )^2} \end{aligned}$$

Thus

$$\begin{aligned} \mathop {\Pr }\limits _{i,j,k,r}[A^{N_{ijk}}\text { gets }S] = \prod _{\ell = 0}^{h-1} \frac{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)}{(2^{n}-2\ell )(2^{n}-2\ell -1)} \frac{1}{(2^{n-1}-\ell )^2}, \end{aligned}$$

and the lemma follows.
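The ratio of binomial coefficients used above can be checked mechanically. The sketch below is an informal verification (not part of the proof) that, over small n and \(\ell \), the binomial ratio agrees with the closed form \(\frac{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)}{(2^{n}-2\ell )(2^{n}-2\ell -1)}\) for the probability that both \(u_{\ell +1}\) and \(v_{\ell +1}\) land in \(S_i^0\).

```python
from fractions import Fraction
from math import comb

def both_in_S0(n, l):
    """Pr[u_{l+1} and v_{l+1} both fall in S_i^0], computed two ways:
    as a ratio of binomial coefficients and as the closed form."""
    big, half = 2 ** n, 2 ** (n - 1)
    via_binomials = Fraction(comb(big - 2 * l - 2, half - 2 * l - 2),
                             comb(big - 2 * l, half - 2 * l))
    closed_form = Fraction((half - 2 * l) * (half - 2 * l - 1),
                           (big - 2 * l) * (big - 2 * l - 1))
    return via_binomials, closed_form

for n in (4, 6):
    for l in range(3):
        a, b = both_in_S0(n, l)
        assert a == b
print("binomial ratio matches the closed form")
```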

We need the following claim.

Claim

For any polynomial \(h = h(n)\) and any constant \(c > 1\), for large enough n,

$$\begin{aligned} \prod _{\ell =0}^{h-1} \frac{(2^{n-1}-\ell )^2}{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)} < c \end{aligned}$$

Proof

$$\begin{aligned} \prod _{\ell =0}^{h-1} \frac{(2^{n-1}-\ell )^2}{(2^{n-1}-2\ell )(2^{n-1}-2\ell -1)}\le & {} \prod _{\ell =0}^{h-1} \frac{(2^{n-1}-\ell )^2}{(2^{n-1}-2\ell -1)^2}\\\le & {} \prod _{\ell =0}^{h-1} \left( 1 + \frac{\ell +1}{2^{n-1}-2\ell -1}\right) ^2 \end{aligned}$$

For any polynomial h, the above expression tends to 1 as \(n\rightarrow \infty \). Hence the claim.

The rest of the proof of Lemma 7 and the proof of Theorem 8 are identical to the proofs of Lemma 4 and Theorem 6, respectively.