Cryptography for Blockchains beyond ECDSA and sha256: Signatures and Zero-Knowledge Proofs
Speakers: Benedikt Bünz
Transcript By: Bryan Bishop
Tags: Proof systems
Media: https://www.youtube.com/watch?v=LDF8bOEqXt4&t=10570s
Am going to talk about cryptography relevant to blockchains. I’ll talk about signatures and zero-knowledge proofs. It has a lot of applications to bitcoin. It turns out that bitcoin is pretty simple in terms of cryptography. It just uses a hash function (sha256) and,separately, ECDSA for signatures. I will be mentioning some other cryptography that bitcoin could use.
Let’s start with signatures. What do signatures do? If Bob wants to say, on every e-cash system, Bob says he is going to send 1 coin to Alice, but the problem is how does Alice know whether Bob said this or was this just from someone else or how does a third person verifyin miner check that actually Bob issued this transaction? Well, what we do is use somethin called authorization. Bob signs the message with a digital signature. This signature has properties like it’s unforegeable. So let’s look at a more formal definition.
A signature scheme has three algorithms: key generation, signing something that takes the message and the private key, and then the verification algorithm that uses the public key (without access to the private key) and the signature and the message and then outputs true or false. You obviously want the one important property which is that if the signature is created ocrrectly, then the verifier will say this is the correct signature. The security is that even if I see many different signatures, then nobody should be able to forge a new signature on a new message. I might be able to reuse a signature on an old message, but I cannot sign a different transaction, unless I have the secret key.
In general, this is what it looks like. You first send the public key, and then you send a signature and message and then Alice verifies it. You can use one public key, or in some cases you can use public key multiple times but perhaps you shouldn’t. Bitcoin uses ECDSA and it’s a complex protocol and it was designed like that because of a patent conflict with Schnorr signatures, but that patent has expired. ECDSA kind of took over the internet and the market because nobody wanted to use a patent-encumbered signature schemes.
One of the many problems with ECDSA is that it’s malleable. if you remember the security definition, it said you cannot create a signature on a new message. But you can create a new signature on an old message. In ECDSA, you can take one signature on one message and create a different signature on that same message. Well why would this matter? It’s like using a different color for your signature. But in bitcoin this causes transaction malleability problems. So suddenly you have a new txid, it’s different, but still valid. It doesn’t send money to different people, but it causes lots of other problems because you can’t predict the final txid is going to be, and transactions spend coins based on txid. MtGox got fooled on this because they were issuing withdrawal transactions and then somehow they would check an hour later to see whether it was accepted, they would check with “txid” and if it wasn’t they would send out the money again. Someone figured this out and just issued withdrawal transactions by changing the signatures, and mtgox would send the money again and again and again. This is one of the ways that mtgox lost a lot of money. So security matters.
Let’s talk about some different signature schemes. I am going to go through discrete log and pairings. You have probably seen the discrete log problem ,and it basically just says that g and g^x it’s hard to produce x. It’s basically impossible unless you’[re a quantum computer. So the other preliminary is this so-called pairing. A pairing is this construct which takes two things from two elements from this group and it takes g^a and g^b and then you can pair it and map it to an element in the so-called target group. This pairing has this property that the pairing of g^a and g^b is equivalent to the pairing of g g and a times b. This is bilinear math. You can move the exponents around. If you are ever bored, then please look for a trilinear mapping, and it would make a lot of cryptographers happy, and we don’t know how to do that yet. Where you can pair three things together and multiply exponents.
This also implies some other properties– if you pair G and u * v, this is the same as pairing … so what can we do with this? So we can build so called BLS signatures. We need this elliptic curve… only a few have pairings. The secp256k1 curve in bitcoin is not pairing-friendly. And then we have a generator G, and a hash function that hashes into G in this elliptic curve.
How does the signing work? You hash the message into the group element and then raise it to the power of x. So this looks like a signature, but hwo do we verify it? How an we use G, the public key, and the signature and the message to verify it? I am going to give you the firs tpart of this, the signature with G, and anyboy want to take a stab at this? I think it’s too big of an audience for this. What can we do with this? This is atually, if you think about it, this is equivalent to pairing the hash of the message with the upublic key. And so let’s look at this in more detail. What is sigma? It was H(m) to the x. So this statement is the same as H(m^x) but we can move this x exponent around because this is a pairing, it’s the property that a pairing gives us. And then we get the final verification. This has a couple of nice properties.
One of the first ones is that the signature is very short. It’s a single group element, about 32 bytes depending on how much security you want. This is one nice signature. Another nice signature is that it’s deterministic. Given a message and a hash, the signature is just the hash of the message raised to the power x. There is no randomness like in ECDSA or Schnorr signatures. There is some work on definities to use this as creating a randomness beacon. If I have a public key and a message, there will only ever be one signature for that message. Malleability is not an issue, because it’s deterministic.
The other really cool property that BLS has is signature aggregation. Say I have two different signatures from two different people and ompletely separate signatures. What if I just multiplied them together? Sigma 1 times sigma 2. Can I still verify that, given only sigma, and not sigma 1 and sigma 2? Let’s remember this is the normla pairing, signature verification for BLS… but what if we just kind of do the thing on the left? What does this expand to? So you can verify the first signature and the second signature, and this is the final equation over here for the whole pairing. Why does this matter?
Why do we care about signature aggregation? Welll, tthink about it. say we have m different messages and they are not related. But they have different signatures. Now we can take all the signatures, smoosh them togethre and get a single signature on all the transactions. You just multiply them together. It’s a single signature.
What’s another nice property we might want from signatures? The thing about these signatures and things in bitcoin that we just heard a lot about it. There’s this different property that is stronger than multisig, it’s called threshold signatures. Say we have m different parties and we wanted to do this keygen to generate their own keys and we wanted a threshold of them so that t can sign a message. The important thing is that the signature is does not grow with the threshold. It’s still a single signautre. It’s different from multisig where it’s literally just concatenating the signatures together. So if you have a thousand out of 500, 500 of 1000 multisig, then you need to concatenate 500 signatures together. But in threshold signatures, you have a single signature that looks like one person signed it. This is good for privacy because the amount only reveals.. whether it was multisig or just a threshold signature. This is very difficult for ECDSA but there’s been some work on … in 2016.. Gennano et al 2016. It’s very easy in Schnorr signatures. You can do this with 1k out of 2k keys, or even 3 out of 1k keys, etc. It will still look like a normal signature, the blockchain doesn’t need CHECKMULTISIG, just needs CHECKSCHNORRSIG or CHECKBLS or something.
It’s basically- you build this polynomial and every party gets a different point on this polynomial and then if you have enough people, if you have enough threshold they will be able to reconstruct the polynomials just from the points. And then the secret key is just this special point on the polynomial evaluated to zero.
The next signature type that I might be interested is the ring signature. It looks similar to a threshold signature but it’s actually idfferent. We have a set of parties, forming a so called ring, which is where the name comes from. One of them, once you create a signature, nobody should be able to tell who of them signed the message. So there’s no linkability. This came up in early e-cash protocols. If you ever want to use ring signatures, probably use the logarithmic one.
I’ll skip over blind signatures for today.
I want to go straight to zero-knowledge proofs of knowledge. It’s a very hyped topic. There’s also zk-SNARKs and NIZKs. What are the differences? What is a zero-knowledge proof of knowledge? It works as follows. Alice says she knows a solution to a complex euqation, or the decyrption of an encryption that has some properties, some knowledge. I know this, and it has some properties. And then Bob says, prove it. And one solution obviously to prove it is to give the solution. You can just give the private key for example. But what if you don’t want to reveal the solution? Is it possible? Surprisingly, I can convince you that I know the solution, without giving you any information about what the solution is. The way that we do that is through this challenge response protocol.
So you ask me questions that I would not be able to answer if I didn’t know a solution. However, the answers to these questions reveals no information about the solution. This can be in multiple rounds. In the end, Bob will have no idea what the solution is, but Alice must know it, otherwise she would not have been able to answer these questions. It’s counterinuitive but it’s possible. And zero-knowledge proofs have tons of applications in bitcoin, many related to confidentiality, privacy, mimblewimble, zerocash, and so on and so forth and many more.
Let’s do a concrete example of sudoku. Say you wanted to do a zero knowledge contingent payment based on a sudoku puzzle solution, where you want the payment in exchange for the solution. So first we prove that I have the solution, and once you’re convinced, you pay me on condition that I reveal the solution. And the payment is in the blockchain and it’s irrevocable.
There is a zero-knowledge sudoku paper from 2005 (Gradwohl et al., 2005).
https://bitcoincore.org/en/2016/02/26/zero-knowledge-contingent-payments-announcement/
For a non-interactive zero knowledge proof (NIZK), you need to use a hash function. The verifier does the same checks, it makes sure that the hash was computed correctly. This is actually what you’re seeing up there is schnorr signature. This is basically what a schnorr signature algorithm does, the only difference is that we also hash the message here. We hash the message because.. a signature, and a proof of knowledge of a private key are very related. To create a signature, it means I know the private key. It’s a proof-of-knowledge of a private key. The only thing we need is we need to hash this message. So non-interactive zero knowledge proofs and signatures are very closely related. And in the verification you also hash the message.
These are the so-called sigma protocols because they go zig zag zig zag with 3 messages. That’s why they are called sigma protocols. They are good for proving stuff about private keys. If you have public keys, you can do proofs on those properties, or range proofs, or commitments, like pedersen commitments. Range proofs often use something similar to the sigma protocols. Or you can do proofs of solvency where an exchange proves they have enough money to pay all their customers without giving up any information about how much money total they have or which addresses, and it does this with a non-interactive zero-knowledge proof or a sigma protocol. It’s not good for complex statements. If you have a hash function and you want to prove that H(x) is some y, hash functions are fiarly complicated, they have, like, to implement them as a circuit, they have like 20k gates. For example, if yo uwant to do a more complex proof like replace… with a zero knowledge proof.. I don’t know why you would want a zero-knowledge proof there, but you could give a proof that this blockchain is the correct one, which is a pretty complex statement, and sigma protocols are not a good fit for that. The proof size grows linearly with the size of the statement. The same thing applies in the soduko example. The larger the , the more uh, the bigger the sudoku is, I use 100x100 sudoku or 1000x1000 sudoku, then the more rounds I will need to do. The probability of catching a cheater is lower, so I need to do a lot more rounds. So this leads to the goal of we’d like to have… ocmplex statements, but succint proofs. Say 100 byte proofs, no matter what the statement is. And we need verification to be extremely fast. Surprisingly, this is possible.
Now we can come to the cryptocurrency buzzword, whic his preprocessing SNARKs, succint non-interactive arguments of knowledge. It’s not a proof of knowledge, it’s an argument of knowledge. The most important property of SNARKs is that they are succinct. They were called preprocessing SNARKs, originally. The trick is that you encrypt the queries in such a way that Alice can’t read the queries but she can still kind of answer them in a special way. They are encrypted in a way such that nobody can read them, but if you throwaway the… key that it was encrypted with to read them but you can still answer these queries and get the compressed response.
You could combine this with trusted setup where some nice guy is going to send the encrypted queries to Alice, and the short verifying is… shrt so that the answers to the queries to Bob… and hten Bob can use those to verify the proof. So onw we have a non-interactive proof because the setup needs to be done once and then you can reuse the trusted setup as often as you want to. And you can use the short verification key to verify the proof. So the setup and the prover are slower, they are still linear in the statement, but verifying is now very fast.
What’s the problem with trusted setup? Well it could be a malicious setup. Say it was done wrong. Then the prover can actually create false claims because he can see the queries and create proofs that fool the verifier. So if the setup guys are not honest, then he might use his special secret key, this toxic material, to create cheating proofs. So one way to go around this is to use multiple parties in the setup and this is what zcash did because they are using SNARKs.
Can we do better? Can we get rid of trusted setup? It turns out there’s some things we could do. This is one of the most amazing results in theoretical cryptography is the PCP theorem. You can take your sudoku and change it so that it’s slightly bigger.. there’s a deterministic way to change your sudoku in a way that is only a tiny bit bigger. But the amazing property of this new sudoku is that you can open two fields of the sudoku and open up this 2 and this 3 here at the bottom. And if you had, if there was a single error in the original sudoku puzzle, so you didn’t have a complete solution but a partial solution, then no matter what 2 fields I open, with probability 1/3 I am going to catch it. And it doesn’t depend on the size of the sudoku, it could even be a gigantic sudoku. So with probability 1/3 I am able to catch you cheating. So by doing this transformation of this sudoku into this other problem, using PCP thoerem, that’s how it is achieved. If the probability is 1/3 then no matter the size, then with amplification, I only have to look at a very– at a constant number of fields, no matter how large the sudoku puzzle is, or the statement you are trying to prove. The proofs are short. And there’s no trusted setup. Currently this is interactive but you can do something– I am skipping over details– similar to the heuristic.. you use the hash of the first round to make this into something called a computationally-sound proof (or a “CS proof”) which is non-interactive. So now we have this beautiful thing with short proofs, no trusted setup. Why isn’t zcash using this? Well, it’s very theoretical, it’sbeen around since 1991 but it’s still not really practical. The small blow-up is really large in practice. The prover is really slow. We’re just not quite there yet. But there’s some very recent work by the … Stars… STARKs (Ben-Sasson 2017) which was introduced earlier this year which is the first or one of the first feasible implementations and it’s somewhat feasible… 130 GB ram usage, 1.8 MB proofs.
Bulletproofs: short proofs, just like SNARKs or STARKs, they are short proofs but it has linear verification. Been working with Greg Maxwell on this. The proofs are really short. Bunz et al. 2017 based on Bootle et al. 2016. Proofs are very short log(n) for statement of size n. Verification is like proving (slow). Replacement for sigma protocol. It has long verification but it’s still useful for like range proofs, which was our original motivation, which is used in mimblewimble and confidential transactions. Their size is significantly reduced from 4 KB to 670 bytes. Also, you can aggregate them. Two range proofs- 736 bytes instead of 8 KB. For 16 range proofs, it’s 926 bytes instead of 61 kilobytes. This is really nice for aggregation. In mimblewimble, this means the size of the blockchain drops from 160 GB to 17 GB. For confidential transactions, the utxo set size will also drop at the same rate because now your proofs are a lot smaller. It also has a built-in coinjoin protocol for combining confidential transactions. You can use it for solvency proofs, verifiable shuffles or mixnets where people have 10 transactions and they want to create 10 random outputs and nobody is able to link them together so they can use that there as well.