Blockchain Design Patterns
Transcript By: Bryan Bishop
Blockchain design patterns: Layers and scaling approaches
Andrew Poelstra, David Vorick
Alright. Are we ready to get going? Thumbs up? Alright. Cool. I am Andrew and this is David. We’re here to talk about blockchain design patterns: layers and scaling approaches. This will be a tour of a bunch of different scaling approaches in bitcoin in particular but it’s probably applicable to other blockchains out there. Our talk is in 3 parts. We’ll talk about some existing scaling tech and related things like caching transactions and mempool propagation. We’ll talk about some of the scaling proposals out there like taproot and erlay and other things in the news. In the third section, we talk about the more speculative bleeding edge crypto things.
Scaling: goals and constraints
So what do we mean by scaling?
It’s kind of hard to define scaling. People mean different things, like maybe processing transactions in bitcoin. There’s a lot of different stages that transactions can be at different stages and for different use cases those transactions can be in those stages for a long time. As an example, I can make a trillion one-year timelock transactions and I am not sure if it’s reasonable to claim that bitcoin can support a trillion transactions if I can do that.
One thing that scaling is not, at least for the purposes of this talk, are things like relaxing limits. Increasing the bitcoin block size, relaxing that limit, isn’t scaling. So you get 2x the throughput, for the cost of 2x all the resources. This is maybe technically possible, but not technically interesting. It has quite a serious tradeoff in terms of increasing the requirements for people to join and interact with the network. In particular, with the current limits, we’ve seen in a few talks earlier that it’s already impossible to verify the whole blockchain on your smartphone. It’s unfortunate that you can’t download the 300 GB blockchain and verify all the signatures.
I would say that when we think about scalability, one of the key restraints is– who can view our blockchain? Who can be a participant on our blockchain? These are two different things. This isn’t covered in the slides. I wanted to get across that if you can validate the chain, even if it’s too expensive for you to transact, you can verify the honesty of your bank still. So when we talk about increasing the block size or increasing resource limits, you want to make sure that you’re not excluding people from verification. In bitcoin today, that’s probably the biggest contention. When we talk about when things can scale or can’t scale, we want to make sure we don’t exclude people from being able to validate the chain. In order to do that, there’s no way to get around every user has to validate every single transaction. If you want to catch up and run a full node today, you start from the genesis block and you go through the whole history and do all the UTXOs and you validate everything. It’s a multi-day process, and then you get caught up.
A second order concern, and alternative cryptocurrencies address this differently, is the issue of miner safety. Block propagation is important. If it takes a long time for blocks to get around the network, then the miners can use selfish mining. The longer it takes for miners to talk wit heach other, there’s a longer period for bigger miners to get an advantage over smaller miners. We want the well-experienced large miners to be on the same playing field as a newer less well funded entrant to the mining ecosystem.
When we talk about scaling, there’s a few angles we’re coming from. Everything that we talk about, we have to view from the perspective of where we’re limited in the adversarial case instead of the average case. Often when we look at bitcoin network operations today, there are many optimizations that work super well when everyone is being friendly with each other. The relay network or the FIBRE network eliminates a lot of work for blocks, but we can’t use those metrics to decide that it’s safe to scale because these optimizations breakdown when the network is under attack. The limitation is, how do things perform when things aren’t going well? Continuous observation of the network to give you data right now, doesn’t give you data about how it behaves under attack right now.
Bitcoin today: UTXOs
The first part of this talk is about bitcoin today. Things that are deployed and exist in the bitcoin network today. Let me give a quick summary of how bitcoin handles coins and transactions. This is the UTXO model– unsigned transaction outputs. All bitcoins are assigned to objects called UTXOs. These are individual indivisible objects that have a spending policy, and if you know the secret key you can spend the coins. They also have an amount. To spend coins, you have to take a bunch of UTXOs that you own, you spend all of the UTXOs, and then you create new UTXOs belonging to your recipient(s). Since these are indivisible, you usually have to create an extra output for change to send it back to yourself. Other blockchains have accounts instead of UTXOs, like ethereum.
There are a number of reasons that bitcoin chose the design that it did. When you see a new transaction, you know the only way it will effect the validity of other transactions is if they both try to spend the other UTXOs because UTXOs are atomic. So you check each transaction to see if it conflicts with existing transactions. If it conflicts, then throw it away. If not, bring it into the mempool and check its validity. You never need to validate it again if you see it like in a confirmed block or whatever. But in an account based model, you don’t have this simplicity. You can have multiple transactions spending from the same account. If in aggregate some of them spend from the account, then only some will be valid and some will be invalid and you can’t tell in advance which ones will be the ones that get into the block.
There’s also things about proving properties of existence of UTXOs. In bitcoin, UTXOs get created and destroyed. Those are the only two events in the UTXO’s lifecycle. If you want to prove that a UTXO existed at some point, you provide the transaction that created it and a merkle path for the inclusion of the transaction in a blockheader. For things like blockexplorers and other historical analysis auditing applications, this is very convenient. It’s very difficult to do this in an account-based model because you have to maintain the balance of every single account at every single point of time and there’s no nice compact way to tie that back to individual events in the blockchain which makes proving things about the ethereum chain historically quite difficult.
Bitcoin today: Headers-first, transaction relay, and caching
Headers first is very primitive technology today. But when you join the blockchain, you download all the blockheaders which are 80 byte data objects and you try to find the longest chain. The chain part of the blockchain is entirely contained in 80 byte headers. All 600,000 of these are like less than 60 megs of data or something. So you can validate this chain, and then later you download the block data, which is much more data. So you start by downloading the headerchain, and it’s low bandwidth– you can get it over SMS or over the satellite, and then you know the right chain, and then you go find the blocks. Maybe you get it from less trustable sources, since you already have the headers nobody can lie to you about the blocks. The headers cryptographically commit to the blocks. In 2014, headers first got implemented. Before that, you had to join the network and get lucky about choosing peers because you wouldn’t know that someone was lying to you until you were done with the sync.
Another interesting thing is transaction caching. You see a transaction and check its validity. One thing worth mentioning here is that this is complicated by transaction malleability. We used to hear this word a lot before segwit. You can change the witness data in a transaction. Prior to segwit, if you did that, it would change the txid and it would look like a conflicting transaction and it would require re-validation. Even after segwit, it’s possible for someone to replace the witness data on a transaction. This no longer effects the validity of the transaction or the txid, but what it means is that the transaction that appears in a block might not match the transaction data you have in your system, which you should be aware of when you’re designing applications or wallets. You need to design for the adversarial case, which is something you need to think about when you are developing transaction caching.
When you see a block and everything was previously validated, then you don’t need to validate them. With compact blocks, you don’t even need to download the transaction. You get a summary of the block contents and you’re able to quickly tell oh this summary only contains things that I know about and this block is good.
One last thing I’ll mention is dandelion, which is a transaction propagation scheme. The way this works is that you send a transaction along some one-dimensional path of peers during a stem phase. Rather than flooding the transaction to the network, which means the origin node can be determined by spy nodes on the network, you do the stem phase and then a fluff phase which makes it much more difficult to do that kind of spying analysis.
Compact blocks and FIBRE
We’ve talked about compact blocks and FIBRE. People sometimes get this confused. Compact blocks is about propagating blocks to nodes in the hopes that they already have the transaction. FIBRE is about miners communicating blocks to each other, and its goal is to minimize latency. Miners learn new blocks as quickly as possible. FIBRE is super wasteful at bandwidth. It uses an error correcting code to blindly send it among as many paths as it can to sort of stream the data. As long as it sees as much data from many different nodes, it can reconstruct that block basically. So one is for nodes, one is for miners.
On-chain and off-chain state
In a moment, we’re going to talk about layer 2.
There’s a philosophical difference between the bitcoin community and what used to be the ethereum community even if they’re coming around to this idea now with eth2 and sharding. The idea is that the blockchain is not a data storage or communication layer. If you can avoid it in any way, you should not be putting data on the blockchain. Once you put data on the blockchain, it needs to be globally replicated and verified. This is very expensive to the world and to the network, and this in turn means there’s more network fees, and it creates even more spacity for space and now you have to bid for the space to get into a block quickly.
Rather than trying to extend the capabilities and the space available within bitcoin blocks and transactions, much of the innovation in the bitcoin space has been about finding creative ways to avoid using existing technology and just anchor it in a minimal way into the blockchain. The purest example of this is the opentimestamps project. Have people heard of opentimestamps? A smattering. Do people use opentimestamps? You should totally use it. It’s open-source and free. I think petertodd is still paying the fees out of his pocket. It’s cheap to run. You can commit to data laying around on your hard drive like websites you visited, emails you have received, chat logs, whatever, you produce a cryptographic hash committing to that data. None of your data leaves your system. You send this hash to the opentimestamps server, it aggregates them, it produces a merkle root committing to millions of documents. petertodd was also timestamping the entire contents of the Internet Archive. Then you commit this 32 byte hash to the blockchain, and you can commit to an arbitrary amount of data. A commitment in bitcoin is a proof that the data existed at a certain time. The consensus assumption is that everyone saw the block at roughly the timestamp that the block was labeled with, or that everyone can agree when the block was created so you get a timestamp out of that.
You’re anchoring data to the blockchain, not publishing it. Putting data on the blockchain is a so-called proof-of-publication. If you need to prove that some data was available to everyone in the world at some point, you can put it in the blockchain. But typically when people think they need to do this, what they really need to do is anchor their data to the chain. There are some use cases where you really need proof-of-publication but it’s much more rare. I’m not aware of a real trustless alternative to using a blockchain for this, but I really wish people wouldn’t put all this arbitrary data in the blockchain.
Hash preimages and atomic swaps
We can guarantee that a cascade of events will all happen together or not happen together. This is kind of an extension of proof-of-publication. We have something where if a piece of data gets published, like a signature or something, we can guarantee that this cascade will trigger. This is a layer 1 tech primitive that leads into some very nice things that we can do in layer 2 that make blockchains more scalable. Publishing hashes and hash preimages is an example of where you need proof-of-publication because you need to make sure that if someone takes in money and the hash shows up in public, then the other person should be able to take their money on the other side of the atomic swap.
Replacing transactions: payment and state channels
gmaxwell in a 2013 bitcointalk post– it’s funny how old a lot of this stuff is. A theme that we’re not going to talk about in this talk is all the coordination and underlying blockchain tech and moonmath and crypto tricks are relatively simple, it’s coordination that is hard. If you do the atomic swap protocol that David just described, where they ensure their swap is atomic.. Say two parties do this, but rather than publishing the final transactoin with the hash preimages, the parties exchange this data offline with each other. At this point, they have avoided publication because they have only sent the data to one another. Now they can both get their money if the other party leads. As a last step, they throw out the transactions. Well, they replace the transactions cooperatively with an ordinary transaction that spends their coins to the right places without revealing any hash preimages. This is a little bit smaller because they are no longer revealing so much data. This is of course not an atomic operation; one party can sign and take the coins. But the blockchain is there ready to enforce the hash preimage thing to take the coins. So now they can do a unregulated atomic swap knowing that if anything goes wrong, they can fallback to on-chain transactions.
This idea of replacing transactions not published to the blockchain, with the idea that you are only going to publish the final version, is a very important concept. This can be generalized to the idea of a payment channel. At a high level, ignoring all the safety features, the idea of a payment channel is that two parties take some coins and put some coins into a 2-of-2 multisig policy. They repeatedly sign transactions that give some money to the other party or whatever. They keep doing this replacement over and over again, without publishing the final state to the blockchain because they expect further updates. There’s a lot of complexity to get from this to an HTLC and modern payment channel, but it’s the basic idea.
These payment channels and replacement gets you a lot of scalability because you’re not limited in the number of transactions you can do between now and the next block that gets mined. You’re replacing transactions rather than creating new ones that need to get committed, because of the way you setup the 2-of-2 multisig setup.
Payment channels and revoking old state
One problem with payment channels is that if you keep creating new transactions repeatedly, there’s the concern that one of the parties can publish an old state. None of these transactions are inherently better than the others. Because they are always spending the same coins, only one will be valid. But the blockchain isn’t going to do anything for you to ensure that the “right one” is valid. This seems like a no-go beczause you want to cancel all of the old states each time you replace. That’s what replacement should mean. You could double spend the coins and re-start, but then you would have to create a transaction and publish it to the blockchain. So you don’t gain anything. But there’s another way, where you create a new transaction that undoes the first transaction. So you create a second transaction that spends the original coins and gives it back to the original owner, and the original transaction is now revoked. If that first transaction hits the blockchain, then the other counterparty has a penalty transaction or other transaction they can publish to get their money back. This is fine, but this requires a separate revocation transaction to be created every time you do one of these state updates. This is in fact how lightning network works today; every payment channel has some extra state that parties need to store for every single channel update. There’s some ways to mitigate that.
This idea shows up quite a bit when people are proposing extensions to bitcoin: why don’t we provide a way to cancel old transactions directly? What if transactions have a timeout, and after that timeout, they are no longer valid? Or what about a signature you can add that will invalidate the transactions? What about a way to make transactions invalid after the fact? There’s a few reasons why this doesn’t work. There’s the ability to cache transactions in your mempool. If you have some transactions cached in your mempool that can be invalidated by some event like maybe a new block appearing and invalidating it, each time one of these events happens you have to rescan all of your mempool and update the transactions or remove them. This requires expensive scans or complex reindexing or something. One further complication is that this is made worse by the fact that the blockchain can reorg. You have to rescan the chain to see which transactions have been invalidated, unconfirmed or absent. It’s a big mess. It also creates a fungibility risk because if a transaction could be invalidated– like say a transaction was in a blockchain, a reorg happens and it becomes invalid. Every transaction that spent its coins, the whole transaction tree graph below that point, becomes invalid too. Coins with an expiry mechanism are riskier to accept than other coins. This is true by the way of coinbase transactions and miner rewards. In bitcoin, we require 100 blocks go by before miners can spend their coins because we want to make sure it’s next to impossible for coins to be reorged out if a reorg would cause those coins to be invalidated. A cancelation has the same issue. A lot of these proposals just, don’t work. It’s difficult to create a revocation mechanism that actually gets broadcasted to the whole network. If you revoke a transaction and it gets confirmed anyway, without a consensus layer which a blockchain is supposed to be anyway, how do you complain about that?
Linking payment channels
This is how the lightning network works. It uses HTLCs. It’s a hashlock preimage trick to make transactions atomic. You link multiple payment channels with the same hash. You can send money through long paths. This proof-of-publication mechanism guarantees that if the last person on the path can take their money, all the way back to the original sender so the money goes all the way or it doesn’t go at all. Each of these individual linked payment channels can be updated independently by the mechanism of creating revocation transactions and replacing the original transaction. Finally, there are some other security mechanisms like some timelocks like when anything goes wrong in the protocol then money will eventually return to where it comes from. There’s thus a limited hold-up risk related to this timelock period. I’m not going to talk about the details of the timelocks or about linking payment channels across chains, or multi-asset lightning.
I just want to talk about the specific scalability tricks that go into forming these payment channels. These are quite general tools you can use, and you can use it to create cool projects on the bitcoin network without requiring a lot of resources from the bitcoin chain.
This section is about proposals that are generally speaking accepted, things that have not been deployed yet but we believe they will be deployed because the tradeoffs make sense and we like the structure. First one we’re going to talk about is an erlay.
Erlay deals with the transaction propagation layer. It’s consensus critical in the sense that if you’re not getting blocks or transactions then you can’t see the state of the network and you can be double spent. If you’re not aware of what hte most recent block is, or what the best chain is, you can be double spent and you’re at risk. On the network today, the transaction propagation system takes the most bandwidth. Because of that, we put a small cap on the number of peers that we actively go and get, which is 8, although a pull request has increased this to 10 for block propagation only not transaction propagation. 8 is enough to create a well-connected robust graph. Sometimes you see eclipse attacks where it often feels fragile.
What erlay does is it’s a set reconciliation mechanism where two different peers can say in very brief terms what transactions they have. They can do set reconciliation and they can figure out what each of them have that the others don’t. The trick here is that we substantially decrease the bandwidth we take, to stay up to date with the most recent transactions and the most recent blocks. This means it would be easy to increase from 8 peers to more. Today, when you quadruple your peers, you quadruple your bandwidth overhead. So it’s linear. But with erlay, it’s closer to constant. The amount of bandwidth with 100 peers, with erlay, is not as high as the previous linear bandwidth overhead. This reduces the cost of running a node and staying well connected.
This was developed in part by gleb naumenko. Ah, he’s hiding in the audience. Hopefully he wasn’t shaking his head because of anythin wrong we might have said.
Q: If we’re talking adversarially, then what’s the adversarial model here? It’s optimistically constant time. But what about the adversarial case?
A: I believe it remains constant time. You’re connected to peers. If you have an adversarial peer and they are consuming more bandwidth, you can detect that they are an adversarial peer. You can decrease their score and ban them. Under the erlay model, I believe an adversary cannot interrupt or degrade your connects with good peers, and you also get tools for detecting which ones are good peers.
Q: So you have to hold these transactions in order to give those sketches? Can I spoof having the exact same set as you?
A: No. You cannot spoof the sketches unless you have the actual transaction data. Or at least unless you have the transaction hashes. What you’re doing by spoofing those, you’re increasing your score. I see. I don’t think you would accomplish a lot by such an attack, other than avoiding getting yourself banned for being a spy node. If you were a spy node, and you had no transactions and you weren’t broadcasting new transactions, then you can’t spoof these sketches but even if you could then that’s a minimal harm thing to do.
Q: Is this the minisketch library from Pieter Wuille?
A: Yes. Erlay is transaction propagation– the proposal for transaction propagation in the bitcoin network. A key component of it is a set reconciliation algorithm, which is how the minisketch comes in. It’s for set equality and reconciliation. Minisketch is part of erlay.
Q: Does increasing the number of connected nodes, give privacy concerns to be connected to more spy nodes?
A: So the concern with connecting to more spy nodes is that, those nodes– so if a spy node can create a lot of connections, then it can see where transactions are being broadcast from and infer things about their origin. By use of erlay, if you connected to more spy nodes and not more real nodes, then I guess that would make things worse. But if everyone is connected to more nodes, then transactions can get further into the network using fewer hops which gives less information to spy nodes even if they have more connections. Dandelion can resolve a lot of these issues too. If you have a mechanism to drop transactions from random places, being connected to more spy nodes is not going to harm your privacy.
Compact threshold signatures
The next few slides are building up to a proposal called taproot but let me talk about some components first.
The first big hyped component of the taproot proposal for bitcoin, and hopefully– it’s an almost complete proposal. In any world except bitcoin, it would already be considered complete. Anyway, the first component is something called Schnorr signatures. As Ruben Somsen described, Schnorr signatures are an alternative to ECDSA. It’s a digital signature algorithm. It’s a replacement for ECDSA. It’s functionally the same. They are in many ways very similar. With both Schnorr and ECDSA, you can get compact 64 byte signatures. In both Schnorr and ECDSA you can do batch validation. But with Schnorr signatures, a lot of these things are much simpler and require fewer cryptographic assumptions. Secondly, the ECDSA signatures deployed in bitcoin are openssl-style ECDSA signatures. Satoshi originally created these by using the openssl library and encoding whatever came out of it. As a result, ECDSA signatures on bitcoin today are 72 bytes instead of 64 bytes and there’s literally no value gained by this. These are extra encoding bytes that aren’t needed at all. This is useful for larger crypto systems where you have objects that aren’t signatures and you don’t care about a few bytes; but in bitcoin we certianly do care about several bytes times 400 million transactions.
Another unfortunate thing about ECDSA signatures in bitcoin is that you can’t batch validation. This is half from ECDSA but half of how bitcoin does things. Basically the ECDSA verification equation only checks the x coordinate of some of the curve points that are involved. It’s a true statement that every x coordinate that corresponds to a curve point actually corresponds to two curve points. There are actually two different points that could be used in ECDSA signatures that would both validate. This is not a security loss, it’s not interesting in any way except that this fact means that when you try to batch validate ECDSA signatures and you have 1000 of them and you’re trying to combine them into an equation, the inputs to that equation are ambiguous. There’s this factor of 2 where you have to guess which points you’re working on; as a result, there’s a good chance your batch validation will fail even if the signatures are valid. We could in principle fix this with ECDSA. But we don’t want to; if we’re going to replace ECDSA signatures, all the complexity comes from the consensus layer, so we might as well replace the whole thing with Schnorr signatures anyway.
The first feature of Schnorr signatures that is really genuinely much easier with Schnorr signatures is this thing called threshold signatures. Ruben almost talked about this. Say you have multiple parties that together want to sign a transaction. The way that this works in bitcoin today is that they both contribute keys, they both contribute signatures, validators verify keys and signatures independently because they can’t batch them. It’s 2x the participants and 2x the resource usages.
But with Schnorr signatures, it’s very simple for these parties to instead combine the public keys, get a joint public key that represents both of them, and then jointly produce a signature such that neither of the individual parties are able to make a signature by themselves. This is identical to 2-of-2 in semantics, but what hits the blockchain is a single key and a single signature. This has a whole pile of benefits, like scalability because there’s much less data hitting the chian. Also, this improves privacy because blockchain validators no longer know the number of participants in this transaction. They can’t distinguish normal transactions from 2-of-2 things. Another benefit is that you are no longer restricted to what the blockchain supports. The blockchain supports signatures; you don’t have to worry about how many signatures you’re allowed to use or whatever. There’s all sorts of questions you have to ask right now to use native multisig in bitcoin, which you can avoid.
With some complexity, you can generalize this further. I said 3-of-3 and 5-of-5, but you can actually do thresholds as well. You can do k-of-n for any k and any n provided n > k. Similarly, this produces one key and one signature. You can generalize this even further, and this could be an hour long talk itself. You could do arbitrary montone functions where you can define– you can take a big set of signers and define any collection of subsets who should be able to sign, and you could produce a threshold signature like algorithm for which any of the admissable subsets would be allowed to sign and no other subsets. Monotone just means that if some set of signers is allowed, then any larger set is also allowed. It’s almost tautological in the blockchain context.
Ruben gave a whole talk on adaptor signatures. We use the proof of publication trick and we encode this in the signature. Rather than a hash preimage in the script, you have this hash point and this extra secret. This completely replaces the hash in the preimage. What’s cool about this is that once again all you need is signature support from the blockchain, you don’t need additional consensus rule changes. You just throw a signature on the chain. Another benefit is privacy. There’s a lot of cool privacy tricks. One signature hits the chain, no indication that adaptor signatures were involved. You can also do reblinding and other tricks.
The proof-of-publication requirement we had for the preimages and the hashes and the preimage, now become the proof-of-publication requirement for the signatures. That’s kind of inherent to a blockchain. Blockchains have transactions with signatures. There’s no way with existing tech to eliminate the requirement that all the signatures get published, because the chain needs to be publicly verifiable and everyone needs to be able to verify the transactions which means they need to be able to validate the signatures. We get more capabilities with adaptor signatures, and we require less out of the blockchain. That’s my definition of scalability.
Taproot: main idea
Before I talk about taproot… Up til now, I have been talking about Schnorr signatures and how you can combine these to have arbitrary complex policies or protocols. It’s one key, one signature. The idea behind taproot is that if you can really do so much with one key and one signature, why don’t we just make all the outputs in bitcoin be one key and all the witnesses be one signature? Instead of bringing in this whole script evaluation aparatus, then we can bring in one key one signature and then that’s great. Your validation condition is a nice little algebraic equation and there’s little room for weird consensus edge cases. You also get batch validation. I should mention that this is kind of how mimblewimble works: you only have an ability to verify signature. Historically, this whole adaptor signature business came out of mimblewimble and in particular the question of how to add scripting to mimblewimble. As soon as I came up with this idea, I brought it to bitcoin and forgot about mimblewimble. As soon as I could do it with bitcoin, I ran away.
But bitcoin isn’t mimblewimble, and people use scripts like timelocks and lightning channels. petertodd has some weird things like coins out there that you can get if you collide sha1 or sha256 or basically any of the hash functions that bitcoin supports. You can implement hash collision bounties and the blockchain enforces it.
(An explanation of the following Q&A exchange can be found here.)
Q: Can you do timelocks iwth adaptor signatures?
A: There’s a few ways to do it, but they aren’t elegant. You could have a clock oracle that produces a signature.
Q: You make an adaptor signature for the redeem part, but then you do a joint musig signature on another transaction with a locktime of just… rather than having to do script.
A: That’s cool.
Q: You didn’t know that?
A: This is one way it’s being proposed by mimblewimble; but this requires the ability to aggregate signatures across transactions.
Q: No, there’s two transactions already existing. Before locktime, you can spend wit hthe adaptor signature one like atomic swaps. After locktime, the other one becomes valid and you can spend with that. They just double spend each other.
A: You’d have to diagram that out for me. There’s a few ways to do this, some that I know, but yours isn’t one of them.
For timelocks, it appears that you need script support. That’s my current belief at this time of day.
Taproot does still support script. The cool thing is that taproot hides the script inside the key. You take the public key, and you can sign with it. But you can transform this key into a commitment to arbitrary data. The way you do this technically is you take the key and the extra data, you hash it all up, you multiply by the curve generator, and you add the result to your original key. So now you have a key that looks and acts just like an ordinary key, it’s still 32 bytes. It’s still 32 bytes. It’s still something that can sign. There’s nothing remarkable about it. Outside, you can’t tell there’s anything special about it. But secreltly, it has committed to extra script data. If you need to use the extra script features, then you have the option to do so. So you reveal the script and show that, and the validators once they see the information, are able to check that the pubkey tweak was produced correctly and they can see that the key was committing to this script, and now I will allow the script to spend the coins and a valid witness for that script. If you need the script, you reveal a witness. If you don’t need the script, you publish nothing not even the hash of the script. Pretty much all of these applications can fit in these keys, except the hash bounty.
Q: … MAST?
A: Does taproot supercede MAST?
I have a slide about this. Does taproot supercede MAST? MAST is this idea that has been floating around since 2012. It might have been russell o’connor on bitcointalk. I should stop repeating this history until I double check. The idea of a merkleized abstract syntax tree is that you have a script that have a whole bunch of independent conditions. You put them into a merkle tree, and you only reveal the branches thta you want to use at spending time. The other branches stay hidden under the other cryptographic hash commitments. There’s almost no tradeoff to using MAST over not using MAST, except that there’s many ways to do MAST. There’s perpetual bikeshedding and so many competing designs for MAST and none of them are any particularly better than the others. In the early days, nobody was using script for anything interesting, so the practical benefit of MAST back then, wouldn’t have been useful.
Instead of committing to a script, you can commit to a merkle tree of scripts. So there you go, that’s MAST. So in fact, the taproot proposal includes MAST as part of it. It’s separate from the committing structure I defined, it’s there, with a few couple advanced cool things that we noticed as part of implementation. We have a form of MAST where you provide a proof showing where in the tree the script you’re using is, we have a way of hiding the direction it’s taken. So it’s difficult to tell if a tree is unbalanced, and you learn very little information about the shape of the tree, and a few other efficiency things like that which I don’t want to get into. They are in Pieter Wuille’s fork of the bips repo which has a branch that has a draft BIP that has a few pull requests open on. It’s the first hit on google for bip-taproot so you can find it.
One other thing I wanted to mention here is that we eliminate the CHECKMULTISIG opcode in bip-taproot. I mentioned that this is the way in bitcoin how you provide multiple keys and multiple signatures. This opcode kinda sucks. The way it works is that the verifier goes through the list of keys, and for every key they try a signature until they find one that validates with that signature. So you wind up doing a lot of unnecessary signature validations, they just fail. They areu nnecessary. On its own, this prevents batch verification even if the signature algorithm itself was supposed to support batch validation. This wastes a lot of time wiating for invalid ones to fail. Russell O’Connor came up with a way to avoid this with pubkey recovery. It’s a very irritating opcode. It’s irritating for a pile of other reasons. I could do a full talk abou why I hate OP_CHECKMULTISIG. Another issue is that there’s an opcode limit in bitcoin, pushes don’t count, you have a limit of 201 opcodes. You can read the script and pick out the opcodes. But there’s an exception: if you have a CHECKMULTISIG opcode, and then you have to go back and evaluate all the branches, and then if CHECKMULTISIG is executed, you add the number of keys in the CHECKMULTISIG. If it’s not executed, then you do nothing. There’s a bunch of extra code dealing with this one opcode which is just, who knows, a historical accident, and it makes everything more complex.
Q: So CHECKMULTISIG is not going to be in tapscript?
CHECKMULTISIG does have one big benefit over interactive signatures, which is that CHECKMULTISIG is non-interactive for multisig. With threshold multisignatures I’ve been describing, all the people need to interact and they need their keys online and this isn’t always practical because maybe you have keys in vaults or something. If you want to do multisig with Schnorr and taproot, we have OP_CHECKDLSADD. It acts like a CHECKMULTISIG, but when it passes, it adds 1 to your accumulator. So you just count the number of these, and then you check if the accumulator is greater than or equal to whatever your threshold is. This gives you exactly the same use cases as CHECKMULTISIG but with much simpler semantics, with the ability to batch verify, and without the insane special cases in the validation cases.
I think that’s the last taproot slide.
Revocations and SIGHASH_NOINPUT
Let’s double back a little bit to lightning.
The taproot talk was about how can we get rid of the hash preimages in lightning. There’s another issue in lightning, which is the revocation transactions. There are basically, every time you do a state update, there’s an extra transactions that both parties need to hold forever. If you’re doing watchtowers, then the watchtowers need to keep all this evergrowing state.
One proposal for bitcoin that would eliminate this complexity is this thing called SIGHASH_NOINPUT and basically what it allows you to do is create a signature that spends some coins that is valid for spending any UTXO any coin that has the same public key. Then there’s a proposal for lightning called eltoo. I think there might be other proposals that use this. It uses a feature of script to restrict this a little bit. The idea is that when you update your state in a payment channel, you createt a new transaction using SIGHASH_NOINPUT flag, and this new transaction is allowed to undo the old state and also every state that came before it. So you’re still doing these updates, but each update accomplishes the work of every previous revocation or update. You have state to keep around, but it’s just one transaction and it scales with O(1) instead of O(n). This eliminates one of the only scalability issues in lightning that is asymptotically really bad.
The lightning people are really excited about this. They are asking for SIGHASH_NOINPUT if we do tapscript or taproot. Unfortunately people are scared of NOINPUT. It’s very dangerous to have the ability to produce a signature for every single output with your key in it. Naively, anyone can send coins to your old address, and then use that NOINPUT signature to scoop those coins away from you. You would need a really badly designed wallet using NOINPUT, and people worry that these wallets would get created and users would be harmed. There’s been a lot of discussion about how to make SIGHASH_NOINPUT safe. Hopefully this will get into bitcoin at the same time that taproot does. There’s a lot of design iteration discussion on SIGHASH_NOINPUT that didn’t need to happen for all the other features of taproot. But this is a really important layer 1 update.
Q: Doesn’t this make it harder to reason about the security of bitcoin or bitcoin script?
A: Yeah. I do. The existence of sighash noinput makes it harder in general to think about bitcoin script. … With noinput, maybe somebody tricks me and asks me to do a blind signature. It turns out this blind signature is a NOINPUT signature, and now they can get all the money at once. This is the kind of problem that complicates the script model. I have a solution to this somewhere on the mailing list somewhere. It’s extra thought, extra design work. Chaperone signatures…
Initially I was opposed to SIGHASH_NOINPUT because I thought it would break a use case for various privacy tech. The fact that it doesn’t break it, is not trivial. NOINPUT is difficult. Let’s move on.
A lot of these improvements we’re just going to only touch the surface level. If you want to ask questions or go deeper on a particular one, we’re certainly happy to do that. But there’s a lot of scalability proposals out there. Some of them are more interesting than others. We tried to cherrypick things that are very beneficial for scalability that we liked, and then some that are innovative or maybe they break things. Some of them we put them in here because they are extremely popular even though we don’t like them.
Utreexo out of all of these are hte most practical and most realistic of the speculative improvements. Today one of the big bottlenecks with bitcoin is blockchain validation. In particular, you need a ton of disk space and you need to keep track of the set of coins that are currently spendable. With utreexo, we merkleize this whole thing and we keep one merkle root and we get rid of the need to keep a database. Today to run bitcoin consensus, you need database software and keeping things on disk. That’s rough. Database optimization is incredibly involved and it’s difficult to be certain that it’s implemented correctly. If we switch to utreexo, then we can throw away our database software. Every time osmeone spends, they would provide a merkle proof showing that the coin was in the merkle root. This allows us to essentially keep things on disk. We can fit an entire blockchain validator on a single ASIC. We could create a chip that can validate the bitcoin blockchain just as fast as you can download it. The major tradeoff is that every transaction needs a merkle proof alongside it, which is a couple hundred bytes. There’s a few ways to help with that though.
Tadge did a talk yesterday on utreexo.
Client side validation
Currently no ongoing research into client side validation. The idea is to not put transactions on the blockchain at all. Most output history wouldn’t be visible. Since we’re not putting it on the blockchain, that means the blockchain takes less bandwidth and takes less computation to verify. It’s much simpler. To validate an output, if someone spends coins to someone else, you send that someone whoever’s receiving the output, the sender will provide them with the entire history of that output back to the coinbase where it was originally mined.
This has a few practical challenges that can be solved with a lot of interesting engineering and a marketplace for small outputs or linearized outputs. But basically, the general idea is that people instead of having the whole history of the chain, you have your coins' whole history and you send this history when you send coins to a recipient.
This proposal is impractical for a few reasons. One problem is that if you lose your history, you lose your coins. So you can’t do seed-based recovery. There’s a lot of mission critical data that needs to be maintained. That’s a UX tradeoff which is not preferred.
The second thing that really harms client side validation is these histories– if you spend multiple outputs, then you have to include both. So these histories can blow-up a lot. You can avoid this by doing a process called linearization where you make sure you never combine outputs. So you just put in like a soft rule that outputs can never be combined and then if you need to spend a certain amount, you do trades on marketplaces. There’s a lot of engineering that has to happen to make that work. When it comes to making blocks, you get a lot of savings by doing complex signature aggregation which also implementation wise ended up being very different.
All of this together gets you at most a 5x scalability improvement over what we have today. With all the complexities, it’s not practical. But in theory, it’s sound, and it would give us a 5x scalability improvement.
Q: Aren’t the UX challenges here very similar to the off-chain techniques like in lightning network? You have to store some channel updates you can’t recover purely from seed data. I agree these are challenges, but aren’t we solving these anyway for off-chain second layer things? Even scriptless script stuff has stuff you can’t derive from the seed. It’s comparable to other problems we’re solving anyway.
A: It’s very similar in a lot of ways. The problem here is a lot stronger because for example if we switch to client side validation, then everyone has to save everything. Whereas with lightning, it’s opt-in and fractionally opt-in. I don’t need to put my whole portfolio into lightning. I can just do an amount I can afford to lose and not put a lot of coins at risk. I think the challenge is stronger here, but it’s a similar flavor.
Q: As a miner, I would just stuff random invalidation transactions into my block with as much fee as I can manage. You as a non-mining client can’t do much about this.
A: I glossed over that part. So the way we stop this, is that every– there’s two directions. The first thing is that every pubkey put into a transaction has to have a signature saying the owner of this pubkey authorizes that yes I appear in that block. Basically when I receive a proof, like here’s a giant history, and every point in that history, I have to ask the question, was it double spent between the time you say it was spent and the time that I received it? You can answer that question if you can look through the blocks and see if these public keys don’t appear anywhere in the blocks, then I know there’s no double spend. If it does appear, then you need to prove that this public key isn’t related to this output it was some other output. The other side of the defense is that if miners are making up random public keys and throwing in garbage, when the miner tries to spend that, they will not be able to provide a full historical proof. If a miner poofs coins out of nowhere, there will be no valid proof of those coins back to a coinbase transaction. To solve this, this requires a lot of signatures, and ideally you aggregate them so it doesn’t take up chain space.
Q: Other than the 5x scaling advantages, aren’t there other advantages like breaking SPV? So no more ambiguity that miners are validating. Now miners just prove publication without validation rules. Everyone really must run a full node. Plus, I guess, you would have a lot of privacy advantages not necessarily scalability.
A: I would say there are privacy advantages certainly. One thing I think about from a security model is what can the miners do that is malicious? Miners still have a validation rule. Every pubkey that gets spent in the block has to be listed in the block. The miner must include an aggregate signature or a series of aggregated signatures demonstrating that this pubkey was added to the block with permission. This gives miners less room to do manipulation. There are so many engineering challenges for linearizing history and doing signature aggregation and it just seems like too big of challenges relative to the scalability benefits you get, plus you have to move the whole chain to this model which is a non-starter.
Q: .. if you can’t rely on the miners to validate transactions and to enforce the new rules… then basically every… right?
A: No. You can add soft-forks because— a soft-fork..
Q: A soft-fork from the perspective that all the nodes… but nodes that…
A: Basically, you would get versioned outputs. Basically, if I’m on some soft-fork version, or even some hard-fork version, I have a different set of rules by which I accept outputs. When a sender sends me coins, I apply my rules to the history of those coins. You could get into situations where you have to ask someone to upgrade. If it’s a soft-fork, then you don’t have to ask them to upgrade they will blindly accept. If it’s a hard-fork, that’s another one. It allows you to add breaking changes to bitcoin without breaking everyone simultaneously. You break outputs progressively as they get spent in the new paradigm. It’s an interesting model. I haven’t thought about it before. I think that’s an interesting property of client-side validation.
This is another popular one. Ethereum uses a not-correct version of something called Ghost DAG. The bitcoin blockchain is a straight line, if you get forks or competing blocks then only one of them wins. You discard everything else. I think there was a 2015 paper called “inclusive lbockchain protocols” which presented an approach using directed acyclic graph approach to blockchain. Multiple miners can release blocks at the same time, and then there’s an algorithm to linearize those blocks and the invalid transactions you just ignore. The DAG approach solves particularly miner challenges. It allows you to have faster block times, lower safe confirmation times, it reduces the pressure on miners to have fast propagation. So fast propagation is no longer as important. It doesn’t help with initial block validation, you still have to check every output every spend. That resource bottleneck doesn’t change, therefore this isn’t seen as true scalability at least today because it doesn’t solve our biggest bottleneck.
Another thing that slows down DAGs from being accepted in bitcoin is that it’s a pretty serious hard-fork. It’s a big deal to change how the blockheaders work. This makes SPV a lot more expensive because there’s a lot more blocks. The other thing really holding this back is that it has deep game theory problems. I think there’s a research paper that went and fully fleshed out the incentives. I’ve had it on my reading list for a while but I can’t find it. Until we recover that paper, or someone else writes a new one, we don’t have a formal outlay of the incentives and we don’t have a convincing formal proof that selfish mining techniques are strictly less effective on DAG-based protocols than they are on bitcoin. I currently think this is true, but I don’t think there’s a proof out there. You wouldn’t want to switch to this. The whole point of DAGs is to make selfish mining less bad, and we need a proof that it actually achieves this, and I don’t think we have a fully fleshed out system yet.
Q: Would it be correct to say, a 51% adversary, block time isn’t really reduced– meaning the safe confirmation time?
A: Yes. Correct. DAGs are only advantageous in terms of confirmation security if we’re assuming that nobody has greater than 51% of the hashrate. If someone does have greater than 51% of the hashrate, then things are falling more back to meta incentives anyway.
Confidential transactions is the idea that you can replace the amounts with homomorphic commitments that hide what the amounts are, but they still bind to the amounts so nobody can pretend they are something they are not. Homomorphic means they can be added. Validators can add up all the inputs and all the output amounts and they can tell whether or not they balance to zero, by checking those homomorphic commitments, or by using a proof of zeroness.
Confidential transactions aren’t an asymptotic hit to validation time or size. They add a 650 byte object called rangeproofs. These originally used to be 3-4 kilobytes but thanks to bulletproofs they are now “only” 650 bytes each which is cool. They can also be accumulated across outputs so in some contexts it could be 700 bytes per transaction, but that’s still not great.
If we had confidential transactions in bitcoin, and then you remove all the other things in bitcoin, you get left with mimblewimble which has some cool scalability like removing old unspent outputs. That’s cool, but you would have to remove script from bitcoin. That’s not going to happen. Bitcoin blocks arne’t going to be replaced by mimblewimble blocks ever due to the loss of functionality.
The other issue is that these things aren’t quantum secure. Right now in bitcoin all of our signatures will be broken as soon as there are practical quantum computers that exist. If we had confidential transactions, then not only would the signatures be broken and coins vulnerable to theft, but also the soundness of the system would be in question. Anyone who claimed to own any maount of coins, just instantly everything is gibberish. petertodd has argued this is a benefit for something like mimblewimble– if a quantum computer shows up, then every single output that ever existed on the mimblewimble chain can suddenly - to every value. So the entire chain then deletes itself. So that’s pretty cool. That’s kind of crazy. But it doesn’t fit into the bitcoin ethos super well…. a very vocal segment of bitcoiners don’t want everything to go away. So until we have a quantum secure version of something like bulletproofs or confidential transactions, my feeling is that it’s not going to happen in bitcoin.
STARKs and general zero-knowledge proofs
In confidential transactions, we have a rangeproof attached to every output. This is an example of a zero-knowledge proof, specifically one to prove a range. The purpose of this is to make sure that ranges aren’t going to overflow, that the values aren’t going to be negative numbers and won’t overflow.
But in general, it’s possible to prove anything you want in zero knowledge. This has been generally known since the 1980s. Since the early 90s, it was known to be possible to be done and within the bounds of the known universe. More recently in 2013 there’s been a tremendous amount of development in really practical general zero-knowledge proof systems.
STARKs are developed by Eli Ben-Sasson’s and others here in Tel Aviv. STARKs are very fast to verify. They are asymptotically very small- they grow with the logarithm of the size of your program execution. Unfortunately, they have large constants like 10s or 100s of kilobytes. Practical STARK programs end up being 50 to 100 kilobytes. They are quantum secure, and they are fast to verify. They scale really well.
One application of STARKs if they were to be much smaller would be a quantum resistant rangeproof which would be interesting, which could lead to a confidential transaction proposal for bitcoin. But they are much more general than this. In theory, you could prove script conditions. You could hide your amount but also your spending conditions. You get something like taproot without all the tricks and the interaction and all these caveats about how sometimes you still need to reveal your scripts.
STARKs can also be used to basically do batch validation. You can produce a STARK proof of the validation of every single signature in the bitcoin blockchain for example. This is a little bit outside of feasibility but with special purpose ASICs you could probably produce such a proof. Now validation of every single signature comes down to validating this small STARK proof which is a few hundred kilobytes of size, rather than dealing with the 100s of gigabytes of data you presently need to do EC operations on. You could compress the whole blockchain signature validation workload into a single proof. We’re stretching what’s practical to encode; you would have to implement an entire script interpreter into a single zero-knowledge proof system which is very far away from our current ability to design programs.
It’s promising tech, and it’s practical for a lot of purposes even right now. It’s mindblowing and surprising that we can even do this. There’s llots of ways in the immediate future that these types of things could be used to improve the scalability of the current system, and also they interact pretty well with the other proposals. Utreexo proofs could be replaced with STARK proofs. The utreexo proofs are small, but STARKs could do aggregation of those. Also, you can do it piecemeal and you don’t need to get everyone to do that.
At some point, this will be mature and efficient enough taht we will start seeing it in the blockchain and see real scalability improvements.
We have two slides left, sharding and proof-of-stakes. My two favorite things.
Sharding is the idea is that we have the same coins and the same consensus ,but we push it into different validator zones where transactions can happen that not everyone needs to verify. Sharding overcomes the challenge of “everyone needs to see everything”. Generally speaking, we only have one really widely accepted way of doing sharding and that’s a federated sidechain and some multisig quorum that signs off on what happens off-chain. If the federation decides to steal the money, the mainchain has no way of knowing if the federation is being honest or not.
There’s also SPV proofs, which are generally considered to be not that viable or too risky. I personally don’t think that miners should be trusted. Boiling any consensus down to an SPV proof makes me very nervous, I don’t think we should go down that road.
Another sharding proposal is plasma. I’m not comfortable with the security boundaries of plasma. Many of the plasma people will assert that the security model is identical to lightning which I don’t think is correct.
I’ve found that almost every sharding proposal can be broken by asking what happens if the shard does an information withholding attack. Just doing that analysis usually lets you figure out how to totally break the system.
Q: What about gmaxwell’s coin witness proposal? Use a STARK to prove the state of this sidechain without asking miners to vote or something like that.
A: There’s a lot of technologies that like– fundamental tech steps that would need to be done.
A: The challenge on that one is not just information withholding, but double spending. If you have multiple proofs that are all valid, and you see multiple histories that transpired. How do we pick which one transpired? If you combine sharding with STARKs, maybe you find some wiggle room to make it work.
Proof-of-stake is incredibly popular in the altcoin world. It’s almost unconditionally popular. It falls back to traditional byzantine fault tolerance models. In proof-of-stake, you see 51% assumptions. Normally things fall away from incentive compatibility. I’m just going to say there’s many issues.
In 2012 and 2013 a lot of bitcoin developers did a deeper dive into proof-of-stake and identified a bunch of issues that transcend any particular issues. Like holding on to your keys after you spend your coins, that’s a fundamental issue. Andrew wrote two papers about the fundamental challenges with proof-of-stake. I think it’s fair to say since 2013 none of those issues have been addressed by proof-of-stake proposals.
Unfortunately, there are so many ways to approach proof-of-stake paradigms that it’s hard to talk through because once you dismantle any implementation everyone is like “well yeah that was the broken one”. At some point, I hope there’s a proof-of-stake implementation that everyone likes and we can attack it directly. But it’s just too much.
For scalability, you can have independent blockchains like bitcoin and litecoin. You don’t need sharding or anything. As a dogecoiner, I don’t need to care about what happens on the bitcoin blockchain. If we use cross-chain lightning, I receive doge not bitcoin, and that’s scalability. We can have on-chain on multiple chains, and then each chain has a separate security domain and a separate chain and their own problems.