Half Signature Aggregation: What is it and how does it help scale Bitcoin?
Speakers: Jonas Nick, Tim Ruffing
Date: August 2, 2022
Transcript By: Stephan Livera
Media: https://www.youtube.com/watch?v=sUpITup2igE
podcast: https://stephanlivera.com/episode/400/
Stephan Livera: Jonas and Tim, welcome back to the show. Great to chat with you guys again. And so we’re gonna chat about half signature aggregation and hear a little bit from you guys about what you’re working on and just get into what that means for Bitcoin. So Jonas, I know you probably will be best to give some background on some of this — I know you did a talk at Adopting Bitcoin in November of last year, so call it 7–8 months ago, talking about some of this at a high level. So do you want to just give us a bit of an overview here? Like, What is signature aggregation?
Jonas Nick [0:46]: Okay, so signature aggregation, as the name kind of suggests, is the process of taking individual signatures — for example, Schnorr signatures — and producing a single aggregate signature. So you could imagine that Alice signs a check and Bob signs a check. And with a signature aggregation scheme, they could instead have a single signature such that both checks are valid, instead of each check having its own signature. And in the realm of digital signatures, the purpose of this is that we reduce the size of the signature data that's being sent, because now instead of sending two signatures, you're only sending this single aggregate signature. And depending on what kind of aggregation scheme is being used, the aggregate signature is either constant in size or a bit larger, but at least not as large as all of the individual signatures combined.
Stephan Livera: Great. And Tim, do you want to add anything there in terms of your approach and work here?
Tim Ruffing [2:11]: Yeah, so I think one point that we could stress here is that — I mean, Jonas mentioned that the point of this is to reduce the size of the signature data, and that's very important in blockchain systems because space on the blockchain is so expensive. So we usually try to squeeze every single bit out of size optimizations just to reduce the amount of data that we have to store on the blockchain, because this is really the bottleneck in Bitcoin and in every blockchain system: a lot of data needs to be stored on the blockchain, but everybody needs to download and verify it. And this is probably a little bit more important than even optimizing for computation time! We really want to reduce the data that we need to store on the blockchain.
Jonas Nick [3:18]: And perhaps another point worth mentioning here is that people are often confused about how signature aggregation relates to our work on multisignatures and threshold signatures and MuSig, etc. So what exactly is the difference? And the main difference is that the messages being signed in signature aggregation are different, whereas the messages being signed in a multisignature scheme are not different — there's only a single message. So back to our example: if Alice and Bob were to sign the same check for whatever reason — a multisig check — then they could produce a single multisignature instead of having individual signatures, and this single multisignature would be smaller than the individual signatures. So that would be the application of multisignatures if there was only a single check. But in our example before, we had two different checks, and these are different messages, and therefore we cannot apply a multisignature scheme here. What we need is signature aggregation, because Alice and Bob actually sign different messages.
Stephan Livera: Yeah so I think this was a common confusion when people weren’t quite understanding what was going on with Taproot, because they weren’t understanding the difference between signature aggregation — just the concept — and then what you’re referring to here, which the term now people are using is this idea of cross-input signature aggregation, right?
Tim Ruffing [4:56]: Right — but that’s kind of another distinction.
Stephan Livera: That’s one specific idea, right?
Tim Ruffing [5:01]: Maybe let me again add one point to what Jonas said. He said: The difference for multisig is that everybody kind of signs the same thing. But going one level up — looking at this from a higher level — what kind of application does this relate to? And maybe that’s another useful distinction. For multisig, it’s usually when there is a cooperation between the signers, right? That’s the reason why they want to sign the same thing, because maybe they have a contract — some people might call it a smart contract, but whatever — so maybe they are in a Lightning channel, maybe it’s just an account that requires multiple signatures to spend to increase security, but there’s some interaction between those signers and they have a common goal! Whereas in signature aggregation, this is really just a technical optimization. So the signers really want to sign their own messages in an independent way, but it’s just making use of the fact that, later, we can aggregate some of the data and compress it — make it smaller — such that you can save space on the blockchain. But on a functional level: what they sign and how they behave and what they do is basically totally independent! There’s no interaction between the signers on the functionality level.
Stephan Livera: Yeah. And perhaps if you could explain for listeners: what’s the difference then between a non-interactive protocol and an interactive protocol? And what are some of the trade-offs of those two ideas?
Jonas Nick [6:50]: So right now I think the two aggregation schemes that we are looking at and that are known at this point are what we call half aggregation and what we call full aggregation. And these schemes have different features. So if we look at half aggregation first: half aggregation produces an aggregate signature that is half as large as the sum of the individual signatures. A BIP340 Schnorr signature is 64 bytes, and a half aggregate signature of a bunch of BIP340 signatures is not 64 bytes: if there are N signatures, it's basically N times 32 bytes, or N times 64 divided by 2. So that is the size of the aggregate signature. And what is also appealing about this scheme is that this whole process is non-interactive, which means that in order to produce an aggregate signature, there needs to be no coordination between the aggregator, the signers, or the verifiers. Instead, aggregation really is a pure function of the individual signatures. So you have a function that takes the individual Schnorr signatures and outputs a half aggregate signature.
Tim Ruffing [8:26]: By the way, this again shows how there is no business relationship between the signers, right? What Jonas describes is that: each signer outputs a signature as they do normally, and then everybody can get those two signatures and aggregate them and they don’t even need to talk to the signers — they don’t even need to be aware of that happening. So it’s really a process then, or a function, that everybody can compute — just take the two signatures from the signers who totally may not be aware that this is happening, and then compress them into a single aggregate signature.
Stephan Livera: Yeah, so I presume then that we would say that has some advantage or benefit in that we don’t have to deal — well, on the Internet there could be malicious actors or people who are maybe honest but not doing it correctly, and there may be situations where somebody is DDoSing somebody else. And so I guess that’s one of the benefits there in this non-interactive case, because then you don’t have to try to interact with people, and now that means you don’t have to deal with getting DDoSed or [deal with] some kind of malicious party. So from that point of view, it’s better.
Tim Ruffing [9:37]: Or, going a step back: it's just simpler on every level, because it's just a single operation that you can run on your own machine without — of course, you need to receive the signatures from somewhere, right? But when we say it's non-interactive, we mean: the other parties don't need to be online at the same time. The signers are already gone, you get the signatures from somewhere, and then you can just run this process. And because you don't need to talk to other people and have a network connection to other people, this makes things so much easier on every level of engineering. Also security — what you mentioned: if you don't have a network connection, nobody can run a DDoS attack on that network connection, for example. But yeah, it's really just — from an engineering point of view — so much easier! Another point is: when you have connections — in particular to multiple people — usually you need some kind of meeting point. For example, imagine a group chat: you need some server where that chat takes place because we all need to be connected to the same central point. You could also have peer-to-peer connections to each other, but that's even more complicated, so we usually want to have some kind of untrusted point in the middle that we all connect to just to simplify establishing the connections. And then we need to agree on a place to meet — and well, if you don't need that interaction in the first place, we also get rid of that problem! We don't need a place where we can meet.
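To make the "pure function, no coordination" point concrete, here is a minimal Python sketch of the half-aggregation idea. It uses a toy Schnorr group (not secp256k1 or BIP340, and with insecure toy parameters chosen only for readability), and the function names are illustrative rather than any real API: each signer produces an ordinary signature on their own message, and anyone can later combine the signatures without contacting the signers.

```python
# Minimal half-aggregation sketch (toy parameters, NOT BIP340, NOT secure).
# Group: the order-q subgroup of Z_p^* with p = 2q + 1; illustration only.
import hashlib
import secrets

p, q, g = 2039, 1019, 4  # tiny toy parameters; real schemes use secp256k1


def H(*parts) -> int:
    data = b"".join(str(part).encode() for part in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q


def keygen():
    x = secrets.randbelow(q - 1) + 1
    return x, pow(g, x, p)                      # secret key, public key


def sign(x, X, m):
    k = secrets.randbelow(q - 1) + 1
    R = pow(g, k, p)
    e = H(R, X, m)
    return R, (k + e * x) % q                   # ordinary Schnorr signature (R, s)


def verify(X, m, sig):
    R, s = sig
    return pow(g, s, p) == (R * pow(X, H(R, X, m), p)) % p


def half_aggregate(triples):
    """Pure function of individual (pubkey, message, signature) triples:
    no interaction with the signers is needed."""
    ctx = [(R, X, m) for (X, m, (R, _s)) in triples]
    s_agg = 0
    for i, (_X, _m, (_R, s)) in enumerate(triples):
        z = H("agg", ctx, i)                    # per-signature randomizer
        s_agg = (s_agg + z * s) % q
    # keep every nonce R_i, but only one combined s value:
    # roughly half the size of n separate signatures
    return [R for (R, _X, _m) in ctx], s_agg


def verify_half_aggregate(pairs, agg_sig):
    Rs, s_agg = agg_sig
    ctx = [(R, X, m) for R, (X, m) in zip(Rs, pairs)]
    rhs = 1
    for i, (R, (X, m)) in enumerate(zip(Rs, pairs)):
        z = H("agg", ctx, i)
        e = H(R, X, m)
        rhs = rhs * pow((R * pow(X, e, p)) % p, z, p) % p
    return pow(g, s_agg, p) == rhs


# Alice and Bob each sign their own "check"; anyone can aggregate afterwards.
xa, Xa = keygen()
xb, Xb = keygen()
sig_a = sign(xa, Xa, "check 1")
sig_b = sign(xb, Xb, "check 2")
agg = half_aggregate([(Xa, "check 1", sig_a), (Xb, "check 2", sig_b)])
assert verify(Xa, "check 1", sig_a) and verify(Xb, "check 2", sig_b)
assert verify_half_aggregate([(Xa, "check 1"), (Xb, "check 2")], agg)
```

The final assertion checks that the aggregate, built purely from Alice's and Bob's published signatures, verifies against both messages at once while carrying only one s value instead of two.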
Stephan Livera: So could you also now spell out the difference for us with half aggregation and what we’re calling full aggregation?
Tim Ruffing [11:32]: Yeah, sorry Jonas — go ahead. I talked a lot about non-interactivity but let’s talk about the interactive thing.
Jonas Nick [11:38]: Okay, so the interactive thing is full aggregation. We call this full aggregation because the output of this aggregation scheme is actually a signature that is as large as a BIP340 Schnorr signature — so 64 bytes, independent of how many signers there are or how many messages there are. So that sounds quite attractive compared to half aggregation, but it has trade-offs. And the exact trade-offs are not really known! We're speculating at this point, because this is still an active area of research. But so far, in terms of interactivity and number of rounds and things like that, this scheme seems to be very similar to MuSig2. And this means that, for example, it requires two rounds to produce a signature. So I think we were here [on this podcast] almost 2 years ago and talked about MuSig2 and all the different aspects of that and what it means to have multiple rounds. And for full aggregation — at least as far as we know — we would have similar properties: there's a three-round scheme which would be full aggregation, and then we could use similar techniques as we used in MuSig2 to get rid of one round so we only have two rounds. But we also have this problem that we discussed earlier that, for example, the signers need to be online, they need to keep their secret keys somehow, they need to maintain state securely — state that shouldn't be subject to a replay attack or even modification by an attacker. So from an engineering perspective, it's quite a bit more complicated, but what we get in the end is a much smaller signature than if we applied half aggregation.
Tim Ruffing [13:59]: And when you say that it has two rounds, what we really mean here is two rounds of interaction. So one round of interaction basically is: each of the signers is sending a message to everybody else. And then they process the message and then they see they need to send another message to everybody — so this is the second round.
Stephan Livera: Yeah, and that would significantly complicate things.
Tim Ruffing [14:22]: And when we say round, we really mean communication round or round of interaction.
Stephan Livera: Yeah. And so let’s try to talk about some of the practical ideas of how this would work, because as I’ve seen in some of your blog posts and some of your videos I was looking [at], it seems that there’s two main ideas: 1) is this idea of block-wide signature aggregation, and then the other one is 2) the L2 gossip protocols. Like, so for example: in the Lightning Network, the nodes gossip around some information to each other. So could we just talk a little bit about some of those applications? Maybe starting with this idea of the block-wide signature aggregation? And let’s try to put that into context for listeners who are thinking, What does that mean for me? So could you just explain a little bit about those applications?
Tim Ruffing [15:07]: Maybe I can try the block-wide aggregation and Jonas can talk about Lightning because he knows better. So yeah, so far we just basically explained what the basic primitive is, right? What signature aggregation on a technical level can do, but we haven't talked about the applications — how can we make use of that in real protocols? And maybe the most obvious thing, something I already hinted at, is that we could save space on the blockchain. And the idea is that, because half aggregation is really a non-interactive process and everybody could do it, when you sign a transaction you will still broadcast it to the peer-to-peer network, so it would be relayed around the network and gossiped among all nodes. And at some point a miner will learn all of these transactions and want to put them in a block, and now we could assign the miner the task of aggregating those signatures before putting them into a block. So the miner would collect all of the signatures he would want to put in the block, and the miner would also be incentivized to aggregate them, because this means they need less space in the block, so he can fit more transactions in the block, which means more transaction fees for the miner. So the miner could be given the opportunity, or maybe even be required, to aggregate the signatures, depending on how we would implement this. And so this would just mean that, in the block, all Schnorr signatures — because we can't do this for ECDSA, just for Schnorr signatures — would be half aggregated, and this saves a lot of space in the block so we can fit more transactions in the block. And what I just explained is just the basic idea, and this has a bunch of open questions in terms of: how would we engineer that? Would this maybe disable certain features that we also want to have? And so on. But this is basically the simple idea: miners would take all the transactions — instead of putting them in the block just as they are — and aggregate them before adding them to the block.
Stephan Livera: Okay, so the idea would be: all the people in Bitcoin are transacting and [there's] 2,000 transactions in the block or maybe 3,000 — something in that range — and then miners, before they put that block out and that block is confirmed, they're taking all the signatures and then they're doing this mathematical cryptographic operation to do the half aggregation, as we said, in a non-interactive way. Because if I'm the miner, I can't contact every person who's transacting — I'm just using my non-interactive protocol to half aggregate, in this example, and is that giving us a 15%-20% saving [in block space]? Like, if you were just giving a ballpark on the assumption that everybody in that block is doing half aggregation, what kind of savings are we talking about here?
Tim Ruffing: Well Jonas has a nice table, I think.
Jonas Nick [18:42]: We need to distinguish between two things: one thing is the size in terms of bytes that your transaction takes in a block, and the other is the weight of the transaction, which is a concept that was introduced by Segregated Witness. The weight is different from the byte size because the witness data of a transaction has less weight than all the other parts of the transaction, which means that the signature is already discounted in terms of how much you need to pay for it, because transaction fees depend on the weight and not on the actual bytes that a transaction takes in the block. So if we look at how much this could save in terms of bytes for an average transaction, it's on the order of 20%, and in terms of weight it's more on the order of 10%. So you might pay 10% less if your transaction is able to be half aggregated — for an average transaction. For some other transactions, you may have different numbers here.
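For a rough feel for where those percentages come from, here is a back-of-the-envelope sketch. The encoded sizes (41-byte inputs, 43-byte P2TR outputs, 66-byte key-path witnesses) are assumed typical values for a taproot key-path transaction, and half aggregation is modelled as keeping roughly 32 bytes per signature with the shared s value amortized over the block; the exact figures depend on the transaction shape.

```python
# Back-of-the-envelope size/weight arithmetic (assumed, illustrative sizes
# for a taproot key-path transaction; actual savings vary by shape).

def tx_size_weight(n_in, n_out, sig_bytes_per_input):
    # non-witness bytes: version + in/out counts + inputs + outputs + locktime
    base = 4 + 1 + n_in * 41 + 1 + n_out * 43 + 4
    # witness bytes: segwit marker/flag + per-input (item count, length, sig)
    wit = 2 + n_in * (2 + sig_bytes_per_input)
    size = base + wit
    weight = 4 * base + wit            # witness bytes are discounted relative to the rest
    return size, weight

full_size, full_weight = tx_size_weight(2, 2, 64)   # one 64-byte BIP340 sig per input
half_size, half_weight = tx_size_weight(2, 2, 32)   # half aggregation keeps ~32 bytes
                                                     # (the nonce) per input; the single
                                                     # shared s value is amortized over the block

print(f"bytes saved:  {1 - half_size / full_size:.0%}")     # about 21% for this shape
print(f"weight saved: {1 - half_weight / full_weight:.0%}")  # about 8% for this shape
```

With two inputs and two outputs this prints savings of roughly 21% in bytes and 8% in weight, in the same ballpark as the figures mentioned above.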
Stephan Livera: Yeah, right. And so I’m curious then if, as an example, a transaction has many inputs, then is the saving bigger in that case? Or it doesn’t matter?
Jonas Nick [20:26]: If it has many inputs, the saving is bigger.
Stephan Livera: Yeah, so as an example: for those people who are thinking about CoinJoin and things like that, presumably then that might help increase the saving a little bit, at least relative to current state — that they may get a slightly larger saving than even that 10%-20%, depending on if we’re counting by transaction weight or bytes.
Tim Ruffing [20:56]: If you're talking about half aggregation, of course you can take the perspective of a single signer that wants to transact and makes a transaction that has multiple inputs. But when we talk about this block-wide aggregation, maybe that's kind of the wrong perspective, because the person who sends the transaction wouldn't decide whether half aggregation takes place. It would be the miner in the end that just aggregates them. So the numbers are right: you save 10% of weight units per transaction, but it's kind of difficult to say that there's more incentive to do this for a transaction with more inputs, because it wouldn't be the guy who sends the transaction — it would be the miner that just decides to aggregate everything.
Stephan Livera: I see. You’re right! I guess I was just talking on the assumption that we can get more adoption, and I think that’s maybe the same way people even talk about SegWit adoption.
Tim Ruffing [22:15]: Oh right — that’s a point I actually missed now: so you’re perfectly right! Again, when we talk about these aggregation schemes, we’re talking about [them] only applied to Schnorr signatures. So in that sense you’re perfectly right: so this would be an incentive for everybody to switch to Taproot outputs and Schnorr outputs and so on because then when they transact they would need less data and maybe pay less fees.
Stephan Livera: Right, and this is the same kind of argument as when SegWit was just being adopted, or now with Taproot just being adopted — at the start there aren't that many people using Taproot, but hopefully over time everyone moves over to using the new, better stuff. But that's hypothetical, right? And then even in that world, let's say we're doing half aggregation and we're talking about blockwide aggregation by miners: it also would be a question of how many people would use Taproot and how many miners would use half aggregation, because maybe there would be some people who are just operating on the old ECDSA stuff, using legacy or SegWit-without-Taproot transactions, right?
Tim Ruffing [23:46]: So if we would add blockwide half aggregation — just hypothetically — it would be another incentive to use Taproot and Schnorr signatures, which I think is a good thing. But of course, we also don’t want to force anybody to switch to the new things.
Stephan Livera: Yeah. And to be clear: what we’re talking about here would be a soft fork, right?
Tim Ruffing: I think, yes — I think Jonas knows more also about the trade-offs.
Jonas Nick [24:16]: Yep. So I mean there's no complete proposal for how to do this, like in a Bitcoin Improvement Proposal or something like that, so there's a lot of speculation still. But the idea is, of course, to make it a soft fork! It wouldn't be as attractive if it required a hard fork.
Stephan Livera: Yeah. And I think you've mentioned as well that there is one downside, potentially, around adaptor signatures, and that those might be being used for things like submarine swaps — maybe even today, with people doing atomic swaps between Lightning and on-chain. Would that be potentially impacted in this case?
Tim Ruffing [25:07]: Right, so there is a relation to adaptor signatures. Adaptor signatures are another cryptographic technique that is very interesting to use in Bitcoin and that is now possible with Schnorr signatures — it was also possible with ECDSA signatures, and some people use that — and it really saves space, again, and increases privacy for some particular kinds of contracts. For example, you mentioned submarine swaps. It's typically applicable to all kinds of swaps, because the basic pattern is that one party makes a transaction and then that party is forced to reveal some data, and this data enables another party to do a transaction — this really looks like a swap. So if I take the money from you, then I have to reveal the data that allows you to take the money from me, and then we get a swap. This is the very high level idea of adaptor signatures. And adaptor signatures have this aspect of revealing data: when I do a transaction, I need to reveal a particular piece of data, and I reveal that piece of data in the signature. And now this is exactly the point where it gets complicated with half aggregation, because if we take that signature and half aggregate it with a lot of other signatures, we might destroy that particular piece of information that the counterparty needs to complete the swap! And so our current thinking is that maybe it can still be done if we restrict half aggregation to only Taproot key spends — so really, the top-level key-path spends. Some people might know what this is, some people may not, but in the end the idea is just that you restrict it: you can't half aggregate every signature on the blockchain, only some signatures on the blockchain. And then if you really need to rely on this adaptor signature idea, you could make sure that your signature can't be aggregated, and this would usually solve the problem — at least this is what we hope, but we need to look into this, or there's more research necessary.
Jonas Nick [27:49]: Yeah, and this is especially the case because generally, for adaptor signature protocols, the adaptor signature isn't even used in the cooperative case. So if both parties are cooperative, they can still do Taproot key spends or whatever. Only if there is some dispute would they need to opt out of this half aggregation mechanism and use the adaptor signature.
Tim Ruffing [28:14]: Right. So when I said we have some places on the blockchain where you can’t aggregate and some where you can aggregate, it looks like we’re giving up on some of the savings, but we’re mostly not! So as long as both parties in the swap or whatever smart contract you have still cooperate and agree on what they want to do, they can still get all the optimizations. And only if one party is maybe malicious or goes offline — then we really need to perform more work! Then we would run into a case where we can’t do half aggregation, but maybe that’s really the rare exception.
Jonas Nick [28:49]: Yeah, but to be honest, if that is actually the case for any adaptor signature protocol that you could think of, I think that is still an open question.
Tim Ruffing [29:00]: Right, yeah — that’s what I meant. So we looked at some basic swaps and it seems to work out for this, but it’s hard to generalize this to every protocol you could think of. I guess there is always a trade-off. One problematic thing here is that we don’t really know what people use at the moment because we can’t see these adaptor signatures on the blockchain because they look like normal signatures. So we are aware of some people using this, but maybe there is some secret group of people somewhere in the world who uses this for very fancy stuff and they talk to nobody about it and then we would maybe destroy their —
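To make the "revealing data in the signature" mechanism above concrete, here is a toy continuation of the earlier Python sketch (same toy group and helper functions, purely illustrative and not the exact protocol of any real wallet). The point is that the counterparty learns the secret t only by subtracting the pre-signature from the s value that appears on-chain, which no longer works if that s value is folded into a half-aggregated combination.

```python
# Toy adaptor-signature flow on top of the earlier sketch (same toy group and
# helpers; purely illustrative, not the exact protocol of any real wallet).
import secrets

t = secrets.randbelow(q - 1) + 1      # the secret whose revelation the swap hinges on
T = pow(g, t, p)                       # adaptor point, shared between the parties up front

x, X = keygen()
k = secrets.randbelow(q - 1) + 1
R = (pow(g, k, p) * T) % p             # nonce commits to the adaptor point
e = H(R, X, "swap tx")
s_pre = (k + e * x) % q                # pre-signature: not yet valid for nonce R on its own
s = (s_pre + t) % q                    # adding t completes it into a valid signature

assert verify(X, "swap tx", (R, s))    # this completed signature is what hits the chain
assert (s - s_pre) % q == t            # whoever holds s_pre reads t straight off the chain

# If the block only carried a half-aggregated combination of many s values,
# this individual s (and therefore t) could no longer be recovered from the
# chain, which is why such spends may need to opt out of aggregation.
```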
Stephan Livera: You’re gonna destroy their use case or their business model or something. But we should at least talk about what are the key benefits. So in your view, what do you see as the key benefits of using half aggregation? What’s the main benefit?
Tim Ruffing [30:04]: The main benefit is really that you save space, and saving space on the blockchain means you can fit more transactions in a block, and this gives you layer one scalability. It's not much, but as always, we improve a little bit on layer one — and it's the lowest layer. I think we should really try to squeeze every bit out of layer one.
Stephan Livera: Anything to add from your point of view, Jonas? Or you basically agree there?
Jonas Nick [30:35]: I fully agree there.
Stephan Livera: Okay. So yeah, like you’re saying, it mostly helps around scalability, it might potentially help a little bit around privacy — maybe that’s also a bit of an open question as well, depending on how things happen. And we should also talk a little bit about the other application that we mentioned as well: the L2 gossip protocol. So Jonas, do you want to tell us how half aggregation could help there?
Jonas Nick [31:04]: Yep. So generally, in terms of applications in the Bitcoin space, you can make a distinction between applications that affect the consensus layer — we talked about blockwide aggregation, but there are other things: cross-input aggregation that only affects transactions, or aggregation within a script opcode, whatever — but there are also applications outside of the consensus layer that affect, let's say, layer twos. And one example of this is the Lightning gossip network. In the Lightning gossip network, to be a participant, you need to open a channel. And if it's a public channel, then you send out a channel announcement message, and this channel announcement contains signatures. Right now these are ECDSA signatures, but they could be upgraded to Schnorr signatures. And then, in this network that generally gossips these channel announcements around, there could be nodes that collect channel announcements and their signatures, put them in a batch, compress the signatures using half aggregation, and send out the batch to the next party. Because, right now, what happens is: you receive a bunch of channel announcements and then you just send them out. Whereas if we were able to apply half aggregation there, then nodes in the network could just take channel announcement messages, compress the signatures, and then send out a batch that is smaller than the individual channel announcements. And channel announcements are, compared to the other data that is sent around, quite large. And the signature data within the channel announcements is pretty large as well. So I think you could get relatively significant savings here by applying these techniques. But to be honest, I don't know how much this would affect user experience — it would certainly affect the efficiency of this network! And here again, we're making use of the fact that half aggregation is non-interactive, so the nodes on the network can just do the aggregation without needing any cooperation from anyone else.
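As a rough illustration of the potential gossip savings, assuming the four 64-byte signatures that today's channel_announcement message carries were hypothetically moved to Schnorr and half aggregated across a forwarded batch, the signature bytes would shrink roughly as follows (illustrative arithmetic, not a concrete proposal):

```python
# Illustrative arithmetic for batching channel_announcement signatures
# (assumes the four 64-byte signatures in today's BOLT 7 channel_announcement,
# hypothetically upgraded to Schnorr; not a concrete proposal).

SIGS_PER_ANNOUNCEMENT = 4      # node_signature_1/2 and bitcoin_signature_1/2
BATCH = 100                    # announcements forwarded together as one batch

separate_sig_bytes = BATCH * SIGS_PER_ANNOUNCEMENT * 64
# half aggregation keeps one 32-byte nonce per signature plus a single
# 32-byte s value for the whole batch
aggregated_sig_bytes = BATCH * SIGS_PER_ANNOUNCEMENT * 32 + 32

print(separate_sig_bytes, aggregated_sig_bytes)   # 25600 vs 12832 signature bytes
```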
Stephan Livera: Interesting. And so, hypothetically: if people in Lightning are already used to interactivity, is that something that means maybe even full aggregation is better or even more savings in that case?
Jonas Nick [33:51]: Okay, that's an interesting question! But I think in terms of engineering complexity, that would be pretty hard, because if you have multiple nodes that want to open channels, they somehow need to cooperate to create these channels and then create a single signature for all their channel announcement messages. It's not impossible, but then you need to hold back your channel announcement until other people are there who want to sign a message with you. So I think it's possible, but I'm not sure if it is applicable in this case.
Tim Ruffing [34:30]: It looks a little bit like a chicken and egg problem, because you’re trying to optimize the process of opening a channel, which is basically the process of starting [the] interaction. So at this point we don’t have interaction yet, so if you need more interaction to create interaction, maybe that’s not the best approach. But I haven’t deeply thought about it.
Stephan Livera: I see, yeah. Because I can imagine — like, let’s say we’re all running our Lightning nodes and I get your node pubkey and then my node connects to yours on the Lightning peer-to-peer layer, and then maybe we’re exchanging some information. Maybe that is an example? I don’t know — people are talking about ideas like batch channel opening as well in Lightning, so that might be kind of an interesting idea. Like, let’s say I want to open a channel with both you, Jonas, and with you, Tim, and maybe even another two or three people, and my node talks to all of your nodes and we collectively or collaboratively make this batch open. Maybe that’s kind of an idea to get a big saving by one transaction opening five channels in an interactive way? So maybe that’s a future direction that Lightning Network may go. And that’s just one example, right? We’re just talking about Lightning — maybe other protocols may find applications and uses there, also.
Jonas Nick [35:51]: Yep, exactly. So if you already do batch opens, then you already require some kind of cooperation between the nodes. So in that case I think something like full aggregation at least seems like an area to explore.
Stephan Livera: Yeah. And so just to talk about the difference then in cost saving between half aggregation and full aggregation. So could you just spell out for us how much that cost saving is? Because it sounds to me like, from what you're saying, instead of 64 times N, does that mean everything is just one signature for the whole thing — like, no matter how many people?
Jonas Nick [36:35]: Yep, correct: no matter how many people — that’s correct. But of course, you cannot do full aggregation for a whole block — or at least it would change Bitcoin in quite a significant way, because then all the signers that spend coins or UTXOs in a block need to cooperate to create this single blockwide full aggregated signature, and that seems to change or modify Bitcoin quite a bit. Therefore, this full aggregation scheme is mostly proposed to do cross-input aggregation within a transaction, because users who create a transaction are either just a single entity that perhaps controls multiple wallets or whatever, or they’re doing a CoinJoin with other parties. But in either case, there’s already interactivity to create a transaction because you first need to agree on the transaction, then you need to sign the transaction. So you could also add this process of full aggregation to this, and then you would only have — in the best case — a 64 byte signature for a whole transaction. Now for a typical transaction — not a CoinJoin transaction — how much does this save? This saves a bit more than half aggregation. So in terms of bytes this would save 25%, and in terms of weight units — since it’s only about a single transaction — it saves 10%.
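Using the same kind of back-of-the-envelope model as before (assumed taproot key-path sizes, an "average" shape of about 2.3 inputs and 2 outputs, and a guess that a cross-input aggregate leaves one 64-byte signature plus empty witnesses, since the real encoding is not specified yet), the numbers come out roughly in the range mentioned:

```python
# Rough cross-input (full) aggregation arithmetic, reusing the same assumed
# taproot key-path sizes as the earlier sketch; the real on-chain encoding of
# a cross-input aggregate is not specified yet, so this is only a guess.

def tx_size_weight(n_in, n_out, witness_bytes):
    base = 4 + 1 + n_in * 41 + 1 + n_out * 43 + 4   # non-witness bytes
    wit = 2 + witness_bytes                          # segwit marker/flag + witness data
    return base + wit, 4 * base + wit                # (size, weight)

n_in, n_out = 2.3, 2                                 # "average" transaction shape
sep_size, sep_weight = tx_size_weight(n_in, n_out, n_in * 66)        # one 64-byte sig per input
agg_size, agg_weight = tx_size_weight(n_in, n_out, 66 + (n_in - 1))  # one sig for the whole tx,
                                                                     # empty witnesses elsewhere

print(f"bytes saved:  {1 - agg_size / sep_size:.0%}")      # prints about 25% for this shape
print(f"weight saved: {1 - agg_weight / sep_weight:.0%}")  # prints about 9% for this shape
```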
Stephan Livera: Interesting. So just to clarify one idea: could we someday — like, imagine there’s been further research and we have established protocols and ideas around how half aggregation works and how full aggregation works. Could we actually live in a world where, hypothetically, there might be a Bitcoin block and it’s just got a combination: it’s got like some old school ECDSA stuff, it’s got some stuff that’s half aggregated, and some transactions that are full aggregated. Is that theoretically possible?
Tim Ruffing [38:50]: Yeah, I think this would be possible. Still, it would require even more research I think, because so far, for example, for half aggregation we are sure that we can do it with real Schnorr signatures — you can take all of your Schnorr signatures and half aggregate them. I think so far — correct me if I'm wrong, Jonas — we haven't looked at taking Schnorr signatures together with fully aggregated signatures and then, again, half aggregating them, which is what you just mentioned. But if you ask me, this should be possible! Of course, this requires research, but I don't see a fundamental reason why this shouldn't be possible. And the world that you imagine here — I think this is really attractive, because then you could save, as Jonas mentioned, whenever there is already cooperation between the signers because there is a transaction with multiple inputs. We talked about cross-input signature aggregation, right? So what does it actually mean? We have a transaction with multiple inputs, and the multiple inputs maybe just come from the same signer — this is very simple, because then in the end there's no real interaction at all. It's easy for the signer to talk to themselves! Or they are indeed different entities that control the different inputs, but they already have some way of cooperating, because they anyway want to do this single transaction with multiple inputs — maybe that's a CoinJoin, maybe it's another smart contract, a Lightning channel, or something like this. So if we already have this need for cooperation and this need for interaction, then we could just use full aggregation and this would save on the transaction level. And then for all other cases where we don't have this interaction, the miner could do the rest of the aggregation using half aggregation and save even more there.
Stephan Livera: Yeah, that’s really fascinating, because when we’re talking and thinking about how it might look if more and more people were to want to use Bitcoin non-custodially — which obviously is the more self-sovereign form — part of that is like, How does everyone have their own Lightning channel? Because we know the number right now in terms of how many UTXOs exist is something around 84 million, so we know that’s kind of like an upper bound of how many people are actually self-custodying. Now yes, there’s certain caveats: of course we know one Coinbase UTXO probably has like millions of users behind it or whatever, or there may be cases like that. But if we want more and more people to be able to use Lightning, then having some way of batch opening and batch transacting helps. And it’s not just a matter of opening the channels — it’s about people being able to maintain things. Like, the channels might get exhausted or extinguished so they have to refill it in some way. And so some of these techniques may be useful! Even if it’s not being used directly by the user, it could be being used by, let’s say, the Lightning service provider, the swap provider, the channel partner. There may be people doing these aggregations to save some space and get more efficiency.
Jonas Nick [42:26]: Yep, exactly. And especially in a high fee environment, I think there would be quite a lot of pressure from users of wallets to implement these things, because if you have the choice between a wallet that is 10% cheaper than another wallet, I think that speaks for the wallet where you get the 10% savings quite a bit.
Stephan Livera: Yeah. And so the other question — well I’m not sure how much you’re probably collaborating with some of the Lightning protocol developers: are they also looking at MuSig2 and things like this? I mean, that’s what I’ve seen.
Jonas Nick [43:15]: Yeah so MuSig2 — again, a bit different topic. Also in a very different stage of development, I would say, because the research is done — kind of. I mean, there are always related problems — problems related to MuSig2 that still need to be solved, but we’ve been working on a Bitcoin Improvement Proposal for quite a while now that is slowly getting into the stage where it becomes stable. And people have been implementing this and playing around with it. Some use it in production already, and the Lightning people are interested in standardizing new ways to open channels in a way where MuSig2 is used. And this is something that is happening right now. But besides these standardization efforts, there have also been Lightning developers who have implemented MuSig2, and try to also provide this to their users, hidden behind an experimental flag.
Stephan Livera: Gotcha. So yeah I was just asking out of curiosity. So back to the half aggregation stuff: what’s the current state of where this is at? And what’s needed for half aggregation to proceed?
Tim Ruffing [44:50]: At the moment we are pretty early in this — although it's basically done from a research point of view. There's a paper — not by us, but by great people — who proved it to be secure in a mathematical sense, so we're pretty convinced that this is a thing we could safely do.
Jonas Nick: Just to be clear: Tim’s talking about half aggregation, right?
Tim Ruffing [45:16]: Right — about half aggregation. Yeah, for full aggregation, as Jonas said, we still need more research work. For half aggregation, Jonas authored the Bitcoin Improvement Proposal draft that contains the basic scheme, the basic algorithm, for people to look at. So far we haven't received a lot of feedback on this, as far as I know. And in terms of applications, this really just specifies the aggregation algorithm. Maybe it talks about the possible applications, but at the moment it doesn't propose any concrete application in the sense that it would propose a soft fork or propose to add this to Lightning. It's really just a specification for the aggregation scheme itself, for the cryptographic thing. So if we're talking about blockwide aggregation, we're in a super early stage, because there is not even a concrete proposal for this.
Jonas Nick [46:34]: Yeah, and the reason to produce this half aggregation BIP was also just to collect all the information that we have at this point about half aggregation and how it works, in detail, so people can play around with it. And also as a starting point for discussion on how to actually do blockwide aggregation, because if you don't have the cryptographic scheme specified, it's really hard to have a common understanding of what you're actually talking about.
Tim Ruffing [47:09]: Yeah, not even how we want to do this or even if we want to do this at all, right? I mean, there are also drawbacks. I think we should pursue this. So I believe in half aggregation, but for example: one drawback — besides the adaptor signature thing that we already talked about — is that, while it saves space, the computation time needed to verify a half aggregated signature is basically as expensive as verifying all the individual signatures. Now, you could say, Okay that’s actually not bad — we’re not making it worse! But you could argue we’re making it worse because now if we manage to fit more transactions and more signatures into a single block, and each of these signatures needs as much time to process as before, then the overall amount of computation time that verifiers need to spend to verify the validity of a block now increases. And this is usually something that we want to avoid, because we want to keep the verification cost low to be able to make sure that everybody can run a cheap node, which in turn increases decentralization and so on. So this is, for example, one trade-off that the community needs to talk about and think about.
Stephan Livera: Yeah, okay. So as I'm summarizing then: you're saying it may raise the cost to run a Bitcoin node, but probably not beyond the reach of an average western-world person with a normal computer. And what we're talking about, to be clear here, is the ability to sync that node from the start, even though the default in Bitcoin Core is assumevalid — so your node only actually verifies all the signatures from a certain point onwards. And then also the ongoing cost: as every new block comes in, now your node has to crunch through and do extra validation that it was not previously doing, but the benefit is we're getting more transactions for less, so it's a trade-off.
Tim Ruffing [49:38]: Right. I don't have the numbers with me to run them now, but I don't think it's a big increase in computation time, or something that is really an obstacle. But it is an increase, so there are always trade-offs and we need to talk about it. In the end, if you ask me now, my feeling is that it's worth doing because we get all the savings in terms of space, even though we pay a little bit more in verification cost. But I think it makes sense.
Stephan Livera: And then there's that other one around exploring the cost for adaptor signature use cases, and whether there's some secret adaptor signature use case out there that we don't know about, with that person or group of people losing their use case somehow.
Jonas Nick [50:29]: Yeah that’s another point. And then there are questions on general soft fork mechanics, of course, that we probably don’t want to get into. But it’s generally hard to find consensus on these proposals and soft fork proposals, as we know, so that’s something to consider as well. So I don’t think at this point that it’s for certain that this will land in Bitcoin, for example.
Tim Ruffing [50:58]: And also: even if there’s agreement in the community that we want to have this, it’s also a question of priorities, right? Doing soft forks is complicated! It takes time, and maybe there are other features that people want to see first before they consider half aggregation, and maybe this would push it even further down the road.
Jonas Nick [51:24]: As I view it, we've been mostly talking about blockwide half aggregation right now, but there are other applications of half aggregation in consensus as well. So for example, as I mentioned briefly, there could be opcodes — Bitcoin script opcodes — that take multiple signatures. And here you could also apply half aggregation! And just within that single opcode, this would reduce the complexity and how to think about it quite a bit, because now you only have to think about the single execution of this opcode. So for example: one opcode that was proposed is called OP_EVICT, and it's used in payment pools to throw out users if they misbehave or something like that. And here you would actually have multiple signatures, so you could half aggregate them and just save space by doing that. Of course, the savings won't be as great as with blockwide aggregation, but there are some savings to be had in a much more constrained environment.
Stephan Livera: I see. So it could apply in some of these other ideas for scaling and things. And maybe the people interested in privacy might see some benefit in this also? So maybe the people focused on privacy might see some reason to push for this change as opposed to other ones. I suppose those are probably the main ones. So basically it’s just scalability and maybe some privacy benefit are the main aspects.
Jonas Nick [53:14]: Just to give some numbers: for example, now back to cross-input signature aggregation — just within a transaction. So if you would have a single fully aggregated signature in a transaction and you create an infinitely large CoinJoin — or just a large CoinJoin, I mean it approaches that number — then you would save 40% in terms of bytes and 15% in terms of weight units, compared to doing a regular transaction. So it is significant, I would say!
Stephan Livera: When you say regular, are we talking about regular SegWit? Like, just a standard SegWit spend?
Jonas Nick [53:56]: Yes, a standard SegWit transaction with like two outputs: one to the merchant or whatever and one change output, with 2.3 inputs on average.
Tim Ruffing [54:11]: Yeah, this is an interesting thing to point out: this is only true for full aggregation, but full aggregation would incentivize you to do CoinJoins, because now you can save transaction fees. Because, while running the CoinJoin, you anyway need some interaction, so you would then also do the full signature aggregation, and the resulting transaction would be more compact. And in particular, for large CoinJoins, the effect is larger. So there would be an incentive to do CoinJoins, which is a pretty nice thing.
Stephan Livera: Yep. And so it would just depend on which model is used, what wallet, and how it all works, and dealing with the interactivity part of that. And so maybe for some people they would say, Oh that’s too much effort for me! But other people might say, No actually it’s worth it for me — I’ll deal with the interactivity aspects of it because I’m getting a saving, and maybe there’s a bit of privacy involved there too.
Tim Ruffing [55:18]: It also gives you a good argument why you [should] do CoinJoins: because it saves fees — simple as that!
Jonas Nick: No malicious purpose required.
Tim Ruffing: Much less suspicion to do CoinJoins because there’s a good reason to do it.
Stephan Livera: Yeah and we shouldn’t see privacy as a bad thing anyway, so I’m not really worried about that part! I’m not saying you are, either. But I mean we close the door when we go to the toilet — it’s not a bad thing! People use SSL or they use VPNs — they use all kinds of things.
Tim Ruffing: Perfectly right, yeah. Maybe this is an interesting anecdote: I did a lot of research on privacy in Bitcoin earlier, and in particular on CoinJoin. And at some point, somebody on Twitter — I mean, I don't want to go into a political debate now, but — I claimed on Twitter that I think taxes are a good idea. We can talk about this, but this is not the point here. And then somebody asked me, Okay, but what do you think all your privacy technologies will be used for except tax evasion? And I said, Okay, but I mean people need privacy for various reasons and I don't even know them all, right? There are a thousand reasons why people want and need privacy! And this is really my opinion.
Stephan Livera: Okay, so let’s leave listeners with something to chew on or something to think about: what should they be thinking about when they’re thinking of half signature aggregation? Jonas?
Jonas Nick: The more technical listeners could think about adaptor signatures that are probably going to be broken by half aggregation, or something like that. The general Bitcoin users who listen to this podcast, they have a say in how Bitcoin develops and in what direction Bitcoin develops, whether we should accept certain soft forks or not. So people should try to stay informed about what is happening in this space and try to compare the different trade-offs, at least up to a certain level of detail. The proposals we talked about are still pretty far from getting into Bitcoin consensus, so we couldn't create a table right now that would compare all the differences and the pros and cons, but I think that might be something that's going to happen. I think we're starting to get a basis for discussion. We are doing the research for half aggregation and the specification, such that we get to the application phase and to thinking about how to add this to consensus. And I guess at some point we will have proposals, and then people should be ready to evaluate them.
Stephan Livera: And Tim any final points you want to mention?
Tim Ruffing: I think this was a very good summary!
Stephan Livera: Okay, fantastic. Well listeners, I’ll put all the links in the show notes. Make sure you follow Jonas and Tim on Twitter and check out some of the work. Obviously, all the links will be in the show notes. So thanks guys for joining me.
Jonas Nick: Cool thanks. Thank you.
Tim Ruffing: Yeah, thanks for having us.