Home < Bitcoin Core Dev Tech < Bitcoin Core Dev Tech 2018 (Mar) < Taproot, Graftroot, Etc

Taproot, Graftroot, Etc

Date: March 6, 2018

Transcript By: Bryan Bishop

Tags: Contract protocols, Taproot

https://twitter.com/kanzure/status/972468121046061056

Graftroot

The idea of graftroot is that in every contract there is a superset of people that can spend the money. This assumption is not always true but it’s almost always true. Say you want to lock up these coins for a year, without any conditionals to it, then it doesn’t work. But assume you have– pubkey recovery? No… pubkey recovery is inherently incompatible with any form of aggregation, and aggregation is far superior. You could, if you didn’t make it aggregate.

In graftroot, if all the participants agree, then they can just spend. So they can do pubkey aggregation on P. P represents the set of all the people who have the authority to spend the coins, with no other conditions, no hashlocks, no timelocks, etc.

Any time all the participants in this script are present at timing time, you use Sig(P, TX) no matter what the condition is. It’s only when those people want to delegate the right to sign to anything else. This is a replacement for MAST constructions. This is just 64-bytes regardless of how many branches or ways there are to spend the coins. It’s just 64 bytes, they sign off on it and say you can spend it this way or whatever. There’s some logistical overhead to remember all of those signatures or whatever.. .

There’s a batch aggregation trick. If you have multiple (R, S) signatures.. … sG is R plus hash of r1 let’s ignore the message for a second… something, it doesn’t really matter, l.. … do you know how batch validaiton works? Aggregating S values. I can add them up.

What you see here is… all these g coefficients are combined into a single one. Now what if instead of alpha being an actual random number, it’s a hash of all the inputs, except s. So you do a fiat-shamir like construction where I say instead of it being a random number chosen by the verifier, it’s actually a number that the signer knows about and is actually derived as a PRNG output seeded by the whole signature except s. And actually I think it’s enough to seed it with just the R values. Then what if you do, is instead of getting s1 and s2, you could just give you this value. The verifier now with just r1, r2, and s, it can validate this whole– where s is defined as this.. And the verifier still computes alpha, and it multiplies it here and here, and thus it has a single equation but it doesn’t need to be told about all the s values. So now the signature is 32 bytes smaller than otherwise. This is a non-interactive half aggregation scheme for Schnorr. It only aggregates the S values but not the r-values. But it means that graftroot only has 32 bytes extra not 64 bytes.

Both of the signatures here have their own R value, but there’s just a single s globally for the entire transaction. The public key is in the outputs. Graftroot is a signature using that public key and a script as the message, the script itself, and an input to the scripts. The input will have the pubkey without the hash— it doesn’t matter actually no.

This signature is really just an R value, and there’s a global all transaction S value… because this aggregation is non-interactive. I can take two valid signatures and not having their private key, combine them, because the alpha just depends on public data, it’s a value generated deterministically remember, so it can be done by anyone.

Doing this at the same time as the cross-input aggregation Schnorr stuff– they all combine very nicely. There’s a synergy between them. This gives you arbitrarily many script branches, for just 32 bytes.

Yes, I am explaining graftroot.

I want to prove that graftroot is secure. And taproot and Schnorr at the same time. You can’t add more things to aggregate in a soft-fork, so you should do this all at once. The problem is that you’ll have a transaction with signatures and then you have graftroot output with no script, and it wont have a public key inside its script, that signature can’t be aggregated with all the other ones, if … it’s in a soft-fork.. because, there’s no, aggregate signature with all of them, and not be able to validate it because it doesn’t see that public key. You’d have two aggregate signatures rather than one. The benefit is much smaller if you need to have a new aggregate signature for every soft-fork version. So I want them to get in all at once. I want merkle trees.

Back to graftroot

First I come up with my script. Then I make a random signature– a hash of a different… and then from that signature script pair, I compute the pubkey. And then I set that pubkey. I can show you that the– the preimage for this signature, so this is basically proving you that…. there’s no… that might be possible. you could do more than one.

….

Given two messages and a single fixed signature, find me a public key… Given two messages, find me one public key that signs both of them, unless you know the… I believe I have an informal proof that..

You should probably never send money to people who aren’t expecting to receive the money. But people probably will stil do it anyway.

There’s always a global signature part of the transaction. There is a signature somewhere in the transaction. It’s all aggregated into the signature. …

A graftroot where you can non-interactively send to someone’s multisig… to some arbitrary script for multisig with keys in it. You could do, but you do the direct tihng. I want a 2-of-3 multisig script and I want to do it without asking the people to sign anything or do any graftroot. You can’t. Are you sure? You can, with the recovery tricky, as long as there’s at least a single script. It might not work though…. this signature doesnt’ sign the txid, it only signs the scripts. It works. But only as long as there’s a single… and if you’re doing that, why aren’t you using taproot? It’s cheaper. Only reason why you’re doing it is because you want to graduate from taproot to graftroot… for fungibility reasons you.. Wouldn’t it be cool to merge it all though? Say I have a public key, and I have a script, and fixed upfront, and you can’t. You want to do graftroot. If P is fixed, then you can’t. But if you just want to send to a script, then you could use graftroot. You want to non-interactively send to a script… we want to non-interactively send to a script which has no relation to this pubkey that it has no relationship to, but that’s impossible. You’re saying, it’s unusual that you would have a public key and a script upfront. If you don’t have non-interactivity, then graftroot does everything. 2-of-3 multisig– 3 possible scripts now. That’s not as efficient as having 3 scripts.. then again, 2-of-3… In graftroot, you should be sending to a collection of pubkeys, and they can decide to delegate to particular scripts. As long as you have this collection of public keys, it’s completely non-interactive.

Taproot and graftroot

Taproot: P = c + H(c || script) G

Graftroot: sigp(script)

An output is a point p. c is some other point. There is some other point, and its private key may not exist.

Direct spending is sigp(tx).

Taproot spending…

The scriptpubkey becomes an actual pubkey, that’s the discussion about whether it should be a pubkey or a pubkeyhash. Once we have quantum computers….

An output is a public key and version number. There are three different ways to spend it- either sign with the key and there’s a transaction which is the most common situation. Any single key. It’s just a signature… you may want to use musig to construct the public key but that’s only one of the possibilities, and it’s up to the wallet. You have given authorization to this public key to spend this output. There’s a signature with that key that is being spent. Sometimes it’s not a public key, though. Yes. Calling it a public key is probably….. it’s a public key in that it’s a scriptpubkey, someone with a quantum computer can spend it.

Taproot is– after the fact, we reveal that this public key that was in the output is not actually a direct public key, but it was in fact a public key that was derived from another elliptic curve point and a script. This function, which takes C and S, this is the pay-to-contract construction. You can see it as a hash function from points and scripts to points. Given a value p, there is no way of giving another C and S other than the C and S it was constructed with. It’s a binding commitment, which means that this public key, that I am revealing to the world that this pubkey was created with the knowledge that the script was in it. Taproot is the idea that if you do that, then the script is permitted as spending it as well. Another way to see it is that taproot is combining pay-to-pubkey and pay-to-scripthash into a single public key with the advantage that unless you’re actually using the script version, you’re not even revealing that there was a script. It’s both an efficiency and a fungibility gain.

In(s) is the inputs to S. If you’re spending through script, you need to solve the script, the script may involve checksigs, in which case you have signatures.

G is the regular generator. C is… there’s two possibilities here. These are just the standing rules. In terms of how to construct it, there are two ways. One is, I actually have a public key that is allowed to spend, and I actually have a scrpit allowed to spend, and C is that public key. P is the combined public key that encapsulates both policies. Another possibility is when you…. C is some hash function that maps two points directly. If I just have a script, and I want to make it look like a taproot output, in fact i don’t want to reveal— when spending it, I’m always going to reveal a script, but I’m not going to reveal there’s no point to begin with, you can generate C as a random point on the curve, you can prove to people that you generated C in a particular way so that you can’t spend it except through a particular script. You say, I take a random string, I hash it, I use that hash as the x-coord for C, and by revealing the preimage of that hash, you know that nobody knows the discrete logarithm. The overhead is low.

The third way is graftroot. Let me take a step back. The reason why taproot seems appealing is because in general you may I think it’s a reasonable assumption but people who know more about higher-level contracts, feel free to chime in. I think there’s a reasonable case to make that in most smart contract constructions there will generally be a group of people that are jointly permitted to spend the coins unconditionally. When they agree to spend it, it’s fine. Generally, at the very least, this can be everyone involved in the contract. And at worst, this just never gets used. There exists a group of people who are authorized to sign off on things. Knowing that there are key setup protocols that do k-of-n or even more complex constructions, there’s a single key that could also be incorporated in there, we could do a 2-of-3 multisig in there, as a single key in there. Taproot is advantageous whenever we can say, well, in the regular case, everyone will agree, and we sign off, and we don’t actually need to reveal the script that was used. Taproot merges that case with a standard p2pk equivalent so that you cannot distinguish these two cases anymore.

An extra step is that if we already have this assumption that generally there’s a group of people that can spend together, then well what if we make them responsible for deciding what other scripts can be delegated to? This is effectively one level of delegation where you say, I’m just going to– we have this group of people who are involved, they sign off on this, and “well after this time has lapsed, then you can sign too” they give you a signature on that script and now you can spend by revealing that signature plus the script and plus the inputs to that script. The very interesting thing here is that there can be an arbitrary number of these graftroots.

petertodd says he has written this up before, the DEX stuff. We talked about delegation in this context. Yes I remember that. But graftroot has a half-delegation thing, which is an optimization.

This is I guess one of the downsides… you can know that there’s no way of spending outside of your knowledge exists, if you are one of the participants in the key and you refuse to sign something that doesn’t involve you already, but generally you cannot prove offline to someone that the entire policy is “this”, because it exists as cloudy stuff.

How would this be done with multisig? There are– if you are fine with using a key setup protocol. The assumption here is that you have a group of people that know each other, they have encrypted authenticated connections to each other, and they want to setup a multisig key. They can run a protocol to decide on what P will be. There are protocols by which you can encode any multisignature scheme as just a single key, and it doesn’t need musig, but it could. Can you delegate each of those keys somehow? This wouldn’t involve graftroot at all. It’s just a key. Yes you can do delegation with graftroot. Anyone that is authorized to spend the money directly, can authorize someone else, can authorize a script. You can do rotation, a new 2-of-3 with the old key rotated maybe for example. Someone could give their private key to someone else, which is another form of delegation, that’s definitely compatible with graftroot.

There might be a byte before the script to say that it’s a taproot/graftroot. Say you do this as a new segwit address type, you could have a new version byte.

How this interacts with signature aggregation is something where these two proposals become extra interesting. This signature in the Input() functions here, can be interactively aggregated into a single signature. Instead of actually having a signature there and there and there, there is just somewhere in the transaction there is “the” signature for the transaction or possibly multiple. In the most general case, these can all be combined into one signature. Which means that the marginal cost for an extra direct spend is exactly zero bytes. The script witness for such an input would just be empty, because it is entirely encapsulated in the aggregated signature for the transaction. It’s not zero bytes, it’s zero bytes plus the transaction input (txid).

What can’t be interactively aggregated, is the one in graftroot here, you could do it but it would be stupid. It implies the signers are online, they can just do a direct spend in the first place. The only case where you want to use the graftroot, is really when P, whoever it may refer to, is not online, but you are and with the people involved in S online. Maybe they have a court order preventing them from spending the transaction– perhaps the world already knows it’s a graftroot so maybe you want to spend it through graftroot too. It only makes sense when P is not online, and as a result it cannot be interactively aggregated… However, it can be non-interactively half-aggregated. This is a construction where if we have multiple signatures in the form (R, S) and form (R, S) and so on… … this is just a general conversion, and this can be done non-interactively, meaning, different groups of people can produce different R-S pairs and anyone can combine them into multiple R values but just a single S value. The marginal cost of this thing is now 32 bytes, in the graftroot construction. Okay, maybe 32.125 bytes because of the one extra bit.

An additional benefit here is that after the half-aggregation is done, nobody can get the original S value out of this. If someone later sends money to the same original graftroot address, they can’t find a delegation from an old transaction and pull it out. But if they were involved in that transaction which is the only case in which they would benefit from doing so, then they would already have the signature anyway. But it’s nice to see this is bound together and this thing cannot be decomposed into the original R and S values.

Maybe in graftroot you should add something to prevent replays. Meshcollider suggested something where, this thing could sign not just the script but it could also sign the txid and output index of the utxo being spent which has the downside that you could only create these signatures after the transaction has occurred, but it makes it completely non-replayable, which makes reuse of the address safe, not sure if this is a desirable property but there it is.

If the graftroot key is used more than once, even if the delegation has not come along with it, if you have knowledge of these signatures, you can do replay of the graftroot signature. But a third-party can’t recover it with…. If you knew all the R and S values, you could reuse this. If you’re one of the people in P, and you build it, then you probably know it anyway.

Downsides of graftroot… so, taproot seems like such an– in the case, I have a public key that can spend and a script that can spend, this is an improvement over what we already have. This may not be obvious– whenever we are talking about multiple signatures in a transaction or a block, one aim is batch validation where we can validate multiple groups of signatures at once, even when they are not aggregated already… The taproot equation that we have to validate at the time that it is being spent, can be batch validated together with signatures even though it’s not a signature, it does involve a relatively expensive elliptic curve multiplication. Otherwise, you could argue it’s a net negative due to the cost of this multiplication which is worse than just a hash. With batching, this is much better. We’re talking about batch validation of multiple signatures.

Key aggregation which is, we’re going to combine multiple keys together in such a way that only when all the relevant private keys are online, they could together produce a signature for that key.

Signature aggregation is multiple people, all of them are going to sign their own message, but we will combine all the signatures into a single signature. Single signature, multiple keys, multiple messages.

Batch validation has multiple signatures (message, signature, key) pairs and just as a computational improvement, I’m validating all of them at once, and I don’t care which one fails, I just care whether they are all valid or are any of them invalid. Batch validation is many times faster than validating individual signatures. This is a very non-trivial factor. It’s not just “add all the hashes together”, at least to do it securely. Multiply each one by a different random number, and then add them together.

A clear downside of graftroot is that it requires an interactive protocol. The multiplication by G is expensive, except for the batch validation. For example, you can’t do the thing where, right now, three people each have their bip32 extended public key chain and I’m going to automatically do a 2-of-3 out of corresponding keys along them, and I think there’s a bip that proposes this and there’s software out there that work that way… In exchanges, I don’t think people have done this in production, that have 2-out-of-3 but it degrades after some amount of time with backup keys or for key recovery. Doing this kind of mechanical conversion of sets of keys and turning it into a script, can’t be done with graftroot. Each instance involves requiring interacting with the group involved. It’s using delegation for something where you wouldn’t be using delegation for in the first place. If you want MAST with a million different scripts, I guess. You could do tree sigs inside of the …. But this is O(1). You don’t want a tree. This has much better properties in graftroot, if you can do it. It’s unfair to say you only do this when you want delegation- it’s more like, you’re leveraging delegation to implement various spending policies.

Delegation is this thing that has been talked about for a while, and certainly has use cases… but graftroot turns it into a real way of how we can deal with complex spending policies. Changes it from a fringe smart contract offline participants, to we can do massively scalable multisignature with hashlocks and timelocks involved. 32 bytes. Versus 120 bytes? or 300 bytes? That’s enormous, maaku.

Privacy is nice because you can get everyone using the same thing. You don’t reveal anything about the depth of the tree or whatever. Scalability-wise, you get a lot of interactivity in here, so if you have a lot of signers… If you’re like, hey I want to make a million scripts in here, and I got a lot of participants, then this is a lot of signing and a lot of interactive signing and a lot of storage for all of the scripts. You need to store it once you’re participating. You might be able to reconstruct the scripts- the signatures you can’t reconstruct.

In terms of deployment of taproot and graftroot, these could co-exist since they don’t interfere with each other. I think there should be an output type that could be spent by any of these three mechanisms.

A minor optimization- when you were talking about a taproot from a nothing up my sleeve point.. The x-coord is a valid public key but nobody knows the discrete logarithm to. You wouldn’t want to use this for fungibility reasons, because you’re revealing the fact there is no key. So you should just use p2sh in that case, because you’re going to reveal it anyway.

You could make a taproot… and then with that taproot P, sign a proper signature, so that you can have a public key out there.. but all three of these could be possible. I don’t know why. But you could do that. It makes you wonder- should you permit a taproot spending against a delegated graftroot? You could imagine a scripting language with an IF branch, and it would be “if pubkey else another pubkey” and the encoding of the branches would be that. You wouldn’t use a scripting language– you would just compose these types independently. But suppose you want to specify some conditional thing, where you want to delegate to a set of keys controlled by something else. It wont necessarily be simple.

What about recursion of graftroot? This ends up being a merkleized abstract syntax tree (MAST). Either a public key, or a relative locktime. This is essentially hash functions. Recursive graftroot would be really slow for MASTs. At each recursion, you require an elliptic curve operation. Almost always, you will have just a single level. A MAST construction built up from hashes will be more efficient than using graftroot as a node in a tree, because of the elliptic curve operation at each level in the graftroot tree case. Even with recursive taproot, you’re doing recursive amounts of elliptic curve operations.

You could turn a taproot into a graftroot. It is similar to PGP. It’s literally the PGP design to go do this. You have an identity key, and it signs off on the permission to use something under your name. This is why petertodd did DEX stuff.

Say you have a soft-fork that adds signature aggregation. And then you do a soft-fork for graftroot. You can’t aggregate across the two types… you can’t do it. Well, you can, but let’s not do that. We need to keep in mind that this is hard. This may be harder than it looks. Don’t aggregate across different versions- but how do you do it? How do you make it sure that you don’t accidentally do so? Your UTXO set with version bytes saying version 1, version 2. I hope we’re not going to be restricted in the future to 16 soft-forks. Here’s the thing, there has been this idea in the past…. here is what I was hoping to propose, until it was pointed out that this would not work– the version byte going into the first byte for the witness, and the version for the output is just the hash? That was what you said a few years ago. No, thinking changes, it takes time. Okay, okay. So, the thing you could do is, say, we have this whole range of unused opcodes. We introduce v1 witness output type. And it is all these things, signature aggregation, checksig, checkmultisig that integrates with it, tparoot, graftroot, and all the things. We do all those things, and agg sig, graftroot, and taproot, and we select this range in the script opcode list and all of those opcodes are redone to be “return true” opcodes. And now, so, so far, no probloem, there’s no reason why you couldn’t do this. But now you want to do a second soft-fork that reclassifies– the soft-forks in one of those opcodes as “anything”, some boring thing, it could be a “return true” to “return false”. Stupid soft-fork, but you could do it. No, actually, it needs to become a conditional. It’s not– it could fail, but it needs to become a “check whether there are N transactions in the transaction”. Maybe push stuff into the stack. It’s causing a problem with this other stuff.. if it’s “op true” to “op return false” it would still work as a soft-fork but why would you do it in the first place. It’s not an issue, but it’s dumb thing to do there. But anyway, say we turn one of those “op reutnr true” to “op merkle something”.. any checksig, you can have a script with a checksig after this merkleverify. That checksig cannot be aggregated with the ones before. That is to be expected. Where do you indicate the… you’re going to have nodes that don’t know about this…. My original idea was that, we would not have just single signature for the whole transaction, there would be a number of them, they would have a bucket number, and every signature you do says which bucket it belongs to, and that the basically, every new soft-fork that potentially changes the execution introduces a new range of buckets. Like a new number. Once you’re in an execution path that depends on this particular soft-fork, you’re implicitly even in that group, you have to make it implicit. So first buckets 0-16, and then 16-32 buckets, and it would still be called 0-16 buckets, but it’s just in a different regime for the script, for each soft-fork regime. It’s not a problem that you can’t aggregate across soft-forks (you could do it, but you shouldn’t, I do know how to make it work though). However, you still need to prevent someone from creating something like a naieve implementation would do the aggregation wrong. Their implementation might even be inconsistent with itself.

The witness outputs are up to 40 bytes, we can treat the last 8 bytes as more version number. That’s the reason it’s there. You could just say, the v1’s are aggregated here, the v2’s are aggregated there, the v3’s are aggregated over there… And then we don’t need to do any of this conditional execution bloodbath.

How many soft-forks are you looking for? If you couple signature aggregation with taproot and graftroot, then it could take longer. By the time we get to v16, we’re going to be dead. Soft-forks could get increasingly harder to deploy.