Architecture of LND Watchtowers
Speakers: Conner Fromknecht
Date: April 6, 2019
Transcript By: Michael Folkson
Tags: Research, Lightning, Watchtowers, Lnd
Media: https://www.youtube.com/watch?v=2tyr05tLF4g
https://twitter.com/kanzure/status/1117427512081100800
discussion: https://www.reddit.com/r/lightningdevs/comments/bd33cp/architecture_of_lnd_watchtowers_presentation_from/
Intro
Pretty excited today to speak on the topic of watchtowers. It is something that we at Lightning Labs have been working on for almost a year now, going from conceptual proof-of-concept code to really fleshing it out into what is today more of a full-on protocol. Hopefully today we’re going to summarize a year of research into this presentation. I’m looking forward to getting into it with you guys so let’s dive right in.
Commitment Outputs
So today we’re going to talk about LND watchtowers. Some background first. Before we jump straight into defining what the watchtower problem is and more generally how they work, we’re first going to do some background on the primitive components that we’re working with here. The first one is commitment outputs. When you open a channel there is a funding output onchain that has a certain capacity. Let’s say it has 70K sats in it. At the same time when you create the channel you create two transactions that spend from it. They will have asymmetric balances as you’ll see. At any given state n, my commitment balance might look like this. The green represents my balance, 50K satoshis that is locked with a pay-to-witness-script-hash (P2WSH) output and the remote party’s funds are locked with a pay-to-witness-public-key-hash (P2WPKH) output and that has the remaining balance of 20K sats ignoring all the fees. Similarly, their commitment transaction will be a mirror image of it. Notice the commitment output types. Where on my commitment I have a P2WSH, they have a P2WPKH with my 50K sats and vice versa. They have a P2WSH with 20K sats that has their remote balance. They are mirror images of each other and the contracts are flipped because from either party’s perspective the big P2WSH one is the one sending money to them and the P2WPKH is the one sending money to the other person.
What is a Breach?
So what is a breach? A breach is when after we’ve agreed on some state N, the remote party, it is always from the perspective of me, broadcasts a state that is less than N. These have been revoked and we’ve given up a commitment secret that allows the other person to spend them. The reasons they might do this are that they might have more funds in that state. Maybe they are trying to maliciously cheat you. Or there’s a possible implementation error in one of the clients. Or just user error. We’ve seen people restore from old backups. Most of the cases so far have been user error that we’ve seen in the wild. I don’t think we’ve seen anyone actually intentionally breach people but we can’t really tell. Let’s say their commitment state i has a lot more balance in their output. That’s the P2WSH output with 60K. They’re trying to sweep this maliciously and take more balance for themselves. In state N I had 50K sats. What lnd would normally do or any of the implementations currently out there is when they see this transaction broadcast onchain they’d recognize that it’s a revoked state and they will immediately try to spend all the outputs back into their own wallet. The biggest issue here is that their to-local output is locked with a CSV, it’s not a standard P2WPKH script or pay-to-pubkey. There’s a timelock on it and after the timelock expires that person is able to spend it back to their wallet. There is this window of contention where you have the ability to rectify a breach and after that period the remote party can also take their funds. That is the critical window that we have to deal with. It is not necessarily critical that you sweep the to-remote output. If it is a P2WPKH that you have the secret keys to that’s not as time sensitive as a to-local output but in the lnd watchtower implementation we also sweep it for you because in the case of data loss or something you might not be able to so we sweep both of these outputs back to your wallet automatically.
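To make the window of contention concrete, here is a minimal Python sketch. The function names and the example heights are illustrative assumptions, not lnd code. It only assumes that the CSV delay is a relative timelock counted in blocks from the block that confirmed the breach transaction.

```python
def is_breach(broadcast_state: int, latest_state: int) -> bool:
    """A broadcast commitment is a breach if it is older than the latest agreed state."""
    return broadcast_state < latest_state


def blocks_left_to_respond(breach_height: int, csv_delay: int, tip_height: int) -> int:
    """Blocks remaining before the breacher's CSV-locked to-local output matures.

    Assumption: the relative timelock counts blocks from the block that
    confirmed the breach transaction.
    """
    return max(0, breach_height + csv_delay - tip_height)


if __name__ == "__main__":
    # Hypothetical numbers: state 7 is broadcast while the channel is at state 12.
    print(is_breach(broadcast_state=7, latest_state=12))
    print(blocks_left_to_respond(breach_height=570_000, csv_delay=144, tip_height=570_100))
```

Once the remaining block count reaches zero, the remote party can spend the to-local output themselves, which is why someone has to be watching the chain during that window.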
Problem
The problem here is that your node has to be online and operational to be able to respond to breaches in this manner. lnd has this subsystem called the breach arbiter which basically monitors for these onchain and will take all the outputs that are on that transaction and try to spend them back to your wallet. If you’re not online then you can have a situation where you miss the breach entirely and the remote party is able to spend the CSV outputs back to their wallet. When can these assumptions fail? You can have a node failure, a db corruption. We’ve seen some db corruption in the wild. If you have that your channels can be vulnerable for that period of time. You could forget to pay your AWS bill or you delete your entire… Specifically on mobile there’s some extra concerns that come to mind. Primarily that your phone might run out of battery. You go on a hike for a weekend and you don’t have internet access. Or you forget to open the app for a couple of weeks and you miss the breach because the app was never aware of the thing that happened onchain. The reason I bring mobile in specifically separate from in general is because all of the cases on the left can happen to any Lightning node and watchtowers will help against that. On mobile the issues become even more complex because of this intermittent starting and stopping of the app, limited reliability and intermittent connectivity. The watchtowers really come into play to support mobile in a lot of ways. They definitely help the other case but mobile is where they are most critical I would say just because of this infrequent use. They basically offer you a safe way to run a mobile phone and maybe not be online for a while and still have some insurance on your channel.
Solution: Watchtowers
The solution that watchtowers provide is that you are going to delegate a highly available party to detect these breaches and respond to them on your behalf. The name watchtower obviously comes from some medieval thing where some guy sat up in a tower and watched for the approaching enemy and then would ring the town hall bell or something like that. In this case they are broadcasting a Bitcoin transaction when they see the enemy. Some constraints on what our final approach needs to be. We don’t want the watchtower to have signing keys from the client, particularly the keys that spend those outputs from the prior slides. The watchtower should not be able to willy-nilly send those funds anywhere it wants. That’s important for maintaining the trustlessness of the towers and making sure the keys never leave the client phone. There can also be no interaction with the client. Once I send you the info you need I should never have to interact with you after that point. If the tower had to ask while the breach is happening it kind of defeats the whole point because we’re assuming intermittent connectivity. I think the real tricky tradeoffs come in when you consider privacy: we also want to maintain a “sufficient level of privacy”. When you think about it, the requirement of a watchtower is when you revoke a state you back something up to this other server more or less. It is inherently like a timing side channel. Not a timing side channel in the cryptographic sense, more of a financial timing side channel. If someone knows that it’s you backing up the state they can see “Oh Conner made a transaction at 11:35 on Saturday.” All of those tradeoffs really come to mind. If you are going to have a service that does this, we want to ensure the utmost privacy for clients in a way that protects their anonymity and their privacy from the services that are providing a tower service. 
That includes not leaking the channels that the client owns, particularly obfuscating the client identity. You want to mitigate the timing side channel on your actual payments and financial privacy. There is also the question when you sweep onchain how do we make sure we maintain the good practice of not doing address reuse? And making sure that when we give the tower the ability to sweep funds to a particular address that we don’t give up privacy either. If towers collude or by nature of spending to the same address twice. Finally there are a couple of things where you want to provide network level anonymization like Tor support. The current watchtower protocol in lnd is abstracted in the sense that you can drop in a TCP connection or a Tor connection and everything should work the same. This brings us to this question. The design space of this problem is really, really massive. There are a number of different directions you could take this. Given how much you weight the constraints or concerns for not giving up keys, privacy on various metrics. There’s probably 10-12 dimensions of privacy that you want to be aware of at least and then make specific tradeoffs. The implementation we have tries to make the best of all those.
Spending UTXOs Without Keys
When you want to spend somebody else’s UTXOs but don’t have the keys, there are a couple of things that you’re going to need. Predominantly you need to know the inputs being swept. In those first couple of slides we’re going to be sweeping the to-local and to-remote outputs of the remote party’s commitment transaction. When you think of needing to fill out this transaction to get it to broadcast, first you have to know which inputs are being spent, where is the money coming from. I then have to be able to know the witness script for each of those inputs. Because they are both SegWit those inputs will have a hash in their UTXOs that is the hash of the script that is going to be redeemed. I need to be able to reconstruct that to satisfy the hash so that I can execute the program. Then I’ll need to know the outputs on that: the destination and the amount of the input funds minus fees, where does that money get split up and how is it divided? Finally I need to provide a valid witness. I have the script and now I provide a signature and possibly pub keys that would satisfy the script and allow the funds to be spent. If you have all those things, at least in the context of the watchtowers here, you’ll be able to broadcast the transaction and it should be confirmed, sweeping the funds back to the intended target.
Revoking to-local Outputs
We’re going to talk about the to-local output a little more because it is slightly more complex than the to-remote which is a pay-to-witness-public-key-hash (P2WPKH). The to-local script looks like this. It is a simple if statement and you have a revocation pubkey in the first clause. The second clause is a delayed output sending money back to the other party. The way you spend this on the revocation path is you provide at the top of the stack which is the bottom in this diagram, the to-local script itself and then the OP_1 allows it to take the first branch. Finally the revocation signature validates under the revocation pubkey. The items in green here are information that the tower cannot know on its own. The pubkeys used here are random 33 byte values more or less. The to_self_delay is a uint32. It is going to have a tough time knowing these values on its own. The client has to be able to provide these to the tower in a space efficient way such that when it needs to sweep it can fill in this template and spend the to-local output. We’re going to take a quick detour into some of the watchtower code itself.
https://github.com/lightningnetwork/lnd/blob/master/watchtower/blob/justice_kit.go#L107
This is the justice kit which we’ll get to later on but it forms the basis of the payload that goes into an encrypted blob. You’ll see the first… RevocationPubKey, LocalDelayPubKey and the CSVDelay. These are those three parameters that were in the to-local script that need to be plugged in so that you can have the valid witness script. You provide a signature under the RevocationPubKey that allows you to construct a valid witness. Similarly for the remote outputs you have a PubKey which is more or less the script in a P2WPKH and finally a signature under that. These things allow you to reconstruct the witnesses and the witness scripts that spend the remote party’s commitment transaction. I’ll get to the SweepAddress in a minute but note that the SweepAddress that you want your funds to go back to is also included in this…
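As a rough illustration of how those fields fill in the to-local template, here is a Python sketch. It is not the lnd code: the class layout and the signature field names are hypothetical, and only RevocationPubKey, LocalDelayPubKey, CSVDelay and SweepAddress are named in the talk. The script layout follows the description above (revocation branch first, delayed branch second), and the opcode values are the standard Bitcoin script opcodes.

```python
import hashlib
from dataclasses import dataclass

# Standard Bitcoin script opcodes used by the to-local template.
OP_IF, OP_ELSE, OP_ENDIF = 0x63, 0x67, 0x68
OP_DROP, OP_CHECKSIG, OP_CHECKSEQUENCEVERIFY = 0x75, 0xAC, 0xB2


def push_data(data: bytes) -> bytes:
    """Direct push for payloads of 1 to 75 bytes (enough for the keys here)."""
    if not 1 <= len(data) <= 75:
        raise ValueError("sketch only handles short direct pushes")
    return bytes([len(data)]) + data


def push_int(n: int) -> bytes:
    """Minimal script-number push for a non-negative integer."""
    if n < 0:
        raise ValueError("negative values are not needed for a CSV delay")
    if n == 0:
        return b"\x00"  # OP_0
    if n <= 16:
        return bytes([0x50 + n])  # OP_1 .. OP_16
    raw = n.to_bytes((n.bit_length() + 7) // 8, "little")
    if raw[-1] & 0x80:
        raw += b"\x00"  # keep the sign bit clear
    return push_data(raw)


@dataclass
class JusticeKit:
    sweep_address: bytes
    revocation_pubkey: bytes      # 33-byte compressed key
    local_delay_pubkey: bytes     # 33-byte compressed key
    csv_delay: int
    to_local_sig: bytes           # hypothetical name: signature under the revocation key
    to_remote_pubkey: bytes       # hypothetical name
    to_remote_sig: bytes          # hypothetical name

    def to_local_witness_script(self) -> bytes:
        return (
            bytes([OP_IF])
            + push_data(self.revocation_pubkey)
            + bytes([OP_ELSE])
            + push_int(self.csv_delay)
            + bytes([OP_CHECKSEQUENCEVERIFY, OP_DROP])
            + push_data(self.local_delay_pubkey)
            + bytes([OP_ENDIF, OP_CHECKSIG])
        )

    def to_local_pk_script(self) -> bytes:
        """P2WSH output script: OP_0 followed by the SHA-256 of the witness script."""
        return b"\x00\x20" + hashlib.sha256(self.to_local_witness_script()).digest()

    def to_local_witness(self) -> list:
        """Witness stack for the revocation path: signature, 1, witness script."""
        return [self.to_local_sig, b"\x01", self.to_local_witness_script()]
```

A tower holding such a kit can check that the reconstructed P2WSH script matches the output on the breach transaction before it tries to spend it.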
Session Policies
Now we’ll go onto Session Policies. It should be clear now that I need to send information to the tower that it needs to store in this timelock manner. I want to send this to the tower. At some point later in the future, when it sees the remote party’s commitment transaction onchain it will need to reconstruct the transaction from the information that I’ve given it prior. On that slide where I listed out the four things that you need to be able to spend the transaction we’ve hit most of them but the one thing we haven’t really covered yet is the output values. For the most part that is dictated by what’s called a session policy. The session policy specifies these five fields at the moment, it can be extended. The most critical ones here are the last three, the SweepFeeRate, the RewardBase and the RewardRate. What this does is it fixes this function of the commitment transaction that got broadcast and allows you to compute the output values. Most importantly for example the SweepFeeRate, you add up the balance of the to-local and/or the to-remote output and you subtract out the fee implied by the SweepFeeRate from that. That gives you a remaining balance. Depending on whether or not the justice transaction has a reward which can be toggled via a bit in this BlobType, the reward will also be subtracted out and this uses a base fee so you can say “I want 10,000 sats + 1%” so you get full control over the reward output. It also reduces the amount of per-update storage. This function needs to be computed for every transaction. By including those parameters in the session I don’t have to send this SweepFeeRate, RewardBase and RewardRate with every backup that I send. I should note that a session encompasses an array of slots more or less. I will request from the tower a thousand slots. They will all have this fixed SweepFeeRate, RewardBase and RewardRate. I can send at some point later up to a thousand updates to it and the tower will honor that request, that’s the promise. 
These are proposed by the client, the tower can accept them or reject them if it doesn’t like the terms. The BlobType is used for a couple of things. It is also included so that we can enable future modifications to the base protocol without having to rewrite everything from scratch and so we can interoperate if we continue to improve the protocol in any way. The main thing here is that it is channel agnostic. This is really important from a privacy perspective. Nothing in this policy for example says that I’m sweeping Conner’s channel in block 560,000. Because it is a function of the transaction that is broadcast I can send updates for any channel to this policy. So if I have five channels they can all use the same session and the tower doesn’t necessarily know when I send it updates am I sending it all as one channel? Or am I sending it as five? Am I using mostly one channel or maybe the other? Because of that you actually increase the level of privacy because the tower is constrained in what it learns from a… perspective to this aggregate sum of all the channels that you put into that session or all the updates you put into that session versus just knowing this update is for this channel, I can see exactly more or less when this channel was used. There has been discussion about this on the mailing list. It was apparent early on that this was a better approach from a privacy perspective so we bit the bullet and decided we were going to go for it. I think I’m pretty happy with how this turned out from an efficiency perspective. For what it’s worth I didn’t mention on the prior slide when we were talking about the justice kit, the payload itself is 274 bytes to sweep this. The plaintext payload fits in the tweets that I use to talk to you guys every day. When you add encryption and stuff like that you get to around 314 bytes. 
The blobs that you’re sending this tower are very small and on top of that the tower is storing just these session parameters that you see here. We’ll dive quickly into the policy.
https://github.com/lightningnetwork/lnd/blob/master/watchtower/wtpolicy/policy.go#L190
This is pretty important to understand from the perspective of how the policy fixes the function that dictates how the outputs look like. The policy has a primary function called ComputeJusticeTxOuts. Predominantly it takes the total amount and the weight of the justice transaction so far. The BlobType has a bit in it that signals whether the transaction should have a reward or not. Again this is agreed to by the tower upfront. If it does it will go ahead and compute the reward outputs which makes two outputs. One with the sweepAmt going back to the client and the reward going to the tower. If it is an altruist justice transaction where the tower is saying “I don’t want a reward, I’m just going to give you your money back minus fees.” Then there is a separate function that takes out the fees and computes a single output and sends it back. Up here there is more logic about computing the exact fee rate and sweep amounts. These are the session parameters that are saved upfront and negotiated between the client and tower. Here is where the proportional and base fees are computed. Hopefully that gives you some insight into how the session creates this function that determines the output values. We can use that as a generalized mechanism for describing justice transactions across channels with varying capacities.
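The shape of that function can be sketched in Python as follows. This is an illustration under stated assumptions rather than a port of the Go code: the units (fee rate in sats per 1000 weight units, reward rate in millionths), the reward flag bit, the rounding, and the field name max_updates are all assumptions, and dust handling is left out.

```python
from dataclasses import dataclass

FLAG_REWARD = 0b01  # hypothetical bit position inside the blob type


@dataclass(frozen=True)
class Policy:
    blob_type: int
    max_updates: int
    sweep_fee_rate: int  # assumed unit: sats per 1000 weight units
    reward_base: int     # sats
    reward_rate: int     # assumed unit: millionths of the swept amount

    def has_reward(self) -> bool:
        return bool(self.blob_type & FLAG_REWARD)

    def compute_justice_outputs(self, total_amt: int, tx_weight: int) -> dict:
        """Split the swept amount into a client output and an optional tower reward."""
        fee = self.sweep_fee_rate * tx_weight // 1000
        if fee >= total_amt:
            raise ValueError("fee would consume the whole swept amount")
        if not self.has_reward():
            # Altruist case: a single output back to the client, minus fees.
            return {"client": total_amt - fee}
        reward = self.reward_base + total_amt * self.reward_rate // 1_000_000
        client = total_amt - fee - reward
        if client <= 0:
            raise ValueError("reward and fee would consume the whole swept amount")
        return {"client": client, "tower": reward}


if __name__ == "__main__":
    # "10,000 sats + 1%" reward on a 70,000 sat sweep with a 600 weight unit transaction.
    policy = Policy(blob_type=FLAG_REWARD, max_updates=1000,
                    sweep_fee_rate=2500, reward_base=10_000, reward_rate=10_000)
    print(policy.compute_justice_outputs(total_amt=70_000, tx_weight=600))
```

Because the inputs are only the total swept amount and the transaction weight, the same policy works for channels of any capacity, which is what keeps the session channel agnostic.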
Encrypted Blobs
I think some of you have probably heard earlier about the term encrypted blobs. There have been a number of variations on this idea. There has been a lot of design over this over the years but this is more or less how the derivation will work for the protocol that we’re going to propose. You start with the breach txid, the ID of the breaching transaction. Remember that these are all known upfront. In the channel I will have negotiated the state with the remote party. Once they revoke it I know the breach-txid so at that point in time I have what I need to be able to do this derivation and encrypt this backup to that tower. The hint will be… It is broken up into two parts: a hint and a key. The hint serves as a short identifier for knowing that a breach happened and detecting that a breach that happened was confirmed in a block. It is something that I as a tower have in my database. This is done by hashing the breach-txid with some 4 byte magic and taking the first 16 bytes. The second is the encryption key which is done in a similar fashion but is a 32 byte value. These are put together and we use XChaCha20-Poly1305. Note the X, which means it uses a slightly bigger nonce space. When you put these together you have a 16 byte hint along with a 314 byte encrypted ciphertext. The payload here is the serialized version of that justice kit that we saw earlier. This amounts to 330 bytes when all is said and done. For a thousand state updates you’re talking 330 kilobytes which really isn’t that bad considering that every time you upload an image to Instagram it is probably 4 to 8 maybe. The amount of space required to sweep the commitment outputs with this design is actually pretty small. Keep in mind also that the prevailing use for watchtowers will often be for mobile clients. Mobile clients will probably be making far fewer transactions than your average routing node. 
In addition to being somewhat efficient from a space perspective there is also the realization that mobile phones will be offline and perhaps used less. They will also consume less space from the towers.
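The hint and key derivation, together with the size arithmetic, can be sketched in a few lines of Python. This only illustrates the scheme as described in the talk: the magic values, their position in the hash input, the use of SHA-256 and the function names are assumptions. The encryption step itself is left out because XChaCha20-Poly1305 is not in the Python standard library. The 24 byte nonce and 16 byte authentication tag are the standard sizes for that cipher.

```python
import hashlib

HINT_SIZE = 16
KEY_SIZE = 32

# Hypothetical 4-byte domain-separation tags; the talk does not give the real values.
HINT_MAGIC = b"hint"
KEY_MAGIC = b"key!"

PLAINTEXT_SIZE = 274  # serialized justice kit, as quoted in the talk
NONCE_SIZE = 24       # XChaCha20-Poly1305 nonce
TAG_SIZE = 16         # Poly1305 authentication tag


def breach_hint(breach_txid: bytes) -> bytes:
    """Short identifier the tower indexes its database by."""
    return hashlib.sha256(HINT_MAGIC + breach_txid).digest()[:HINT_SIZE]


def breach_key(breach_txid: bytes) -> bytes:
    """Symmetric key that only becomes derivable once the txid appears onchain."""
    return hashlib.sha256(KEY_MAGIC + breach_txid).digest()


def bytes_per_update() -> int:
    """Hint plus ciphertext (nonce, encrypted payload, tag) for a single state update."""
    ciphertext = NONCE_SIZE + PLAINTEXT_SIZE + TAG_SIZE
    return HINT_SIZE + ciphertext


if __name__ == "__main__":
    txid = bytes.fromhex("ab" * 32)  # placeholder txid
    print(breach_hint(txid).hex())
    print(bytes_per_update(), "bytes per update")
    print(bytes_per_update() * 1000, "bytes per thousand updates")
```

The tower cannot decrypt a blob from the hint alone. It needs the full txid, which it only learns if the breach transaction is actually broadcast.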
Wire Protocol
Moving onto a simplified version of the wire protocol and how this works. We have a tower and we have our smartphone that’s running Lightning. This is actually what my phone looks like if you guys are curious. The first thing that is going to happen is the client will generate a session private key and a session public key. This is done by deriving a key from your wallet secret. It will have a derivation path that you can use. That way you don’t have to store the actual session key on disk. You can derive it every time your node starts up from your seed. The session key is used to authenticate…. it is sort of like your login really. The public key acts as the tower’s way to identify you and the session key allows you to login. All communication between the client and the tower is done via BOLT 8 which is the same protocol that is used in the Lightning Network. When two nodes have a channel operating and do gossip, all those things run over BOLT 8 which is brontide. It is like a package in lnd, that’s what we call it. For those of you who don’t know brontide is the low thundering sound that Lightning makes in the distance. Props to Laolu for that name, I think it is pretty cool. Moving on, all communications are done using this keypair. When I want to sign up for a session, I send it over the policy that I want it to use for that. If the tower accepts, it will save those parameters under my session public key. I’ll need to login at any point after and continue to update that session. The tower can reply with either an accept or a reject. If there’s a reward output, the tower will also return the reward address that should be used when constructing the justice transaction. After that step has initialized, I can send any number of state updates up to the max updates. If I requested 1000 slots I will be able to send over 1000 updates and the tower will ACK those every single time. The state update if that wasn’t clear is where all the encrypted payloads are. 
The hint and the encrypted blob. We send both as a pair and each sequence number is allocated linearly until the max updates is reached.
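The tower side of this exchange can be sketched as a small state machine in Python. The class and method names, the string replies and the per-session slot limit are illustrative assumptions. Transport encryption (BOLT 8), policy checks beyond the slot count, and the reward address reply are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    max_updates: int
    updates: list = field(default_factory=list)  # (hint, encrypted_blob) pairs


class Tower:
    def __init__(self, max_slots_per_session: int = 1000):
        self.max_slots = max_slots_per_session
        self.sessions = {}  # session public key -> Session

    def create_session(self, session_pubkey: bytes, max_updates: int) -> str:
        """Accept or reject the terms proposed by the client."""
        if session_pubkey in self.sessions:
            return "reject"
        if not 0 < max_updates <= self.max_slots:
            return "reject"
        self.sessions[session_pubkey] = Session(max_updates)
        return "accept"

    def state_update(self, session_pubkey: bytes, seq_num: int,
                     hint: bytes, blob: bytes) -> str:
        """Store one (hint, blob) pair; sequence numbers must increase by one."""
        session = self.sessions.get(session_pubkey)
        if session is None:
            return "reject"
        expected = len(session.updates) + 1
        if seq_num != expected or expected > session.max_updates:
            return "reject"
        session.updates.append((hint, blob))
        return "ack"
```

A client that requested two slots would get an ACK for sequence numbers 1 and 2 and a rejection for anything after that, at which point it has to negotiate a new session.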
Inside a Tower
To dig a little bit more into how the inside of a tower works. Towers are actually really simple from a conceptual point. From an implementation point, not too bad. Most of the work is actually in the client. The tower at least in lnd right now, the tower package was called a standalone tower and that couples these services together. One primary service is called the Server which is responsible for talking to clients. You also have the Lookout which is responsible for detecting breaches and responding to them. The server has an exposed port. The clients can reach it the same way you would paste in a Lightning pubkey @ IP or address or whatever. The client connects in and is able to do what we saw before. It is able to create sessions, it is able to update state. The Lookout subsystem is talking to the Bitcoin network. It is listening for new blocks. Every time a new block is received, it looks through this really fancy thing called the database that is a shared communication mechanism between the Lookout and the Server. As the clients send in state updates, those get written to the database. The Lookout is receiving new blocks. When it gets a new block it applies that hint transformation to all the txids in the block and then queries the database to see if we have any matches. If there are, the matches are retrieved. They are decrypted by applying that key transformation to the breach txid and then decrypting the payload. From that it is able to extract all the components to fill out the witness script, the witness arguments, signatures, stuff like that. When the… is created it returns the session parameters that are used for that update. That allows you to take these to-local or to-remote output values in the breach transaction, sum their total value and then run it through that function that generates either the altruist or the reward outputs. Then assuming all that is good you should be able to take that transaction and the watchtower should be able to validate it. 
It should be able to see that the scripts are correct, the signatures are valid. If all that checks out the watchtower publishes that transaction to the network and the user’s funds should be swept to either themselves or split between it and the client. That is a high level of how the watchtower works. There is a good level of optimization when you get to scale that can happen on this front that I think is pretty exciting but we won’t go into detail here.
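The Lookout's per-block matching step can be sketched in Python like this. The in-memory dict standing in for the database, the names, and the hint and key derivations (SHA-256 with placeholder 4-byte tags) are assumptions for illustration. Decryption, transaction assembly and broadcast are left out.

```python
import hashlib


def breach_hint(txid: bytes) -> bytes:
    # Placeholder derivation: 16-byte prefix of a tagged hash of the txid.
    return hashlib.sha256(b"hint" + txid).digest()[:16]


def breach_key(txid: bytes) -> bytes:
    # Placeholder derivation: 32-byte tagged hash of the txid.
    return hashlib.sha256(b"key!" + txid).digest()


class Lookout:
    def __init__(self, db: dict):
        # db maps hint -> list of (session_id, encrypted_blob), written by the Server.
        self.db = db

    def process_block(self, txids: list) -> list:
        """Return (txid, session_id, key, blob) for every stored hint found in the block."""
        matches = []
        for txid in txids:
            for session_id, blob in self.db.get(breach_hint(txid), []):
                matches.append((txid, session_id, breach_key(txid), blob))
        return matches


if __name__ == "__main__":
    breach = bytes.fromhex("aa" * 32)
    db = {breach_hint(breach): [("session-1", b"encrypted justice kit")]}
    block = [bytes.fromhex("01" * 32), breach, bytes.fromhex("02" * 32)]
    for txid, session_id, key, blob in Lookout(db).process_block(block):
        print(session_id, txid.hex()[:8], len(key), len(blob))
```

Each match gives the tower everything it needs to move on to decryption: the blob, the key derived from the txid it just saw, and the session whose policy fixes the output values.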
Inside a Watchtower Client
Moving on to more of how the client itself works. This is something that will live inside lnd. Lnd will have this separate thing running off to the side which is responsible for managing watchtower sessions and backing up states. It should be pretty lightweight. The client has three main components. The Dispatcher which is the central unit allocating backup tasks to specific towers. Let’s say the client already has a session that it has negotiated with the tower. When that happens there’s an internal queue that is spawned that is responsible for accepting new things that need to be sent, encrypting them and sending them to the tower. The Dispatcher will know how full the queue is and it can say “Add five” and the Queue will take each of those, encrypt them and send them over to the tower. As channels are revoking states they are all calling into the client “I need this state backed up.” Those get funneled into the Dispatcher, which says “Ok I have an available Session Queue, I will schedule each of these and they will be subsequently forwarded to the tower.” Now the question comes in, what happens when that session runs out? When that session is exhausted we use what’s called a Session Negotiator. The Session Negotiator handles the first part of the wire protocol, where I send “create session” and I get a reply. The Session Negotiator can pick from any number of towers and it will manage the Dispatcher requesting more slots and it will contact…round robin them and try to get a new one. Once that has been fulfilled and we’ve received an ok from the tower, it will hand that off to the Dispatcher to allow it to keep processing more updates. Note that you can have more than one Session Queue at any given time. If I have three sessions open with three different towers, the Dispatcher can rotate across them. This can be useful for privacy. Let’s say I only do one session at a time. 
If I only do one session at a time with the same tower it doesn’t give me that much privacy. I can see that this one is exhausted and immediately after I’ll have to request a new one. If the tower is watching it can see these ones didn’t overlap and this one ended and started pretty close. I could probably deduce some correlation there. What you can do here is you can pick any number of towers to have sessions with and you can rotate between them so no one tower is seeing your exact financial transaction history. On top of that you can also have multiple sessions with the same tower. All that requires is generating a new session private key. Let’s say this one gets half emptied. I can request a new one that partially overlaps with it and shuffle between those two. That can help to delineate the history. Because the sessions are channel agnostic you’re pretty free to rotate your usage of the towers. There are probably a lot of heuristics to do so. At the moment lnd will exhaust one and request another for simplicity. Most of the logic is there to handle rotation and stuff. Just to get an initial implementation out we decided to just go with that. It probably wouldn’t be too much longer before more active heuristics are there.
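The interplay between the Dispatcher, the Session Queues and the Session Negotiator can be sketched in Python as follows. All names are hypothetical and the negotiator simply round-robins a fixed list of towers. Encryption, networking and persistence are left out. The sketch supports both behaviors described above: exhausting one session before requesting another, and rotating across several open sessions.

```python
from collections import deque
from itertools import cycle


class SessionQueue:
    """Backup tasks accepted for one negotiated session with one tower."""

    def __init__(self, tower: str, max_updates: int):
        self.tower = tower
        self.max_updates = max_updates
        self.sent = []

    def remaining(self) -> int:
        return self.max_updates - len(self.sent)

    def send(self, task) -> None:
        # A real queue would encrypt the task and wait for the tower's ACK.
        self.sent.append(task)


def make_negotiator(towers: list, max_updates: int):
    """Return a callable that opens a new session with the next tower in turn."""
    rotation = cycle(towers)
    return lambda: SessionQueue(next(rotation), max_updates)


class Dispatcher:
    def __init__(self, negotiate):
        self.negotiate = negotiate
        self.queues = deque()

    def open_session(self) -> None:
        self.queues.append(self.negotiate())

    def backup(self, task) -> str:
        """Schedule one revoked state and return the tower it was sent to."""
        # Drop exhausted sessions, negotiating a new one if none are left.
        self.queues = deque(q for q in self.queues if q.remaining() > 0)
        if not self.queues:
            self.open_session()
        queue = self.queues[0]
        self.queues.rotate(-1)  # the next task goes to the next open session
        queue.send(task)
        return queue.tower
```

With no sessions opened in advance, the Dispatcher fills one session before negotiating the next. Calling open_session twice before the first backup makes it alternate between the two towers instead, so neither tower sees every update.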
Future Improvements
I think we’ll go into some future improvements for the watchtower protocol. This is an abbreviated list, there are many more. Some of the ones that I find pretty interesting. One of them is sweeping HTLC outputs. Sweeping HTLC outputs is non trivial. I could probably do a whole talk on that alone, for many many reasons which I won’t get into here today but just know that they are very hard. There are some things we can do to do them efficiently but the tradeoffs are less clear, it would probably be a discussion for the wider… Another thing that we haven’t implemented yet but will probably have to be implemented fairly soon is the concept of a Session Payment. When I request to the tower that I want to create a session, it will either accept or reject. If it accepts it will first require a small payment maybe like 100 sats, a good DOS prevention thing. I will pay 100 sats and if the payment goes through I’ll get my new session and I’ll be able to continue updating. You might be wondering how do I make a payment if I can’t backup the channel yet? Interestingly enough if you’re able to make a channel that is worth backing up aka single funded by me, you really can’t be breached because the remote party doesn’t have any funds in it. There is this bootstrapping phase where I can use my initial balance that isn’t at risk to pay for the session at which point I now have channel insurance and I can continue to update it as the balance shifts. Another one is incentivized garbage collection. With a channel based approach, where you have one session per channel, the session can be cleaned up by the client whenever that particular channel closes. With the session based approach it is a little more complicated because you don’t want to close the session or remove the session from the tower until every channel that has ever been used has closed out onchain. That is good and bad. You get the privacy but also you might have to hold things around longer than you need to. 
That’s not a huge concern of mine. What is a bigger concern is making sure people are incentivized to clean up state when it becomes time. One way you can do that is when my session is exhausted or maybe when it is created I get a token from the tower, specifically like a blinded token, that I can broadcast at a later time and basically say “clean this up”. In doing so it will give me a deduction or a discount on opening another session. If I can prove that I deleted it I can get this discount on future sessions. Another one which is not implemented today but would not be too hard to implement is this ShaChain-based session attestation. Right now when I send updates to the tower I use a sequence number. The sequence number just says “Hey. I want to put something in slot 1.” The tower will say “Ok you’re good. Here’s an ACK for that, continue on.” Let’s say I have some data loss and then I come back online and I’ve only used 10 but the tower says I’ve used 500. That would be an issue because I’ve actually paid for this. The tower is unwilling to give me the slots between 10 and 500, it is now saying that I only have 500 to 1000 now. That could be an issue, I don’t foresee it being an issue just because how the trust relationship between the client and the tower will be but it is possible. One way to prevent this from a cryptographic level is to use ShaChain which is the same protocol that is used for revocation within the Lightning protocol. What this allows you to do is every time I want to backup a state, each state number has this secret I can send. If I send that to the tower and I lose data and connect later they will be able to prove that I’ve sent up to a certain number but they won’t be able to prove any further than that. That can be useful. They can lie and go backwards but at that point you’re giving me more space than I asked for. This could prevent the tower cheating people out of slots that they’d paid for. 
Blinded renewals, sort of related to the garbage collection. If I want to renew and I already have a session, if I had a blinded token that would do that, that would allow me to rollover subscriptions in a way that the tower can’t correlate the two but still get the discounts. This is really interesting when you have this concept of whitelisting. Let’s say I run a tower and it is public facing, it has an open port to the internet but I don’t want anyone to be able to store their data there. What you can do is you can have whitelists where I say this is the first key that I’m going to use with the tower and I give the tower that key. You’re allowed to use the session as much as you want. When that session is exhausted then I use this blinded renewal technique to get a new session or pre-authenticate the next key I want to use. Now clients are able to renew with the tower and continue to make new sessions on the premise that they had some initial whitelist that got them in. I think that’s really interesting because you can use that to create access without opening your tower up to the world. It also allows you to have some level of privacy because if the sessions are relatively small, after a number of rotations and renewals the correlation is less clear. I think that is a really interesting one and is useful in the context of private towers run by individuals or even companies. Let’s say you have an app and you want to backup to a tower. If the app whitelists you first and after that point it needs to have authentication or coordination to be able to have you renew in a way that is secure from the tower’s perspective and yours. I think it is really useful and something we should definitely continue research on. Another one is batch windowing. At the moment, lnd will send one state update, wait for an ACK, send another one, wait for an ACK. You can put those in a stream. I connect once and I send you ten but you have this explicit send ACK, send ACK, send ACK. 
Batch windowing will allow you to send ten at a time and then receive ten ACKs. Most of the logic is there to handle this, it is just not implemented. The protocol, in terms of the parameters on the server and the client, are both there and fully supported. Let’s say I have a really old channel that has a million state updates and I want to back it up really fast to a tower, I can use this to get more performance out of that. When moving to a situation where you didn’t have a tower before and now you need to do historical backup of all your channel states, this will allow that to be a little faster.
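As a rough illustration of why windowing helps with a large historical backup, the sketch below groups updates into windows and counts the send/ACK round trips. The helper names `windows` and `round_trips` are hypothetical and do not correspond to lnd code or to the wire protocol. It only models the number of times the client has to stop and wait.

```python
from typing import Iterator, List, Sequence


def windows(updates: Sequence[bytes], window_size: int) -> Iterator[List[bytes]]:
    """Yield consecutive groups of at most window_size updates."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    for start in range(0, len(updates), window_size):
        yield list(updates[start:start + window_size])


def round_trips(num_updates: int, window_size: int) -> int:
    """Number of times the client waits for ACKs (ceiling division)."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    return -(-num_updates // window_size)
```

Under this simple model, a channel with a million state updates needs a million waits with a window of one and a hundred thousand with a window of ten.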
Want to Learn More?
If you’re interested in learning a little more the source code is available. A lot of this tower design is merged into lnd. There are a couple of open PRs and some that aren’t up yet. They will be up before the final stuff is done. Join our community Slack. We’ll probably be discussing this on the mailing list as we move towards formalizing this into a BOLT. You can find me on Twitter, I’m @bitconner and @lightning is Lightning Labs.
Q & A
Q - Will the slides be available after the talk?
A - Absolutely. I will share them and make them public.
Q - Will it be a simple process to use several of your own trusted devices like a cellphone, desktop, tablet, friend’s node or whatever as watchtowers for more private communications? Or is this being thought of as an important part of the design? I know that it would be my first assumption for using a watchtower, sending sessions to my smartphone. Is that dangerous if you’re not running a full node or is Neutrino enough?
A - I think the bulk of the question is can you share sessions between all my devices? In theory you can. It is probably easier just to make separate ones and use different keys so any state doesn’t get mangled when you’re using different devices. In theory you could. I hope that answers the question that is being asked. There was also the question of is Neutrino enough? I assume that means on the tower side, listening to the Bitcoin P2P network and fetching blocks and scanning. Yes, Neutrino is enough. All you need to do is fetch blocks. With Neutrino you can do that. Neutrino listens to a new header and every time that happens it will fetch the block from the P2P network. That is enough to implement a tower and do the scanning.
Q - Here is a sillier question. Justin Moon wants to know how much do you deadlift?
A - Maybe 360.
Q - Will it be easy to observe which channels are using watchtower services onchain? Will there be any way to mitigate this via sweeping addresses being used with some of your 2PECDSA work?
A - That’s a good question. I think the biggest hint that you’re using a tower would be traffic analysis probably. You’d be able to see state updates going to a tower. At the end of the day they are these fixed size blobs, they do give themselves away. They don’t leak how many outputs the commitment transaction has, stuff like that. I think the second question is about privacy related to 2PECDSA. In theory you can use 2PECDSA there. The biggest win there would be space efficiency. You’re already giving up that this is a channel. Once a state is broadcast unilaterally you’re exposing to the world that this is a channel, just from the scripts being used. I don’t know if it would be a huge privacy gain there. You would definitely gain something. There are places where you can use that to gain space efficiency.
Q - Can we choose our watchtowers and if so how?
A - Yes you’ll be able to choose them. The current way it is implemented in lnd is that you give it a pubkey@address and it will try to communicate with that tower. It shouldn’t be too much of a stretch. Like I said a lot of this logic is generalized in a way so that it is easy to extend it to offer a lot of features. At the moment it only supports backing up to one tower. One of the things we’re hoping to get in before launching is support for multiple towers. I’ll be able to say “Use one of these three” and it will rotate between them. A sidenote to that is “I want to use these three and back up to all of them, not just one.” What you’re saying there is I have a particular state and I want to back it up to three towers and make sure that happens. You can get even more creative. You can say “I want to use these ten towers and I want to make sure my state is backed up to at least three of them.” The watchtower game is 1 of n security. Only one has to be there to do the job. Being able to have redundant backups of these states is important. For the privacy you can rotate. That is coming.
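The “ten towers, at least three backups” policy could look something like the sketch below. Everything here is an assumption for illustration: `Tower` is a hypothetical callable that returns True when the tower ACKs a state, and `back_up` is not an lnd function. Multi-tower support was described as still in progress at the time of the talk.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a tower connection: returns True on ACK.
Tower = Callable[[bytes], bool]


def back_up(state: bytes, towers: Dict[str, Tower], min_acks: int) -> List[str]:
    """Send state to towers in order until min_acks of them have ACKed.

    Returns the names of the towers that accepted the state, and raises
    if too few of the configured towers could be reached.
    """
    acked: List[str] = []
    for name, send in towers.items():
        if len(acked) >= min_acks:
            break
        try:
            if send(state):
                acked.append(name)
        except ConnectionError:
            # An unreachable tower is skipped; another one may still ACK.
            continue
    if len(acked) < min_acks:
        raise RuntimeError(f"only {len(acked)} of {min_acks} required backups succeeded")
    return acked
```

Because the security model is 1 of n, any single honest, online tower among the ones that ACKed is enough to respond to a breach; the threshold only controls how much redundancy the client insists on.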
Q - Do you foresee watchtowers being something that local communities set up to protect each other or larger centralized services?
A - I can see both really. I think there is a future where you have both. I think the reason any person would have one is due to the tradeoffs they are willing to make. If you really want more privacy and you think your small group is going to provide that, I’m sure that will happen. I will probably have one that I offer to my friends if you’re in the club! At the same time you could also see a company offering this as a service. One of the main things you need to consider when choosing a tower is that they need to be very reliable. Do you trust your small group to be more reliable than a company that has many AWS servers running for example? I don’t think any one person can answer that. It depends on your technical competence and things like that. There is a tradeoff there too. You can also use both if you want. The bigger thing for us and a lot of the thinking that went into the design and the protocol itself, let’s say there’s a worst case scenario where everyone uses one centralized service. How much privacy could we offer from this protocol in that case and start from there. Basically assume they have perfect knowledge, how much can we strip away in terms of what they can learn? If some company does decide to run one, our job in designing this protocol is that they can learn as little about their clients as possible. There are a lot of shortcuts you could take to get a watchtower service up that sacrifice all those things. In my opinion it would be a shame if that was then the protocol that was relied upon as a centralized service. If you have a protocol that gets adopted because it is simpler to implement but takes away all the privacy from its users, I think that would be a really sad thing to see. We really tried to make sure that that was going to be mitigated to the greatest extent possible, knowing people are going to make these tradeoffs.
Q - What does it take to run a watchtower? Will anybody be able to do it? You already mentioned high reliability and sufficiently competent with the code?
A - It should be pretty easy to run one. The way that it is implemented right now is that there is a standalone watchtower that will initially be run as a side companion to lnd. It will be in the same daemon and everything but it will have its own listening port and its own object within there. It should be fairly straightforward to separate that out into its own binary. Then I’ll be able to have a separate binary that is called watchtower and listens on a port and has access to the blockchain. Those can be totally isolated too. If you’re running a tower and you already have a connection to the RPC service or your Neutrino node running, then you can tack it on to the side. If you want to separate them somehow you can run it in this isolated process. It should be pretty minimal. For example, you can use Neutrino as a backend to this. That will allow you to access the chain with minimal state. All you’re really using it for is header syncing and fetching blocks. You should be validating in theory so if you want to run a heavier weight one and be fully consistent you can do that. You probably do want to do that because you want to make sure the transactions you’re seeing are on the valid chain. If you had a Neutrino pointed at a full node that you also operate or someone else operates you can get security. Then your main storage cost is going to be all the state updates. If you have 330 bytes per state update, I can’t remember the math, it is something like 30K per kilobyte? Your main constraint is going to be space. At some point you may need to shard that between multiple instances which ends up being pretty easy. I don’t know if you guys are familiar with MapReduce and the way you shard keys and have the shards reconstruct them. If you have all the keys that clients input using the hints, those can be sharded out over any number of lookout services. Those can independently coordinate the chain and do all the matching, stuff like that.
I don’t foresee that being too much of an issue. It all depends on how much space you’re willing to commit.
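For a rough sense of scale, the sketch below works out the storage arithmetic from the 330 bytes per state update quoted above, and shows one way hints could be assigned to shards. At that size a tower fits about 3 updates per kilobyte, or roughly 3 million per gigabyte (using decimal units and ignoring any database overhead). The function names and the hash-modulo sharding rule are assumptions for illustration, not how lnd partitions its data.

```python
import hashlib

BYTES_PER_UPDATE = 330  # per-update size quoted in the talk


def updates_per(capacity_bytes: int) -> int:
    """How many state updates fit in capacity_bytes, ignoring overhead."""
    return capacity_bytes // BYTES_PER_UPDATE


def shard_for_hint(hint: bytes, num_shards: int) -> int:
    """Map a client-supplied hint to one of num_shards lookout instances.

    Hashing first spreads hints evenly even if they are not uniform.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(hint).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for label, size in [("1 KB", 10**3), ("1 MB", 10**6), ("1 GB", 10**9)]:
        print(label, updates_per(size))
```

Each shard would only need to compare the hints it holds against transactions in new blocks, so instances can scan the chain independently of one another.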
Q - Do you foresee third parties building software wrappers around the BreachArbiter and is Lightning Labs planning to provide a watchtower service?
A - You wouldn’t really want to wrap the BreachArbiter itself to do a watchtower for example because the BreachArbiter assumes you have the private keys. That’s one distinction from the watchtower itself. When we started, our initial plan was to take out the BreachArbiter and use that. We didn’t get very far before we realized it doesn’t have signing keys so that’s going to make it a little more complicated. If your tower has the private keys then yeah you should do that. If it doesn’t then you’re going to need to use something different. The other question is whether Lightning Labs is going to offer a watchtower service. Yeah for the foreseeable future we’re going to offer a service so that users can connect to us. One of the things that we find really important is the ability to specify your own watchtower and use one that is not us if you choose to. We don’t want to force anybody to but if they want to we’ll be happy to serve their state updates.
Q - You mentioned how there can be multiple watchtowers. Who gets the reward if you use multiple watchtowers? Is it a winner takes all in a race scenario?
A - That’s correct. That question is really dependent on how you set up the sessions. If all of them are reward type and all of them use the same fee then in theory it is a tiebreaker. Whoever gets into the mempool first, into a miner first. If they use different fee rates then it is possible that the one with the highest fee rate is the one that is going to win out. It really depends on whether the watchtower is down or they lost your state or they weren’t being a good watchtower and they’re missing out. You never know but in general it will be the one with the highest fee rate.
Q - Does this process become any more complicated with multiparty channels?
A - Yes in some sense. Most of that is derived from the fact that multiparty channels’ verification is very hard. I think the better multiparty channel approaches we’ve seen have come from eltoo where the combinatorial blowup of revocation and penalties goes away. In that sense maybe not because each of the parties in theory could back up their state to towers on their own and fairly efficiently. I think it is a good point to discuss. How will eltoo watchtowers look for example? In theory you could make an eltoo watchtower that is constant space. Basically it backs up the latest state. It does sacrifice some amount of privacy in the sense that if I give a state to the tower and then I want to make sure it is constant space, I have to tell it which one I’m replacing. There could be some obfuscation there, having multiple copies or something. In general it induces this linear history of the state updates that I’ve sent it. You can correlate that this channel made that state update. If you were a little more privacy conscious, for example if I was going to back up to my own tower directly using that system it will be totally fine. I’d have no issue with that. If I’m going to trust a centralized service to maintain my state updates I don’t know if I want to give them that information and that history of the channel. In theory you can restore some privacy by using more or less the same protocol we use for revocation based channels which is the protocol we discussed today. You send an update for every revoked state and the tower stores them all. It can’t really deduce that this is the latest one, these belong to these channels. In that sense you can restore some privacy. The current protocol can be tweaked to sweep an eltoo versus a revoked transaction and more or less works identically.
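The space versus privacy tradeoff described here can be shown with a toy model. Both classes below are hypothetical illustrations, not eltoo or lnd code. The first keeps only the latest blob per channel, which requires a stable tag the tower can link across updates. The second stores every update under a fresh, unlinkable hint, as in the revocation-based protocol from the talk.

```python
import os
from typing import Dict, Tuple


class ReplacingTower:
    """Constant space per channel: each update overwrites the previous one.

    The stable channel_tag is what lets the tower correlate updates.
    """

    def __init__(self) -> None:
        self.latest: Dict[bytes, bytes] = {}

    def store(self, channel_tag: bytes, blob: bytes) -> None:
        self.latest[channel_tag] = blob


class AppendOnlyTower:
    """Linear space: every update is kept under its own random-looking hint."""

    def __init__(self) -> None:
        self.blobs: Dict[bytes, bytes] = {}

    def store(self, hint: bytes, blob: bytes) -> None:
        self.blobs[hint] = blob


def simulate(num_updates: int) -> Tuple[int, int]:
    """Return (entries in the replacing tower, entries in the append-only tower)."""
    replacing, append_only = ReplacingTower(), AppendOnlyTower()
    channel_tag = os.urandom(16)  # same tag reused for every update
    for n in range(num_updates):
        blob = f"state {n}".encode()
        replacing.store(channel_tag, blob)
        append_only.store(os.urandom(16), blob)  # fresh hint per update
    return len(replacing.latest), len(append_only.blobs)
```

After 100 updates the replacing tower holds one entry and the append-only tower holds 100, which is the storage cost paid for not revealing which updates belong to the same channel.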
Q - Will the functions of the watchtowers always be necessary in future protocol improvements? Will there be a future where watchtowers aren’t needed?
A - That’s a very good question and I can’t say for sure that we will or we won’t. I think it is pretty likely that we will need something like that. The primary reason is that when you have offchain transactions you do have some sort of history starting from the initial balance to state n and the balance fluctuates. In order to get this signed transaction offchain it needs to be a valid Bitcoin transaction that could be broadcast. The chain has no idea what state you’ve made it to. It can’t on its own know if it is a breach or not. That requires some action by the user to correct that on the chain. That is true of the current Lightning design, it is true of eltoo channels. There is some action that needs to be taken. Maybe not necessarily by any particular party but to correct a state reversion there is usually a follow up reconciliation transaction. For the foreseeable future that will probably be true. Maybe there is some crazy zero knowledge SNARK protocol that will come out that will obfuscate that. I think for the foreseeable future, yes there will probably be towers or something like it.