25-second irreversible confirmations for instant payments

Recently there has been a race towards lower blockferrari_california_new intervals for PoW block-chain based  cryptocurrencies. First there was Bitcoin with a 10 minute interval, then was LiteCoin using a 2.5 interval, then was DogeCoin with 1 minute, and then QuarkCoin with just 30 seconds.

This reminds me a joke in a comedy movie where there were two guys in a car, and one says that he had a fantastic idea for a top-selling product: a video called “Get fit by exercising 10 minutes a day”. The other replies that there is already a video teaching how to get fit by exercising 15 minutes a day. The first explains that that’s why his own video will be a top-seller: you get the same result in less exercise time. After thinking about it a minute, the second claims he has a tremendous idea, a video teaching to get fit by exercising 5 minutes a day. But the first one then shouts terribly outraged: that’s impossible! [1] This seems similar with what is happening with confirmation intervals: every new cryptocurrency lowers it a little bit, but no one actually knows what are the implications of doing so. To understand how the block interval impacts the stability and capability of the cryptocurrency network, several factors must be taken into account. First of all, the most important factor that affects the viability of short confirmation intervals is the number of stale blocks generated. Two factors mainly affect the stale block rate: the propagation protocol and the block propagation time from the top miners to the top miners.  We’ll analyze in the next section these factors and some measures that can be taken to optimize them.

Factors affecting the stale block rate

The block propagation protocol: Bitcoin forwards a block by packing the block header with all the transactions contained in the block. This strategy, while being the most easy to analyze, is known to perform badly, both regarding block propagation latency and bandwidth usage. Since the transactions on a block are generally already known to the network, there is no benefit in transmitting them again. It’s possible that a peer reconstructs a block by being sent only the header and the list of transactions hashes. The peer then decides which transactions missing in the transaction pool need to be fetched. But since each node stores the hashes of the transactions advertised by its peers, an even better strategy is that a miner also sends immediately the transactions included in the block that he knows are missing in the peer’s pool. This eliminates completely the need of a second interaction to request additional transactions. Still another optimization is that miners only include transactions that have been received before a few of seconds ago. This assures with high probability that transactions will have already been received by peers before the block is mined. Still another improvement is to forward the block header before even checking the transactions, and only checking the block PoW and height at forward time. This allows the header to spread over the network in less than a second. Nodes can then start mining an empty block on top of the header even if the transactions are still missing, during a fixed interval. After that interval, they resume mining in whatever block they were mining before. These empty blocks reduce the effective bandwidth and block-chain storage usage. But this is generally unimportant since empty block are, by definition, small.  But if the block interval is reduced beyond 1 second, then the time a transaction is included in a block begins to increase, since most of the blocks produced will be empty. Some parts of this protocol, and how to modify Bitcoin to support them, was proposed here.

The type of nodes in the network: Because what matters most is inter-miner connection latency, having other kind of nodes in the network (such as users,  or activity monitors) only increase the number of stale blocks. A cryptocurrency network could be optimized by forcing that all participants to be miners, or prioritizing miners over other nodes. This can be done, as in mining pools, by incetivizing nodes to broadcast quasi-solved block headers (blocks with a PoW that is lower than the target difficulty, but still are difficult to find). These quasi-blocks can be used by nodes to increase the priority of peers that send the highest number of quasi-solved blocks.

The block reward scheme: If the block reward decreases over time until zero (as in Bitcoin), then the miner’s revenue will, on the long run, depend solely on transaction fees. This means that creating a block with no transactions is useless for the miner. But as previously stated, maximally reducing the block interval comes with a price: empty blocks may need to be created for headers with still unknown transactions. That’s why a cryptocurrency which maintains a reward forever on blocks can achieve a lower stale block rate.

The incentivized topology of the miners network: a highly concentrated network, with big mining pools, tend to generate much less state blocks than a complete distributed mining topology. In this regard cryptocoins based on SHA-256 PoW have an advantage over non ASIC-friendly PoW based crytopcoins, since mining requires an investment in ASIC boards.

The real topology of the network: Bitcoin design assumes the network is similar to a random graph, having a certain average out-degree and in-degree. While this is far from true in reality,  network nodes take local decisions to avoid forming geographical clusters (at least for the out-bound connections). This is not the best topology to help block propagation. The best topology for block propagation is one that serves the top miners better, by encouraging direct connections between them or by routing blocks faster between them. Also a direct miner-to-miner backbone can help to decrease notably the number of stale blocks. This has been proposed for Bitcoin to increase resilience from attacks.

The PoW function verification time: SHA-256 is  very fast to evaluate and so the Bitcoin PoW verification time is negligible. An scrypt PoW, on the contrary, may take from 3 to 30 milliseconds to evaluate depending on the parameters chosen (GPU or ASIC prevention). To protect the network from spamming and DoS attacks, each node needs to verify the block PoW before forwarding the block again, so the verification delay gets multiplied by the number of hops in the block path between miners. It’s nevertheless possible to forward the block without checking the block PoW if the block (or the message containing the block) carries a small fast-to-verify PoW, which is verified by nodes and “pays” for the forwarding. Then, a scrypt-based coin can also benefit from the fastest possible forwarding protocols.

Client networking stack: Once a node receives a block, the best it can do to reduce the appearance of stale blocks is forward it as soon as possible. This means that all other node activity should be paused or stopped. Some operations can be designed to allow immediate cancellation and accept retrials. To allow immediate forwarding, the client networking stack must not block the client in transaction verification procedures or other housekeeping activities, such as chain reorganizations. This can be best achieved by a multi-threading client, and dynamically setting thread priorities to boost the thread that has received the block. Even a better performance could be achieved by using a kernel level device driver that takes care of block header packet forwarding even before the application receives the notification.

The block overhead: Block headers in most cryptocurrencies are small (80 bytes) so the header size (over the block size) does not pose a significant overhead. But the block must also contain the generation transaction, which may account for other 250 bytes. Anyway, the block overhead generally does not pose a problem. Nevertheless, when block interval approaches  sub-second times, it reduces the channel bandwidth considerably.

Since some factors affect other factors, it’s difficult to construct a model that allows to determine the transaction confirmation time with a closed-formula. Instead, I attempted to build a network simulator to play with different factors and algorithms.

Simulating Block Propagation

There are two strategies for simulation block propagation using a discrete event simulation. A first strategy is to simulate the whole network, node by node, assuming certain graph topology. This is challenging, because the topology of the network is unknown.  Many nodes do not accept incoming connections so they cannot be probed and are invisible to network crawlers. Some sites like bitnodes  attempt to measure the network, but their results are only an approximation. Also, simulating 60K nodes for several days can be very CPU consuming. A second strategy is to simulate the interaction between two top-miners, each one in another end of the graph (the hop distance between them being near the average distance between nodes). This is not the worse case, where miners are located a hop-distance equal to the diameter of the graph, but since its the best interest for top-miners to be well-connected, we can assume they perform not worse than the average. This turns to be very easy to simulate, the only simulated events being the creation of a block in one of the two locations and the propagation of the block to the opposite location. Each location stores a best chain and a competing chain. The following results show the simulation of Bitcoin, LiteCoin, and FastCoin5. FastCoin5 simulation represents the use of the best parameters to reduce stale blocks for a 5 block interval.

The chosen optimizations for FastCoin5 are:

– Block push heuristic, SHA.256 based PoW.

– Transmission of blocks with transaction hash list, but not full transactions. Pushing missing transaction in a block to peers.

– Mining of empty blocks while parent block transactions are still being fetched (with a maximum wait of 10 seconds)

– Node route prioritizing. Nodes stop whatever they are doing when they receive a block to forward it as quickly as possible.

– Multi-threaded, non-locking client networking stack.

Simulation Results

------------------------------------------------------------------
Bitcoin
------------------------------------------------------------------
Normal propagation
Number of Events: 200001
Number of blocks: 100000
Interval: 600.00
Switches: 2101
Stale Blocks: 2152 ( 2.15%)
Stale Blocks/day: 3.10
Empty Blocks: 4422
Switches/day: 3.03

2: Number of switches/day of length 2: 2.95( 2.05%)
3: Number of switches/day of length 3: 0.07( 0.05%)
4: Number of switches/day of length 4: 0.00( 0.00%)

2: blocks lost in switches of length 2 a day: 2.95( 2.05%, avg conf time: 1500.0)
3: blocks lost in switches of length 3 a day: 0.14( 0.09%, avg conf time: 2100.0)
4: blocks lost in switches of length 4 a day: 0.01( 0.01%, avg conf time: 2700.0)

Empty Blocks/day: 6.37 of 144.00( 4.4%)
Channel usage [because of empty blocks]: 95.58%
Additional delay because of empty blocks: 27.76 secs
Transactions Per Block: 1200.00
Channel usage (header/tx list overhead): 99.98%
Block Size: 515.63 Kbytes
Time To Propagate Block: 27.00
Sim Time : 694.44 days

------------------------------------------------------------------
Litecoin
------------------------------------------------------------------
Normal propagation
Number of Events: 200001
Number of blocks: 100000
Interval: 150.00
Switches: 2986
Stale Blocks: 3098 ( 3.10%)
Stale Blocks/day: 17.84
Empty Blocks: 6217
Switches/day: 17.20

2: Number of switches/day of length 2: 16.58( 2.88%)
3: Number of switches/day of length 3: 0.59( 0.10%)
4: Number of switches/day of length 4: 0.03( 0.01%)

2: blocks lost in switches of length 2 a day: 16.58( 2.88%, avg conf time: 375.0)
3: blocks lost in switches of length 3 a day: 1.18( 0.20%, avg conf time: 525.0)
4: blocks lost in switches of length 4 a day: 0.09( 0.02%, avg conf time: 675.0)

Empty Blocks/day: 35.81 of 576.00( 6.2%)
Channel usage [because of empty blocks]: 93.78%
Additional delay because of empty blocks: 9.94 secs
Transactions Per Block: 300.00
Channel usage (header/tx list overhead): 99.94%
Block Size: 128.91 Kbytes
Time To Propagate Block: 9.85
Sim Time : 173.61 days

------------------------------------------------------------------
FastCoin5
------------------------------------------------------------------

Number of Events: 200001
Number of blocks: 100000
Interval:  5.00
Switches: 5679
Stale Blocks: 6058  ( 6.06%)
Stale Blocks/day: 1046.82
Empty Blocks: 37581
Switches/day: 981.33

2: Number of switches/day of length 2: 918.09( 5.31%)
3: Number of switches/day of length 3: 61.17( 0.35%)
4: Number of switches/day of length 4:  1.90( 0.01%)
5: Number of switches/day of length 5:  0.17( 0.00%)

2: blocks lost in switches of length 2 a day: 918.09( 5.31%, avg conf time:  12.5)
3: blocks lost in switches of length 3 a day: 122.34( 0.71%, avg conf time:  17.5)
4: blocks lost in switches of length 4 a day:  5.70( 0.03%, avg conf time:  22.5)
5: blocks lost in switches of length 5 a day:  0.69( 0.00%, avg conf time:  27.5)

Empty Blocks/day: 6494.00 of 17280.00(37.6%)
Channel usage [because of empty blocks]: 62.42%
Additional delay because of empty blocks:  3.01 secs
Transactions Per Block: 10.00
Channel usage (header/tx list overhead): 98.21%
Block Size: 4.30 Kbytes
Time To Propagate Block:  2.34
Sim Time :  5.79 days

The fields “avg conf time” estimate the confirmation time in seconds for the reversal probability shown just before the field, as a percentage (total blocks lost multiplied by 100 over total blocks created). The average time to confirm a block with a 0.1% probability of being reverted because of a chain reorganization is, for each system: 25 minutes for Bitcoin, 11.25 minutes for LiteCoin and 25 seconds for FastCoin5. The only drawback of FastCoin5 seems to be that the channel usage due to empty blocks is only 62.4%, but since transactions are transmitted only once instead of twice as in Bitcoin, FastCoin5 effectively achieves a higher channel utilization than Bitcoin (currently below 50%).

It must be noted that the client application need not to internally switch the best chain before 3 confirmations have been received for any candidate chain, since the user should not be notified if  a transaction has less than 3 confirmations. This eliminates all computing bottle-neck in normal clients. Miners, on the contrary, must verify the transactions of each block to decide which chain to mine on top of.

It’s interesting that a merchant may target a different number of confirmation blocks for services or products that are differently priced. For example, if the merchant is willing to accept to loose 1 USD/day and prefers smaller confirmation times, and sells 100 products a day which are priced under 1 USD each, he can choose to accept only 3 confirmation blocks for these products, to achieve a 1% reversal probability, which leads to a 20 second confirmation time. It would be interesting to implement a client that can show the confirmation status depending on the transaction value, and a pre-defined minimal expected loss.

It must be taken into account that reversal probabilities shown are based on the fact that miners will forget to include the transactions taken from the stale blocks in the newly created blocks, and that competing blocks transaction sets do not intersect. This is a simulation defect that increases the confirmation times. This is not the case in the Satoshi client, where all transactions that belong to the stale blocks are re-inserted into the transaction pool, and competing blocks share most transactions. A transaction with enough fees may have actually a reversal probability much higher than the value presented in this post, and the FastConf5 confirmation time could actually be much lower.

Conclusions

In this post we’ve analyzed the factors that influence the stale block ratio and the expected confirmation time for a certain confirmation-reversal probability.  A simulation was carried to test for different protocols and hypothesis. After tunning it, we’ve proposed a crupto-ocurrency design where the block interval is 5 seconds and transaction confirmation (for a reversal probability of 0.1%) is below 25 seconds.

This design, along with several other improvements, is being tested in the NimbleCoin cryptocurrency.

[1] Since I did’t remebered which film was it, I reconstructed the dialog pretty badly. Happyly a reader sent me the film name and a link to the film scene in youtube. The film is no other than the famous “Theres Something About Mary”. Here it is the hilarious scene: http://www.youtube.com/watch?v=byEkJ3zRTcY&feature=youtu.be

, , , , , ,

  1. #1 by x on February 17, 2014 - 8:03 pm

    something about mary – 7 minute abs

    • #2 by SDLerner on February 17, 2014 - 8:16 pm

      Thanks! I tried hard to remember….

  2. #3 by I could say I'm Peter Todd here... on August 10, 2014 - 7:09 pm

    Something to keep in mind is that the network distributing blocks != miners distributing blocks. Large miners and mining pools can and do peer to each other directly, so propagation delays on the global P2P bitcoin network don’t affect them the same way as smaller miners who are not in a position to do that. When the block “interval” is getting down to just 12 seconds, even stuff like physically locating your mining pool hashing power closer to other concentrations of hashing power may net you more revenue, with obvious problems for decentralization. You might have a small mining setup, and want to contribute that decentralized hashing power, but find you can’t profitably because all the big mining pools already have peering agreements with each other and dedicated high-speed connections.

    • #4 by SDLerner on August 11, 2014 - 1:33 pm

      In this FASTCOIN5/NimbleCoin proposal I’m doing header-first propagation. This means that a very small message is propagated (pushed) without round-trip times. This message floods the network very quickly (this header is between 80 and 800 bytes, depending if merged-mining is done or not). The 12-seconds you mention are true for Bitcoin, but when you use header-first propagation, then the limit much lower (probably less than 5 seconds). Also NimbleCoin miners can mine on top of incomplete blocks (where all transactions have not been received yet) for upto 5 seconds.

Leave a comment