Scalability

14 messages · BitcoinTalk · Satoshi Nakamoto, jib, TopSoil, knightmb, Insti, Gavin Andresen, spaceshaker, DataWraith, BitLex, InterArmaEnimSil · June 18, 2010 to July 14, 2010

Satoshi Nakamoto June 18, 2010

The current system where every user is a network node is not the intended configuration for large scale. That would be like every Usenet user running their own NNTP server. The design supports letting users just be users. The more burden it becomes to run a node, the fewer nodes there will be. Those few nodes will be big server farms. The rest will be client nodes that only do transactions and don’t generate.

A block header with no transactions would be about 80 bytes. If we suppose blocks are generated every 10 minutes, 80 bytes * 6 * 24 * 365 = 4.2MB per year. With computer systems typically selling with 2GB of RAM as of 2008, and Moore’s law predicting current growth of 1.2GB per year, storage should not be a problem even if the block headers must be kept in memory.
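The header-storage arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation quoted in the post; the constant and function names are made up for illustration.

```python
# Header-only storage growth, using the figures quoted in the post above.
HEADER_BYTES = 80      # approximate size of a block header with no transactions
BLOCKS_PER_HOUR = 6    # one block every 10 minutes


def header_bytes_per_year(header_bytes: int = HEADER_BYTES,
                          blocks_per_hour: int = BLOCKS_PER_HOUR) -> int:
    """Bytes of block headers produced in a 365-day year."""
    return header_bytes * blocks_per_hour * 24 * 365


if __name__ == "__main__":
    total = header_bytes_per_year()
    print(f"{total} bytes/year, about {total / 1e6:.1f} MB")
```

The result is 4,204,800 bytes, which matches the 4.2MB per year quoted above.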

jib July 12, 2010

Am I correct in understanding that every node receives information about every transaction (as the technical paper says)? Doesn’t that make bitcoin completely impractical for use as a currency on a large scale?

TopSoil July 13, 2010

Well, the economic discussion probably belongs in a different thread, so I’ll leave it at that. Oh, I went to make a new thread but this guy has made my point better: topic 57

knightmb: the number of coins per block is set to go down as time goes on so the number ends up being quite a bit larger than that. Also you have to include the transactions. There have been way less transactions during these first 67k blocks than there will be if millions of people start using the system.

Can someone that works on the code please answer this question of how scaling will be dealt with? The most important thing right now seems to be making a good case for why this system will work to get sellers to adopt it. The pdf is too light on details.

knightmb July 13, 2010

Quote from: TopSoil on July 13, 2010, 09:37:33 PM

Well, the economic discussion probably belongs in a different thread, so I’ll leave it at that. Oh, I went to make a new thread but this guy has made my point better: topic 57

knightmb: the number of coins per block is set to go down as time goes on so the number ends up being quite a bit larger than that. Also you have to include the transactions. There have been way less transactions during these first 67k blocks than there will be if millions of people start using the system.

Can someone that works on the code please answer this question of how scaling will be dealt with? The most important thing right now seems to be making a good case for why this system will work to get sellers to adopt it. The pdf is too light on details.

Thanks for pointing that out. I wasn’t sure how that would scale over time; I was just making a guess. Given that the formula for coin generation should be known somewhere, can’t someone just calculate how much disk space X amount of coins will take given XYZ transactions, etc.?

I’m curious myself as to how much space it will take.

Insti July 13, 2010

Quote from: knightmb on July 13, 2010, 10:08:58 PM

Given that the formula for coin generation should be known somewhere, can’t someone just calculate how much disk space X amount of coins will take given XYZ transactions, etc. I’m curious myself to how much space it will take.

From the pdf: “A block header with no transactions would be about 80 bytes.” and “Once the latest transaction in a coin is buried under enough blocks, the spent transactions before it can be discarded to save disk space.”

So: 80 * number of blocks + average transaction size * number of transactions.

Practically, from my disk: 77428 transactions in 66663 blocks is about 46,752,464 bytes, which works out to about 600 bytes per transaction (including block headers + database overheads).
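Both the formula and the measured figure can be reproduced in a short Python sketch. The function names are hypothetical; the numbers are the ones reported in the post. Note that the measured per-transaction figure is an all-in average (headers and database overhead included), not the size of a bare transaction.

```python
def chain_size_estimate(num_blocks: int, num_transactions: int,
                        avg_tx_bytes: float, header_bytes: int = 80) -> float:
    """Formula from the post: headers plus transaction data."""
    return header_bytes * num_blocks + avg_tx_bytes * num_transactions


def observed_bytes_per_transaction(total_bytes: int, num_transactions: int) -> float:
    """All-in disk bytes per transaction, overheads included."""
    return total_bytes / num_transactions


# Figures reported from the poster's disk.
BLOCKS = 66_663
TRANSACTIONS = 77_428
DISK_BYTES = 46_752_464

per_tx = observed_bytes_per_transaction(DISK_BYTES, TRANSACTIONS)

if __name__ == "__main__":
    print(f"about {per_tx:.0f} bytes per transaction")
```

This comes out to roughly 604 bytes per transaction, consistent with the "about 600" estimate.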

Gavin Andresen July 14, 2010

Quote from: Insti on July 13, 2010, 11:34:03 PM

77428 transactions in 66663 blocks is about 46,752,464 bytes, which works out to about 600 bytes per transaction (including block headers + database overheads).

That sounds about right.

So a million transactions a day would be 600 million bytes. 600 megabytes a day, 18 GB a month.

That’s not bad. Actual network bandwidth will be higher (the way the network is connected you get the same transaction multiple times from your peers). You won’t be running an always-connected-network node on your iPhone, but any low-cost server will give you twenty times that bandwidth per month. And 18GB isn’t much disk space in these days of terabyte hard drives.
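The storage extrapolation above is simple enough to write down as a sketch. The function name and the 30-day month are assumptions made for illustration; the 600 bytes per transaction and the million transactions a day come from the posts.

```python
BYTES_PER_TX = 600  # all-in estimate from the earlier post


def storage_bytes(tx_per_day: int, days: int, bytes_per_tx: int = BYTES_PER_TX) -> int:
    """Disk space consumed by tx_per_day transactions over the given number of days."""
    return tx_per_day * days * bytes_per_tx


daily = storage_bytes(1_000_000, days=1)     # 600 million bytes
monthly = storage_bytes(1_000_000, days=30)  # assumes a 30-day month

if __name__ == "__main__":
    print(f"{daily / 1e6:.0f} MB per day, {monthly / 1e9:.0f} GB per month")
```

This only covers storage. As the post notes, actual network bandwidth will be higher because the same transaction can arrive from several peers.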

A million transactions per day is a LOT! For comparison, in 2006 there were about 60 million credit card transactions per day in the US.

Eventually, if Bitcoin survives and gets as popular as credit cards for paying for stuff, I expect somebody will create a compatible version with a more efficient network structure (maybe by that time there will be some fancy IPv6 multicast protocol or something). And they’ll implement a couple of gateway nodes (running on really fast connections) that shuttle transaction and block traffic from the current Bitcoin network into the super-efficient network. And I expect most of us will be running lightweight clients that just keep our wallets, sign transactions, and send and receive transactions to the ultra-fast nodes that ARE looking at every transaction.

You know, kind of like how we have those Big Routers in the Sky that handle Internet backbone traffic (or the ultra-fast DNS root servers). The Internet didn’t start out with astoundingly fast routers zinging packets around.

spaceshaker July 14, 2010

Quote from: gavinandresen on July 14, 2010, 12:42:32 AM

And I expect most of us will be running lightweight clients that just keep our wallets, sign transactions, and send and receive transactions to the ultra-fast nodes that ARE looking at every transaction.

Is this possible? What would this look like? From a technical perspective what does a “lightweight client” look like for you? My understanding is that the Bitcoin client needs the entire block chain in order to establish trust.

I am just thinking out loud here…

Although the peer-to-peer model is certainly novel, it seems to me to be somewhat utopian. Bear with me for a minute here (I am not trying to troll). Consider banks. Banks have a system whereby they can work together efficiently. I can take money out of an ATM from bank X even though I bank with bank Y. Banks loan money to each other. They are generally cooperative. Instead of every Tom, Dick & Harry having a Bitcoin client on his/her PC (or smartphone) participating in an open P2P network, perhaps there is a collection of Bitcoin “banks” who provide the service of hosting and “peering” the Bitcoin block chain. These are large enough organizations that they can afford the bandwidth and hardware needed to maintain an infinitely long block chain with a million (or more) transactions a day. These banks would still be peer-2-peer, and hopefully also completely open. Ideally anybody could participate in the peer-2-peer network, it’s just that the average person won’t because of the barrier to entry. These banks still operate using the same fundamental technology we have today. All of the beautiful facets of Bitcoin are preserved, except that the number of active participants is somewhat reduced. Anybody that wanted to participate still could.

The problem remaining would be the typical “last mile” problem. How does Tom (or Dick or Harry) perform transactions? Well, the issue becomes much more straightforward at this point. Now the trust only has to be between two parties (the “bank” and Tom). This really becomes more of a proxy issue. Now Tom has to send a transaction request through his “bank”. It might even be possible to bake into Bitcoin a protocol for proxy transactions.

Anyway…This is just my 2 cents. I would really like a tangible answer to this problem because it seems foundational to the success of this endeavor to me.

Gavin Andresen July 14, 2010

Quote from: spaceshaker on July 14, 2010, 01:52:00 AM

Quote from: gavinandresen on July 14, 2010, 12:42:32 AM

And I expect most of us will be running lightweight clients that just keep our wallets, sign transactions, and send and receive transactions to the ultra-fast nodes that ARE looking at every transaction.

Is this possible? What would this look like? From a technical perspective what does a “lightweight client” look like for you? My understanding is that the Bitcoin client needs the entire block chain in order to establish trust.

I’m imagining:

A lightweight client would have a wallet with coins in it (public+private key pairs).

And a secure way of sending messages to, and getting messages from, any of the ultra-fast, always-connected heavyweight nodes.

The lightweight client sends money by: creating a transaction (signing coins with the private key); sending the signed transaction securely to the ultra-fast server, which puts it on the network; and receiving confirmation that the transaction was valid and sent, then updating its wallet (marks coins as spent), or getting a “you already spent those coins” error from the server.

The lightweight client receives money by either polling the server every once in a while, asking “Any payments to these BC addresses that I have in my wallet?”, or asking the server to tell it whenever it sees a transaction to a list of BC addresses (or maybe when it sees a relevant transaction with N confirmations). When transactions occur, the lightweight client updates its wallet (adds the coins).

You don’t have to trust the server; it never has your private keys.

Well, you do have to trust that the server doesn’t lie about whether your transactions are valid or not, but why would the server lie about that?
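The send and receive flows described above can be sketched as a toy in-memory model. Everything here is hypothetical: the class names, the string coin ids, and the placeholder "signature" are illustration only, and no real cryptography or networking is involved.

```python
from dataclasses import dataclass, field


@dataclass
class HeavyNode:
    """Hypothetical always-connected server that sees every transaction."""
    spent: set = field(default_factory=set)       # coin ids already spent
    payments: dict = field(default_factory=dict)  # address -> list of new coin ids

    def submit(self, coin_id: str, to_address: str, signature: str) -> str:
        # A real node would verify the signature against the coin's public key;
        # this sketch only rejects double spends.
        if coin_id in self.spent:
            return "already spent"
        self.spent.add(coin_id)
        new_coin = f"{coin_id}>{to_address}"      # the output created by the payment
        self.payments.setdefault(to_address, []).append(new_coin)
        return "ok"

    def payments_to(self, addresses):
        """Answer the polling question: any unspent payments to these addresses?"""
        return {a: [c for c in self.payments[a] if c not in self.spent]
                for a in addresses if a in self.payments}


@dataclass
class LightClient:
    addresses: set
    coins: set = field(default_factory=set)       # unspent coin ids in the wallet

    def send(self, node: HeavyNode, coin_id: str, to_address: str) -> str:
        signature = f"signed({coin_id}->{to_address})"  # placeholder, not a real signature
        result = node.submit(coin_id, to_address, signature)
        if result == "ok":
            self.coins.discard(coin_id)           # mark the coin as spent
        return result

    def poll(self, node: HeavyNode) -> None:
        for coin_ids in node.payments_to(self.addresses).values():
            self.coins.update(coin_ids)           # add received coins to the wallet


node = HeavyNode()
alice = LightClient(addresses={"alice"}, coins={"c1"})
bob = LightClient(addresses={"bob"})

first = alice.send(node, "c1", "bob")     # accepted by the server
second = alice.send(node, "c1", "carol")  # same coin again: rejected
bob.poll(node)                            # bob learns about the payment
```

The private key never leaves the client in this model, which is the point made above; what the client cannot check on its own is whether the server's "ok" or "already spent" answer is truthful.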

TopSoil July 14, 2010

Sure it can work that way but is that the ideal? Doesn’t that make the network less robust and more vulnerable to attacks and manipulation? What happens if some attackers start running a cluster of supernodes? The main point is why rely on this more vulnerable architecture when you don’t have to? It isn’t easier to implement.

Yeah and I would argue neither of those systems are very well designed. Limewire replaced the gnutella routing with Kademlia actually.

spaceshaker: yes I agree there will have to be nodes that act as proxies for mobile devices.

DataWraith July 14, 2010

Quote from: TopSoil on July 14, 2010, 03:59:18 PM

Quote: Why? E-Mail and Jabber work the same way. Everyone can run their own server, and many do, but most users prefer to use someone else’s server(s).

Sure it can work that way but is that the ideal? Doesn’t that make the network less robust and more vulnerable to attacks and manipulation?

Not really. If you’re worried, you can always run your very own server.

Nothing. Unless you happen to use one of those supernodes, which you don’t have to, because you can run your own supernode. Or use a trusted one, much like people trust Google with their E-Mail.

It isn’t easier to implement? I’d like to see some proof here.

Counterexample: The totally distributed E-Mail system developed at Rice University is a hell of a lot more complex than running a (network of) semi-centralized mail server(s).

Um. That’s exactly what a supernode server would do.

spaceshaker July 14, 2010

Quote from: DataWraith on July 14, 2010, 04:42:16 PM

Quote: spaceshaker: yes I agree there will have to be nodes that act as proxies for mobile devices.

Um. That’s exactly what a supernode server would do.

Um. Sure. I think I’ve gone full circle. I think Gavin said it best:

Quote from: gavinandresen on July 14, 2010, 02:20:45 AM

A lightweight client would have a wallet with coins in it (public+private key pairs).

And a secure way of sending messages to, and getting messages from, any of the ultra-fast, always-connected heavyweight nodes.

The lightweight client sends money by: creating a transaction (signing coins with the private key); sending the signed transaction securely to the ultra-fast server, which puts it on the network; and receiving confirmation that the transaction was valid and sent, then updating its wallet (marks coins as spent), or getting a “you already spent those coins” error from the server.

The lightweight client receives money by either polling the server every once in a while, asking “Any payments to these BC addresses that I have in my wallet?”, or asking the server to tell it whenever it sees a transaction to a list of BC addresses (or maybe when it sees a relevant transaction with N confirmations). When transactions occur, the lightweight client updates its wallet (adds the coins).

You don’t have to trust the server; it never has your private keys.

Well, you do have to trust that the server doesn’t lie about whether your transactions are valid or not, but why would the server lie about that?

In this scenario, the Bitcoin client could remain largely the same as it is today, although the focus would be that it is used on the “super-nodes” or “transaction servers” or “proxy servers” (these systems would probably serve all three roles) or by anyone wishing to play in that game. If the Bitcoin client was augmented to use DHT then that may be an improvement, but there is still a need for a “lightweight client” as Gavin described above. It seems Gavin’s “lightweight client” concept obviates my scalability concerns somewhat.

BitLex July 14, 2010

Quote from: gavinandresen on July 14, 2010, 02:20:45 AM

Quote from: spaceshaker on July 14, 2010, 01:52:00 AM

Quote from: gavinandresen on July 14, 2010, 12:42:32 AM

And I expect most of us will be running lightweight clients that just keep our wallets, sign transactions, and send and receive transactions to the ultra-fast nodes that ARE looking at every transaction.

Is this possible? What would this look like? From a technical perspective what does a “lightweight client” look like for you? My understanding is that the Bitcoin client needs the entire block chain in order to establish trust.

I’m imagining: …

you don’t even have to imagine; actually, it’s already possible to “remote-control” the node, so just create ur own little lightweight-client that just sends and gets some info to/from your (highspeed-connected, hdd-packed) homeserver.

some kind of “managed server-client-version” already exists in MyBitcoin; you could run something like that on your own webhost and connect to it from wherever u are on whatever connection-speed.

the “lightweight client” doesn’t have to be a node itself; it just connects to one and tells it what to do. we can do that with JSON.
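As a rough illustration of "telling the node what to do with JSON", the sketch below builds JSON-RPC style request bodies without sending them anywhere. The helper name is made up, the address is a placeholder, and the method names `getbalance` and `sendtoaddress` are assumed to be exposed by the node being controlled; check the node's own RPC documentation before relying on them.

```python
import json


def rpc_request(method: str, params=None, request_id: int = 1) -> str:
    """Build a JSON-RPC style request body as a string (nothing is sent)."""
    return json.dumps({"method": method, "params": params or [], "id": request_id})


# Ask the remote node for the wallet balance (method name is an assumption).
balance_body = rpc_request("getbalance")

# Ask the remote node to send 1.5 coins to a placeholder address.
send_body = rpc_request("sendtoaddress", ["PLACEHOLDER_ADDRESS", 1.5], request_id=2)

if __name__ == "__main__":
    print(balance_body)
    print(send_body)
```

A thin client like this holds no block chain at all, but note that in this arrangement the wallet lives on the home server, so the connection to it has to be authenticated and protected.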

InterArmaEnimSil July 14, 2010

I second the DHT idea for maintaining a client list - we can’t have millions of people relying upon an IRC channel, etc. As far as the scaling issue goes, the issue is not HDD space at all, it’s network bandwidth. Everyone is forgetting that it’s not bytes_per_transaction * transactions, which is the number everyone is using. That number, as everyone has said, is fully manageable. No, the number we’re interested in is bytes_per_transaction * transactions * number_of_clients * total_hops_beyond_first_between_all_clients_combined.

THIS is the amount of bandwidth which the protocol for BTC consumes as the network scales. We’re not just talking about sending one copy of each transaction to each client - we’re talking about multiple clients broadcasting potentially redundant data to one another, and doing it across numerous hops, meaning numerous rebroadcasts. Much larger number, much more difficult to handle. However, it is manageable, just not in the current incarnation of network handling in the client.
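To get a feel for how much the rebroadcast factor matters, here is a deliberately simplified sketch. It replaces the hop term in the formula above with a single assumed number of copies of each transaction that a node receives, which is a simplification and not the poster's exact model. The node count and the copies-per-node value are invented for illustration; only the 600 bytes and the million transactions a day come from the thread.

```python
def flood_bytes_per_day(bytes_per_tx: int, tx_per_day: int,
                        nodes: int, copies_per_node: int) -> int:
    """Network-wide bytes moved per day if every node receives
    copies_per_node copies of every transaction (simplified flooding model)."""
    return bytes_per_tx * tx_per_day * nodes * copies_per_node


# 600 bytes/tx and 1M tx/day are from the thread; the rest is assumed.
storage_only = 600 * 1_000_000
network_total = flood_bytes_per_day(600, 1_000_000, nodes=10_000, copies_per_node=8)

if __name__ == "__main__":
    print(f"storage per node: {storage_only / 1e6:.0f} MB/day")
    print(f"network-wide relay traffic: {network_total / 1e12:.0f} TB/day")
```

Under these made-up parameters, the per-node disk figure stays at 600 MB a day while the aggregate relay traffic is on the order of tens of terabytes a day, which is the gap the post is pointing at.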

Perhaps in the “popular” phase, BTC chains could be broken up by region, similar to the purviews of domain name authorities now - and there could be an alternative protocol for transactions across these regional boundaries? This would help the raw numbers of the problem, and also cut down on latency and related issues. Not that I think this is an excellent solution - but P2P flooding across all active clients is obviously out barring some massive breakthrough in quantum computing or whatnot.

Satoshi Nakamoto July 14, 2010

The design outlines a lightweight client that does not need the full block chain. In the design PDF it’s called Simplified Payment Verification. The lightweight client can send and receive transactions, it just can’t generate blocks. It does not need to trust a node to verify payments, it can still verify them itself.
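The core check behind Simplified Payment Verification is a Merkle branch: given a transaction hash, its position in the block, and the sibling hashes along the path, a client can recompute the Merkle root and compare it with the one in a block header it already holds. The sketch below is a minimal illustration of that idea, assuming double SHA-256 hashing and duplication of the last hash on odd-sized levels. The function names are made up, and byte-order conventions, header parsing, and proof-of-work checks are all left out.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    if len(level) % 2:
        level = level + [level[-1]]  # duplicate the last hash on odd levels
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root of the tree over a non-empty list of leaf hashes."""
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """Sibling hashes needed to link leaves[index] to the root (full-node side)."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])  # sibling at this level
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(leaf, branch, index, root):
    """Light-client side: recompute the root from one leaf and its branch."""
    h = leaf
    for sibling in branch:
        h = dsha256(sibling + h) if index & 1 else dsha256(h + sibling)
        index >>= 1
    return h == root


# Stand-in transaction hashes for a block with five transactions.
tx_hashes = [dsha256(bytes([i])) for i in range(5)]
root = merkle_root(tx_hashes)
proof = merkle_branch(tx_hashes, 3)
ok = verify_branch(tx_hashes[3], proof, 3, root)
```

The client only needs the block headers plus a branch of a few hashes per transaction, which grows with the logarithm of the block's transaction count rather than with the block size.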

The lightweight client is not implemented yet, but the plan is to implement it when it’s needed. For now, everyone just runs a full network node.

I anticipate there will never be more than 100K nodes, probably less. It will reach an equilibrium where it’s not worth it for more nodes to join in. The rest will be lightweight clients, which could be millions.

At equilibrium size, many nodes will be server farms with one or two network nodes that feed the rest of the farm over a LAN.