cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: problem about bootstrapping when used in huge node
Date Tue, 23 Feb 2010 13:31:46 GMT
On Tue, Feb 23, 2010 at 12:33 AM, Michael Lee
<mail.list.steel.mental@gmail.com> wrote:
> (1)     A cluster cannot be enlarged (i.e., have more nodes added) if it
> is already more than half full:
>
> If every node holds more than half its capacity in data, the admin
> cannot bootstrap a new node into the cluster, because the old nodes
> must strip out the data belonging to the new node via anti-compaction.
> That process creates a large tmp SSTable file (for streaming), which
> may be larger than the free disk space (of one node).

That's right, in the worst case.  On average, any node sending to a
bootstrap node will only have to anti-compact half its data.

We have https://issues.apache.org/jira/browse/CASSANDRA-579 open to
allow streaming data w/o first writing it locally.
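The free-space constraint above can be sketched with some back-of-envelope arithmetic (hypothetical sizes; the function name is mine, not a Cassandra API):

```python
def can_bootstrap_neighbor(disk_capacity_gb, data_gb, fraction_streamed=0.5):
    """Anti-compaction writes the range destined for the bootstrapping
    node into a temporary SSTable before streaming it, so that slice
    must also fit on the sending node's disk.
    fraction_streamed=0.5 models the average case; 1.0 the worst case."""
    tmp_needed_gb = data_gb * fraction_streamed
    free_gb = disk_capacity_gb - data_gb
    return free_gb >= tmp_needed_gb

# A node under half full has room even in the average case:
print(can_bootstrap_neighbor(1000, 400))        # True
# A node over half full fails even the average case:
print(can_bootstrap_neighbor(1000, 700))        # False
print(can_bootstrap_neighbor(1000, 700, 1.0))   # worst case: False
```

This is why streaming without first writing locally (CASSANDRA-579) relaxes the limit: it removes the `tmp_needed_gb` term from the sender's side.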

> (2)     Is cassandra designed to waste half of its capacity?

Yes, although I might describe it as "cassandra requires up to half
its capacity as temporary space for compaction and anticompaction."
http://wiki.apache.org/cassandra/MemtableSSTable

That's the price you pay for no random writes.
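The "up to half" rule follows from the worst case, where a major compaction rewrites every live SSTable into one new file. A minimal sketch of that reasoning (helper name is mine, illustrative only):

```python
def usable_capacity_gb(disk_capacity_gb):
    """Worst case: a major compaction merges all live SSTables into a
    single new file, so temporary space equal to the entire live data
    set is needed while both copies exist on disk. That caps safe
    steady-state load at half the raw capacity."""
    return disk_capacity_gb / 2.0

print(usable_capacity_gb(1000))  # 500.0 -- of a 1 TB disk
```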

> (3)     How to use a node that has 12 1TB disks?

You should use a better filesystem than ext3. :)  We use xfs at rackspace.

-Jonathan
