cassandra-user mailing list archives

From Thibaut Britz <>
Subject Re: Fill disks more than 50%
Date Thu, 24 Feb 2011 09:08:39 GMT

How would you use rsync instead of repair in case of a node failure?

Rsync all files from the data directories of the adjacent nodes
(which are part of the quorum group) and then run a compaction, which
will remove all the unneeded keys?


On Thu, Feb 24, 2011 at 4:22 AM, Edward Capriolo <> wrote:
> On Wed, Feb 23, 2011 at 9:39 PM, Terje Marthinussen
> <> wrote:
>> Hi,
>> Given that you have always-increasing key values (timestamps) and never
>> delete and hardly ever overwrite data.
>> If you want to minimize work on rebalancing and statically assign (new)
>> token ranges to new nodes as you add them so they always get the latest
>> data....
>> Let's say you add a new node each year to handle next year's data.
>> In a scenario like this, would you, with 0.7, be able to safely fill disks
>> significantly more than 50% and still manage things like repair/recovery of
>> faulty nodes?
>> Regards,
>> Terje
> Since all your data for a day/month/year would sit on the same server,
> all your servers with old data would be idle and your servers
> with current data would be very busy. This is probably not a good way
> to go.
> There is a ticket open for 0.8 for efficient node moves and joins. It is
> already a lot better in 0.7. Pretend you did not see this (you can
> join nodes using rsync if you know some tricks) if you are really
> afraid of joins, which you really should not be.
> As for the 50% statement: in a worst-case scenario, a major compaction
> will require double the disk size of your column family. So if you
> have more than 1 column family you do NOT need 50% overhead.
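
The 50% point in the quoted reply can be checked with a little arithmetic. The sketch below is only an illustration, under two assumptions that are not stated in the thread: column families are major-compacted one at a time, and compacting a column family temporarily needs at most its own size in extra space. The function names and the sizes are hypothetical.

```python
def peak_disk_needed(cf_sizes_gb):
    """Worst-case disk usage (GB) while major-compacting one column family.

    Assumption: only one column family is compacted at a time, and the
    compaction needs up to that column family's size in extra space.
    """
    if not cf_sizes_gb:
        raise ValueError("need at least one column family size")
    # All data stays on disk, plus a temporary copy of the largest
    # column family while it is being rewritten.
    return sum(cf_sizes_gb) + max(cf_sizes_gb)


def max_fill_fraction(cf_sizes_gb):
    """Fraction of the disk that live data may occupy under that assumption."""
    return sum(cf_sizes_gb) / peak_disk_needed(cf_sizes_gb)


# One 400 GB column family: the disk can only be half full.
single = max_fill_fraction([400])

# Four 100 GB column families: only 100 GB of headroom is needed.
several = max_fill_fraction([100, 100, 100, 100])
```

Under these assumptions, a single column family gives the familiar 50% limit, while splitting the same data across four equally sized column families raises the limit to 80%. Other operations that need temporary space are not modelled here.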
