cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Overfull node
Date Tue, 11 May 2010 13:30:45 GMT
s/keyspace/token/ and you've got it.
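
With that substitution, the assignment quoted below becomes token(N) = token(A) + (token(O) - token(A)) / 2. A minimal Python sketch of the arithmetic follows. It assumes RandomPartitioner with a token space of 0 to 2**127, which the thread does not state, and the name midpoint_token is hypothetical. The modulo handles the case where O's range wraps past the end of the ring.

```python
# Assumption: RandomPartitioner, whose tokens lie in [0, 2**127).
RING_SIZE = 2 ** 127


def midpoint_token(token_a, token_o, ring_size=RING_SIZE):
    """Return a token for new node N that splits O's range (A, O] in half.

    token_a: token of the node just before the overfull node
    token_o: token of the overfull node
    """
    # Width of the range O currently owns, measured clockwise from A to O.
    span = (token_o - token_a) % ring_size
    if span == 0:
        # A and O are the same node, so O owns the whole ring.
        span = ring_size
    # Step halfway into the range, wrapping around the ring if needed.
    return (token_a + span // 2) % ring_size


if __name__ == "__main__":
    # Illustrative tokens only, not taken from a real cluster.
    print(midpoint_token(100, 200))
    print(midpoint_token(2 ** 127 - 10, 10))
```

After the split, N would own (token(A), token(N)] and O would keep (token(N), token(O)], which is the part of the data cleanup leaves on each node.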

On Mon, May 10, 2010 at 10:34 AM, David Koblas <koblas@extra.com> wrote:
> Sounds great, will give it a go.  However, I want to make sure I'm
> getting the keyspace correct.
>
> Let's say I've got:
>    A -- Node before overfull node in keyspace order
>    O -- Overfull node
>    B -- Node after O in keyspace order
>    N -- New empty node
>
> I'm going to assume that I should make the following assignment:
>    keyspace(N) = keyspace(A) + ( keyspace(O) - keyspace(A) ) / 2
>
> Or did I miss something else about keyspace ranges?
> Thanks
>
>
> On 5/7/10 1:25 PM, Jonathan Ellis wrote:
>>
>> If you're using RackUnawareStrategy (the default replication strategy)
>> then you can "bootstrap" manually fairly easily -- copy all the data
>> (not system) sstables from an overfull machine to a new machine,
>> assign the new one a token that gives it about half of the old node's
>> range, then start it with autobootstrap OFF.  Then run cleanup on both
>> new and old nodes to remove the part of the data that belongs to the
>> other.
>>
>> The downside vs real bootstrap is you can't do this safely while
>> writes are coming in to the original node.  You can reduce your
>> read-only period by doing an initial scp, then doing a flush + rsync
>> when you're ready to take it read only.
>>
>> (https://issues.apache.org/jira/browse/CASSANDRA-579 will make this
>> problem obsolete for 0.7 but that doesn't help you on 0.6, of course.)
>>
>> On Fri, May 7, 2010 at 2:08 PM, David Koblas<koblas@extra.com>  wrote:
>>
>>>
>>> I've got two (out of five) nodes on my cassandra ring that somehow got too
>>> full (e.g. over 60% disk space utilization).  I've now gotten a few new
>>> machines added to the ring, but every time one of the overfull nodes attempts
>>> to stream its data it runs out of disk space...  I've tried half a dozen
>>> different bad ideas of how to get things moving along a bit smoother, but am
>>> at a total loss at this point.
>>>
>>> Are there any good tricks to get cassandra to not need 2x the disk space to
>>> stream out, or is something else potentially going on that's causing me
>>> problems?
>>>
>>> Thanks,
>>>
>>>
>>
>>
>>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
