incubator-cassandra-user mailing list archives

From Rodrigo Felix <rodrigofelixdealme...@gmail.com>
Subject Re: General doubts about bootstrap
Date Wed, 10 Jul 2013 18:23:27 GMT
Currently, I'm using Cassandra 1.1.5, but I'm considering updating to
1.2.x in order to make use of vnodes.
Doubling the cluster size is not possible for me because I want to measure
the response while adding (or removing) single nodes.
Thank you guys. It helped me a lot to better understand how Cassandra works.

Regards,

*Rodrigo Felix de Almeida*
LSBD - Universidade Federal do Ceará
Project Manager
MBA, CSM, CSPO, SCJP


On Wed, Jul 10, 2013 at 11:11 AM, Eric Stevens <mightye@gmail.com> wrote:

> > => Adding a new node between other nodes would avoid running move, but
> > the ring would be unbalanced, right? Would this imply having one node
> > overloaded (with a bigger range: 1/2 of the ring, while the other 2 nodes
> > have 1/4 each, supposing 3 nodes)? I'm referring to
> > http://wiki.apache.org/cassandra/Operations#Load_balancing
>
> Yes, if you're using a single vnode per server, or are running an older
> version of Cassandra.  For lowest impact, doubling the size of your cluster
> is recommended so that you can avoid doing moves.  Or if you're on
> Cassandra 1.2+, you can use vnodes, and you should not typically need to
> rebalance after bringing a new server online.
>
>
> On Tue, Jul 9, 2013 at 9:31 PM, Rodrigo Felix <
> rodrigofelixdealmeida@gmail.com> wrote:
>
>> Thank you very much for your response. My comments on your email
>> follow.
>>
>> Regards,
>>
>> *Rodrigo Felix de Almeida*
>> LSBD - Universidade Federal do Ceará
>> Project Manager
>> MBA, CSM, CSPO, SCJP
>>
>>
>> On Mon, Jul 8, 2013 at 6:05 PM, Robert Coli <rcoli@eventbrite.com> wrote:
>>
>>> On Sat, Jul 6, 2013 at 1:50 PM, Rodrigo Felix <
>>> rodrigofelixdealmeida@gmail.com> wrote:
>>>
>>>>
>>>>    - Is it normal to take about 9 minutes to add a new node? The log
>>>>    generated by a script to add a new node follows.
>>>>
>>>> Sure.  => OK
>>>
>>>>
>>>>    - Is there a way to reduce the time to start cassandra?
>>>>
>>>> Not usually. => OK
>>>
>>>>
>>>>    - Sometimes the cleanup operation takes many minutes (about 10). Is
>>>>    this normal, given that the amount of data is small (at most 1.7 GB per seed)?
>>>>
>>>> Compaction is throttled, and cleanup is a type of compaction. Bootstrap
>>> is also throttled via the streaming throttle. => OK
>>>
>>>>
>>>>    - Considering that I have two seeds in the beginning, their tokens
>>>>    are 0 and 85070591730234615865843651857942052864. When I add a new machine,
>>>>    do I need to execute move and cleanup on both seeds? Currently, I'm running
>>>>    cleanup on seed 0, move + cleanup on the other seed, and neither move nor
>>>>    cleanup on the just-added node. Is this OK?
>>>>
>>>> Only nodes which have "lost" ranges need to run cleanup. In general you
>>> should add new nodes "between" other nodes such that "move" is not required
>>> at all.
>>>
>>
>> => Adding a new node between other nodes would avoid running move, but
>> the ring would be unbalanced, right? Would this imply having one node
>> overloaded (with a bigger range: 1/2 of the ring, while the other 2 nodes
>> have 1/4 each, supposing 3 nodes)? I'm referring to
>> http://wiki.apache.org/cassandra/Operations#Load_balancing
>>
>>>
>>>>    - What if I do not run cleanup on any existing node when adding or
>>>>    removing a node? Is the data that was not "cleaned up" still served
>>>>    if I send a scan, for instance, whose range is still on the node but
>>>>    wouldn't be there if I had run cleanup? Or would the data be gathered
>>>>    from the other node, i.e. the one that properly owns the range
>>>>    specified in the scan query?
>>>>
>>>> If data for range [x] is on node [a] but node [a] is no longer
>>> considered an endpoint for range [x], it will never receive a request to
>>> serve range [x]. => OK
>>>
>>>>
>>>>    - After decommissioning a node, is it advisable to run cleanup on
>>>>    the remaining nodes? Are the consequences of not running it the same
>>>>    as when adding a node?
>>>>
>>>> Cleanup is only for the node which lost a range. In decommission case,
>>> no live nodes lost a range, only some nodes gained one. => OK
>>>
>>> =Rob
>>>
>>
>>
>
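
The ownership question raised in the thread (a third node added "between"
the two seeds without any move) can be checked with a short Python sketch.
It assumes the RandomPartitioner token space of 0 to 2**127 and a single
token per node, and it looks only at primary ranges: replication is ignored,
so with a replication factor above 1 more nodes may hold stale replicas than
the helper reports. The function names are hypothetical and are not part of
Cassandra or nodetool.

```python
from fractions import Fraction

# Assumption: RandomPartitioner token space (0 .. 2**127), one token per node.
RING_SIZE = 2 ** 127


def ownership(tokens):
    """Map each token to the fraction of the ring in (previous token, token]."""
    ordered = sorted(tokens)
    owned = {}
    for i, token in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this wraps around to the last token
        # A lone node has span 0 after the modulo; it owns the whole ring.
        span = (token - prev) % RING_SIZE or RING_SIZE
        owned[token] = Fraction(span, RING_SIZE)
    return owned


def balanced_tokens(node_count):
    """Evenly spaced tokens, the layout that 'move' would restore."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def nodes_needing_cleanup(before, after):
    """Tokens kept in both rings whose primary range shrank (a 'lost' range)."""
    old, new = ownership(before), ownership(after)
    return [t for t in sorted(old) if t in new and new[t] < old[t]]


if __name__ == "__main__":
    seeds = [0, 2 ** 126]            # the two seed tokens from the thread
    grown = seeds + [2 ** 125]       # hypothetical third node placed between them
    for token, share in sorted(ownership(grown).items()):
        print(token, share)
    print("cleanup on:", nodes_needing_cleanup(seeds, grown))
    print("balanced 3-node tokens:", balanced_tokens(3))
```

Under these assumptions the node at token 0 keeps 1/2 of the ring, while the
new node and the seed at 2**126 hold 1/4 each, and only the seed at 2**126
has lost a primary range.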
