incubator-cassandra-user mailing list archives

From Rodrigo Felix <>
Subject General doubts about bootstrap
Date Sat, 06 Jul 2013 20:50:25 GMT

   I'm facing some problems, and I'd be thankful if you could help with some of them.
   *Environment:* 2 seeds and 2 other nodes, all installed on m1.large EC2
instances. Each seed starts with about 1.7GB of data. Default cassandra

   - Is it normal for adding a new node to take about 9 minutes? Below is the
   log generated by the script that adds a new node.

[06/07/2013 20:07:53] Remove all data stored in the Cassandra node
[06/07/2013 20:07:54] [OK] All data successfully removed
[06/07/2013 20:07:54] Setting seeds on cassandra.yml
[06/07/2013 20:07:54] [OK] seeds successfully set
[06/07/2013 20:07:54] Setting listen_address on cassandra.yml
[06/07/2013 20:07:54] [OK] listen_address successfully set
[06/07/2013 20:07:54] Setting initial_token on cassandra.yml
[06/07/2013 20:07:54] [OK] initial_token successfully set
*[06/07/2013 20:07:54] Starting cassandra...*
*[06/07/2013 20:16:36] [OK] Cassandra started*
[06/07/2013 20:16:37] Changing token of i-5cfc082f
[06/07/2013 20:18:00] [OK] Token of i-5cfc082f successfully set to
[06/07/2013 20:18:00] Cleaning up i-5cfc082f
[06/07/2013 20:20:13] Clean up of i-5cfc082f successfully finished
[06/07/2013 20:20:13] Machine added
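
For reference, the time spent in each step can be computed from the log timestamps. A minimal Python sketch, where the `elapsed` helper and the `steps` table are illustrative names and not part of the script that produced the log:

```python
from datetime import datetime

# Timestamp format used in the log above: day/month/year hour:minute:second.
FMT = "%d/%m/%Y %H:%M:%S"

def elapsed(start: str, end: str) -> int:
    """Return the number of whole seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())

# Start/end timestamps copied from the log lines for each step.
steps = {
    "start cassandra": ("06/07/2013 20:07:54", "06/07/2013 20:16:36"),
    "change token": ("06/07/2013 20:16:37", "06/07/2013 20:18:00"),
    "cleanup": ("06/07/2013 20:18:00", "06/07/2013 20:20:13"),
}

for name, (start, end) in steps.items():
    secs = elapsed(start, end)
    print(f"{name}: {secs // 60}m {secs % 60}s")
```

By this reading, starting Cassandra accounts for most of the total time; the token change and cleanup steps are comparatively short in this run.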

   - Is there a way to reduce the time it takes to start Cassandra?
   - Sometimes the cleanup operation takes many minutes (about 10). Is this
   normal, given that the amount of data is small (at most 1.7 GB per seed)?
   - I have two seeds in the beginning, with tokens 0 and
   85070591730234615865843651857942052864. When I add a new machine, do I
   need to execute move and cleanup on both seeds? Currently, I run cleanup
   on seed 0, move + cleanup on the other seed, and neither move nor
   cleanup on the newly added node. Is this OK?
   - What happens if I do not run cleanup on any existing node after adding or
   removing a node? Suppose I send a scan whose range is still stored on a node
   only because cleanup has not run there. Is that leftover data still served,
   or is the data gathered from the other node, i.e. the one that properly
   owns the range specified in the scan query?
   - After decommissioning a node, is it advisable to run cleanup on the
   remaining nodes? Are the consequences of not running it the same as those
   of not running it after adding a node?
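
The two seed tokens above are 0 and 2^127 / 2, which is what evenly spaced tokens look like in a token space of size 2^127. Assuming that token space (the message does not state which partitioner is in use), the following sketch shows how evenly spaced tokens would shift when growing from 2 to 3 nodes. `balanced_tokens` and `RING_SIZE` are hypothetical names used only for illustration:

```python
from typing import List

# Assumption: tokens live in [0, 2**127), consistent with the seed tokens above.
RING_SIZE = 2 ** 127

def balanced_tokens(node_count: int) -> List[int]:
    """Return evenly spaced tokens for node_count nodes, starting at 0."""
    return [i * RING_SIZE // node_count for i in range(node_count)]

before = balanced_tokens(2)  # layout with the two seeds
after = balanced_tokens(3)   # layout after adding one node

print("2 nodes:", before)
print("3 nodes:", after)

# Tokens present in both layouts belong to nodes that would not need a move.
print("unchanged tokens:", sorted(set(before) & set(after)))
```

Under this assumption, only token 0 appears in both layouts, so in an evenly balanced 3-node ring the second seed's token would change while seed 0 keeps its token.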

   Thank you very much in advance.


*Rodrigo Felix de Almeida*
LSBD - Universidade Federal do Ceará
Project Manager
