cassandra-user mailing list archives

From "B. Todd Burruss" <bto...@gmail.com>
Subject Re: unbalanced ring
Date Fri, 12 Oct 2012 00:55:31 GMT
does the size of your rows vary a lot?  if so, then this could explain why
some nodes are "unlucky" and getting the large rows.  you only have 3
nodes so it definitely can happen.
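
To illustrate the point, here is a rough sketch of how a few very large rows can skew per-node load even when the tokens are evenly spaced. Everything in it is an assumption made for illustration: the row keys, the Pareto-distributed row sizes, the replication factor of 1, and the simplified MD5-to-token mapping (which only approximates what RandomPartitioner does).

```python
import hashlib
import random

RING_SIZE = 2 ** 127  # token space of RandomPartitioner

# Evenly spaced tokens for 3 nodes, as in the ring output quoted below.
TOKENS = [i * RING_SIZE // 3 for i in range(3)]


def owner(key, tokens):
    """Return the index of the node whose token range contains the key."""
    # Approximation: hash the key with MD5 and fold it into the token space.
    token = int(hashlib.md5(key.encode()).hexdigest(), 16) % RING_SIZE
    for index, node_token in enumerate(tokens):
        if token <= node_token:
            return index
    return 0  # past the last token: wraps around to the first node


def simulate_load(row_count, tokens, rng):
    """Sum hypothetical row sizes per node, ignoring replication (RF=1)."""
    loads = [0] * len(tokens)
    for i in range(row_count):
        # Heavy-tailed sizes: most rows are small, a few are very large.
        size = int(rng.paretovariate(1.2))
        loads[owner(f"row-{i}", tokens)] += size
    return loads


loads = simulate_load(10_000, TOKENS, random.Random(42))
print(loads)
```

Each node receives roughly a third of the rows, but the summed sizes can differ noticeably because a handful of large rows dominate the total.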


On Thu, Oct 11, 2012 at 2:24 PM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:

> I understood it that way too, but I explained myself very poorly in my last
> message. Answering fast and not in my mother tongue is sometimes quite hard
> and results in poor explanations.
>
> Sorry about that and thank you Jim for that clarification.
>
> Alain
>
> 2012/10/11 Jim Cistaro <jcistaro@netflix.com>
>
>>  I think bad effect #1 needs clarification.
>>
>>  This only suspends minor compactions involving that one big file.  As new
>> sstables are flushed, they are of the same small size and they will
>> eventually compact together.  So that one big file will sit idle (as far as
>> compactions go) until the new files eventually compact into
>> ones as big as that major-compacted sstable.  That is why those who do a major
>> compaction once normally end up doing them periodically, to force
>> compaction with the previous large sstable.
>>
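
The behaviour described above can be sketched in a few lines. This is a simplified model of size-tiered bucketing, not Cassandra's actual code; the function names are hypothetical, and the 0.5x/1.5x bucket bounds and the minimum of 4 sstables per compaction are assumed defaults.

```python
def bucket_sstables(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group sstable sizes into buckets of roughly similar size."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            # An sstable joins a bucket if it is close to the bucket's average.
            if bucket_low * average <= size <= bucket_high * average:
                bucket.append(size)
                break
        else:
            # No similar bucket found: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Return the buckets that hold enough sstables for a minor compaction."""
    return [b for b in bucket_sstables(sizes) if len(b) >= min_threshold]


# One 35,000 MB sstable left by a major compaction, plus four fresh flushes.
sizes_mb = [35_000, 10, 11, 9, 12]
print(bucket_sstables(sizes_mb))
print(compaction_candidates(sizes_mb))
```

In this example the four small sstables share a bucket and are eligible for compaction, while the large sstable sits alone in its own bucket and is left untouched until other sstables grow to a comparable size.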
>>  I hope this helps,
>> jc
>>
>>   From: Alain RODRIGUEZ <arodrime@gmail.com>
>> Reply-To: <user@cassandra.apache.org>
>> Date: Thu, 11 Oct 2012 17:30:53 +0200
>>
>> To: <user@cassandra.apache.org>
>> Subject: Re: unbalanced ring
>>
>>  @Tamar
>>
>>  Bad effects are :
>>
>>  1 - Disabling your minor compactions for a *long* time (they need
>> SSTables of about the same size to be triggered).
>> 2 - High CPU load during the compaction (which can take quite long).
>>
>>  Good effects :
>>
>>  1 - Reduces the size of your data.
>> 2 - Boosts read performance.
>>
>>  @B. Todd Burruss
>>
>>  What can we do if cleanup doesn't remove any data and so doesn't
>> balance the data across nodes?
>> We both have a well-balanced ring and unbalanced data...
>>
>>  Alain
>>
>>  2012/10/11 Viktor Jevdokimov <Viktor.Jevdokimov@adform.com>
>>
>>>  In our case, we use TTL and need to keep the amount of data as low as
>>> possible to fit in RAM, so data has to be deleted somehow.
>>>
>>> As SSTables grow, the largest ones wait a long time for a minor
>>> compaction, so we run a major compaction every night.
>>>
>>>    Best regards / Pagarbiai
>>> *Viktor Jevdokimov*
>>> Senior Developer
>>>
>>> Email: Viktor.Jevdokimov@adform.com
>>> Phone: +370 5 212 3063, Mobile: +370 650 19588, Fax +370 5 261 0453
>>> J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
>>> Follow us on Twitter: @adforminsider <http://twitter.com/#!/adforminsider>
>>> What is Adform: watch this short video <http://vimeo.com/adform/display>
>>> Visit us at IAB RTB workshop
>>> October 11, 4 pm in Sala Rossa
>>> <http://www.iabforum.it/iab-forum-milano-2012/agenda/11-ottobre/>
>>>
>>> Disclaimer: The information contained in this message and attachments is
>>> intended solely for the attention and use of the named addressee and may be
>>> confidential. If you are not the intended recipient, you are reminded that
>>> the information remains the property of the sender. You must not use,
>>> disclose, distribute, copy, print or rely on this e-mail. If you have
>>> received this message in error, please contact the sender immediately and
>>> irrevocably delete this message and any copies.
>>>
>>>   *From:* Tamar Fraenkel [mailto:tamar@tok-media.com]
>>> *Sent:* Thursday, October 11, 2012 10:57
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: unbalanced ring
>>>
>>> Hi!
>>> All that left me confused...
>>> Like Alain, I read the DataStax warnings. Now Viktor says it is possible
>>> without bad effects.
>>> 1. Under what conditions would you recommend major compaction?
>>> 2. If I do go that route, would I have to run periodic / nightly
>>> compactions from now on?
>>> 3. What will be the price of #2?
>>> Thanks,
>>>
>>> Tamar Fraenkel
>>> Senior Software Engineer, TOK Media
>>>
>>> tamar@tok-media.com
>>> Tel:   +972 2 6409736
>>> Mob:  +972 54 8356490
>>> Fax:   +972 2 5612956
>>>
>>> On Thu, Oct 11, 2012 at 8:41 AM, Viktor Jevdokimov <
>>> Viktor.Jevdokimov@adform.com> wrote:
>>>
>>> To run, or not to run? It all depends on the use case. Running major
>>> compactions (we do it nightly) causes no problems in one case and could
>>> cause problems in another. You just need to understand how everything works.
>>>
>>>
>>> *From:* Alain RODRIGUEZ [mailto:arodrime@gmail.com]
>>> *Sent:* Thursday, October 11, 2012 09:17
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: unbalanced ring
>>>
>>> Tamar, be careful. DataStax doesn't recommend major compactions in
>>> production environments.
>>>
>>> If I got it right, performing a major compaction will merge all your
>>> SSTables into one big one, substantially improving your read performance, at
>>> least for a while... The problem is that it will disable minor compactions too
>>> (because of the size difference between this SSTable and the new ones,
>>> if I remember well). So your read performance will decrease until your
>>> other SSTables reach the size of the big one you've created, or until you
>>> run another major compaction, which turns major compaction into a routine
>>> maintenance process like repair.
>>>
>>> But, knowing that, I still don't know if we both (Tamar and I) shouldn't
>>> run it anyway (in my case it will greatly decrease the size of my data, 133
>>> GB -> 35 GB, and maybe load the cluster evenly...)
>>>
>>> Alain
>>>
>>> 2012/10/10 B. Todd Burruss <btoddb@gmail.com>
>>>
>>> it should not have any other impact except increased usage of system
>>> resources.
>>>
>>> and i suppose cleanup would not have an effect (over normal compaction)
>>> if all nodes contain the same data
>>>
>>> On Wed, Oct 10, 2012 at 12:12 PM, Tamar Fraenkel <tamar@tok-media.com>
>>> wrote:
>>>
>>> Hi!
>>> Apart from being a heavy load (the compaction), will it have other effects?
>>> Also, will cleanup help if I have replication factor = number of nodes?
>>> Thanks
>>>
>>>
>>> On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss <btoddb@gmail.com>
>>> wrote:
>>>
>>> major compaction in production is fine, however it is a heavy operation
>>> on the node and will take I/O and some CPU.
>>>
>>> the only time i have seen this happen is when i have changed the tokens
>>> in the ring, e.g. with "nodetool move".  cassandra does not auto-delete
>>> data that it no longer owns, just in case you want to move the tokens
>>> again or otherwise "undo".
>>>
>>> try "nodetool cleanup"
>>>
>>> On Wed, Oct 10, 2012 at 2:01 AM, Alain RODRIGUEZ <arodrime@gmail.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> Same thing here:
>>>
>>> 2 nodes, RF = 2, RCL = 1, WCL = 1.
>>>
>>> Like Tamar, I have never run a major compaction, and I run repair once a
>>> week on each node.
>>>
>>> 10.59.21.241    eu-west     1b          Up     Normal  133.02 GB  50.00%  0
>>> 10.58.83.109    eu-west     1b          Up     Normal  98.12 GB   50.00%  85070591730234615865843651857942052864
>>>
>>> What phenomena could explain the result above?
>>>
>>> By the way, I have copied the data and imported it into a one-node dev
>>> cluster. There I ran a major compaction and the size of my data was
>>> significantly reduced (to about 32 GB instead of 133 GB).
>>>
>>> How is that possible?
>>>
>>> Do you think that if I run a major compaction on both nodes it will
>>> balance the load evenly?
>>>
>>> Should I run major compaction in production?
>>>
>>> 2012/10/10 Tamar Fraenkel <tamar@tok-media.com>
>>>
>>> Hi!
>>> I am re-posting this, now that I have more data and still *unbalanced
>>> ring*:
>>>
>>> 3 nodes,
>>> RF=3, RCL=WCL=QUORUM
>>>
>>> Address    DC          Rack        Status State   Load      Owns    Token
>>>                                                                     113427455640312821154458202477256070485
>>> x.x.x.x    us-east     1c          Up     Normal  24.02 GB  33.33%  0
>>> y.y.y.y    us-east     1c          Up     Normal  33.45 GB  33.33%  56713727820156410577229101238628035242
>>> z.z.z.z    us-east     1c          Up     Normal  29.85 GB  33.33%  113427455640312821154458202477256070485
>>>
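
The tokens in the ring output above are evenly spaced over the RandomPartitioner token range (0 to 2^127), which is why Owns reads 33.33% for every node even though Load differs. A minimal sketch for checking a ring's tokens against the evenly spaced values follows; the function name is hypothetical.

```python
RING_SIZE = 2 ** 127  # token space of RandomPartitioner


def balanced_tokens(node_count):
    """Return evenly spaced initial tokens for the given number of nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


for token in balanced_tokens(3):
    print(token)
```

For 3 nodes this yields the three tokens listed in the ring output above, and for 2 nodes it yields 0 and 85070591730234615865843651857942052864, so in both clusters the imbalance is in the stored data rather than in token ownership.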
>>> repair runs weekly.
>>> I don't run nodetool compact as I read that this may cause the minor
>>> regular compactions not to run and then I will have to run compact
>>> manually. Is that right?
>>>
>>> Any idea whether this means something is wrong, and if so, how to solve it?
>>>
>>> Thanks,
>>>
>>>
>>> On Tue, Mar 27, 2012 at 9:12 AM, Tamar Fraenkel <tamar@tok-media.com>
>>> wrote:
>>>
>>> Thanks, I will wait and see as data accumulates.
>>>
>>> Thanks,
>>>
>>>
>>> On Tue, Mar 27, 2012 at 9:00 AM, R. Verlangen <robin@us2.nl> wrote:
>>>
>>> Cassandra is built to store tons and tons of data. In my opinion, roughly
>>> 6 MB per node is not enough data for the cluster to become fully
>>> balanced.
>>>
>>> 2012/3/27 Tamar Fraenkel <tamar@tok-media.com>
>>>
>>> This morning I have
>>>
>>>  nodetool ring -h localhost
>>>
>>> Address         DC          Rack        Status State   Load     Owns    Token
>>>                                                                         113427455640312821154458202477256070485
>>> 10.34.158.33    us-east     1c          Up     Normal  5.78 MB  33.33%  0
>>> 10.38.175.131   us-east     1c          Up     Normal  7.23 MB  33.33%  56713727820156410577229101238628035242
>>> 10.116.83.10    us-east     1c          Up     Normal  5.02 MB  33.33%  113427455640312821154458202477256070485
>>>
>>> Version is 1.0.8.
>>>
>>>
>>> On Tue, Mar 27, 2012 at 4:05 AM, Maki Watanabe <watanabe.maki@gmail.com>
>>> wrote:
>>>
>>> What version are you using?
>>>
>>> Anyway try nodetool repair & compact.
>>>
>>> maki
>>>
>>> 2012/3/26 Tamar Fraenkel <tamar@tok-media.com>
>>>
>>> Hi!
>>>
>>> I created an Amazon ring using the DataStax image and started filling the db.
>>>
>>> The cluster seems unbalanced.
>>>
>>> nodetool ring returns:
>>>
>>> Address         DC          Rack        Status State   Load       Owns    Token
>>>                                                                           113427455640312821154458202477256070485
>>> 10.34.158.33    us-east     1c          Up     Normal  514.29 KB  33.33%  0
>>> 10.38.175.131   us-east     1c          Up     Normal  1.5 MB     33.33%  56713727820156410577229101238628035242
>>> 10.116.83.10    us-east     1c          Up     Normal  1.5 MB     33.33%  113427455640312821154458202477256070485
>>>
>>> [default@tok] describe;
>>>
>>> Keyspace: tok:
>>>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>>>   Durable Writes: true
>>>     Options: [replication_factor:2]
>>>
>>> [default@tok] describe cluster;
>>>
>>> Cluster Information:
>>>    Snitch: org.apache.cassandra.locator.Ec2Snitch
>>>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>    Schema versions:
>>>         4687d620-7664-11e1-0000-1bcb936807ff: [10.38.175.131, 10.34.158.33, 10.116.83.10]
>>>
>>> Any idea what is the cause?
>>>
>>> I am running similar code on a local ring and it is balanced.
>>>
>>> How can I fix this?
>>>
>>> Thanks,
>>>
>>> --
>>> With kind regards,
>>>
>>> Robin Verlangen
>>> www.robinverlangen.nl
>>>
>>
>>
>
