Does the size of your rows vary a lot? If so, that could explain why some nodes are "unlucky" and end up with the large rows. You only have 3 nodes, so it can definitely happen.

On Thu, Oct 11, 2012 at 2:24 PM, Alain RODRIGUEZ wrote:

> I understood it that way too, but I explained myself very poorly in my last
> message. Answering fast and not in my mother tongue is sometimes quite hard
> and results in poor explanations.
>
> Sorry about that, and thank you Jim for that clarification.
>
> Alain
>
> 2012/10/11 Jim Cistaro
>
>> I think bad effect #1 needs clarification.
>>
>> This only suspends minor compactions involving that one big file. As new
>> sstables are flushed, they are of the same small size and they will
>> eventually compact together. So that one big file will sit idle (as far as
>> compaction goes) until you eventually compact the new files into
>> ones as big as that major-compacted sstable. That is why those who do a
>> major compaction once normally end up doing them periodically, to cause
>> compaction with the previous large sstable.
>>
>> I hope this helps,
>> jc
>>
>> From: Alain RODRIGUEZ
>> Reply-To:
>> Date: Thu, 11 Oct 2012 17:30:53 +0200
>> To:
>> Subject: Re: unbalanced ring
>>
>> @Tamar
>>
>> Bad effects are:
>>
>> 1 - Disabling your minor compactions for a *long* time (they need
>> SSTables of about the same size to be triggered).
>> 2 - High CPU load during the compaction (which can be quite long).
>>
>> Good effects:
>>
>> 1 - Reduce the size of your data.
>> 2 - Boost read performance.
>>
>> @B. Todd Burruss
>>
>> What can we do if cleanup doesn't remove any data and so doesn't
>> balance the data partition?
>> We both have a well-balanced ring and unbalanced data...
>>
>> Alain
>>
>> 2012/10/11 Viktor Jevdokimov
>>
>>> In our case, we use TTL and need to keep the amount of data as low as
>>> possible to fit in RAM, so data has to be deleted somehow.
>>>
>>> While SSTables are growing, the largest will wait a long time for minor
>>> compaction, so we do a major compaction every night.
>>>
>>> Best regards / Pagarbiai
>>> *Viktor Jevdokimov*
>>> Senior Developer
>>>
>>> Email: Viktor.Jevdokimov@adform.com
>>> Phone: +370 5 212 3063, Mobile: +370 650 19588, Fax +370 5 261 0453
>>> J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
>>> Follow us on Twitter: @adforminsider
>>> What is Adform: watch this short video
>>> *Visit us at* IAB RTB workshop
>>> October 11, 4 pm in Sala Rossa
>>>
>>> Disclaimer: The information contained in this message and attachments is
>>> intended solely for the attention and use of the named addressee and may be
>>> confidential. If you are not the intended recipient, you are reminded that
>>> the information remains the property of the sender. You must not use,
>>> disclose, distribute, copy, print or rely on this e-mail. If you have
>>> received this message in error, please contact the sender immediately and
>>> irrevocably delete this message and any copies.
>>>
>>> *From:* Tamar Fraenkel [mailto:tamar@tok-media.com]
>>> *Sent:* Thursday, October 11, 2012 10:57
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: unbalanced ring
>>>
>>> Hi!
>>> All that left me confused...
>>> Like Alain, I read the DataStax warnings. Now Viktor says it is possible
>>> without bad effects.
>>> 1. Under what conditions would you recommend major compaction?
>>> 2. If I do go that route, would I have to run periodic / nightly
>>> compactions from now on?
>>> 3. What will be the price of #2?
>>> Thanks,
>>>
>>> *Tamar Fraenkel*
>>> Senior Software Engineer, TOK Media
>>>
>>> tamar@tok-media.com
>>> Tel: +972 2 6409736
>>> Mob: +972 54 8356490
>>> Fax: +972 2 5612956
>>>
>>> On Thu, Oct 11, 2012 at 8:41 AM, Viktor Jevdokimov <
>>> Viktor.Jevdokimov@adform.com> wrote:
>>>
>>> To run, or not to run? It all depends on the use case. There are no
>>> problems running major compactions (we do it nightly) in one case; there
>>> could be problems in another. You just need to understand how everything
>>> works.
>>>
>>> Best regards / Pagarbiai
>>> *Viktor Jevdokimov*
>>> Senior Developer
>>>
>>> *From:* Alain RODRIGUEZ [mailto:arodrime@gmail.com]
>>> *Sent:* Thursday, October 11, 2012 09:17
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: unbalanced ring
>>>
>>> Tamar, be careful. DataStax doesn't recommend major compactions in
>>> production environments.
>>>
>>> If I got it right, performing a major compaction will merge all your
>>> SSTables into one big one, substantially improving your read performance, at
>>> least for a while... The problem is that it will disable minor compactions too
>>> (because of the difference in size between this SSTable and the new ones,
>>> if I remember well). So your read performance will decrease until your
>>> other SSTables reach the size of the big one you've created, or until you
>>> run another major compaction, which turns major compaction into a routine
>>> maintenance process like repair is.
>>>
>>> But, knowing that, I still don't know if we both (Tamar and I) shouldn't
>>> run it anyway (in my case it would greatly decrease the size of my data, 133
>>> GB -> 35 GB, and maybe load the cluster evenly...)
>>>
>>> Alain
>>>
>>> 2012/10/10 B. Todd Burruss
>>>
>>> it should not have any other impact except increased usage of system
>>> resources.
>>>
>>> and i suppose cleanup would not have an effect (over normal compaction)
>>> if all nodes contain the same data
>>>
>>> On Wed, Oct 10, 2012 at 12:12 PM, Tamar Fraenkel
>>> wrote:
>>>
>>> Hi!
>>> Apart from being a heavy load (the compact), will it have other effects?
>>> Also, will cleanup help if I have replication factor = number of nodes?
>>> Thanks
>>>
>>> *Tamar Fraenkel*
>>>
>>> On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss
>>> wrote:
>>>
>>> major compaction in production is fine, however it is a heavy operation
>>> on the node and will take I/O and some CPU.
>>>
>>> the only time i have seen this happen is when i have changed the tokens
>>> in the ring, like "nodetool move". cassandra does not auto-delete
>>> data that it doesn't use anymore just in case you want to move the tokens
>>> again or otherwise "undo".
>>>
>>> try "nodetool cleanup"
>>>
>>> On Wed, Oct 10, 2012 at 2:01 AM, Alain RODRIGUEZ
>>> wrote:
>>>
>>> Hi,
>>>
>>> Same thing here:
>>>
>>> 2 nodes, RF = 2. RCL = 1, WCL = 1.
>>> Like Tamar, I never ran a major compaction, and I run repair once a week
>>> on each node.
>>>
>>> 10.59.21.241  eu-west  1b  Up  Normal  133.02 GB  50.00%  0
>>> 10.58.83.109  eu-west  1b  Up  Normal  98.12 GB   50.00%  85070591730234615865843651857942052864
>>>
>>> What phenomena could explain the result above?
>>>
>>> By the way, I copied the data and imported it into a one-node dev
>>> cluster. There I ran a major compaction and the size of my data was
>>> significantly reduced (to about 32 GB instead of 133 GB).
>>>
>>> How is that possible?
>>> Do you think that if I run a major compaction on both nodes it will
>>> balance the load evenly?
>>> Should I run major compaction in production?
>>>
>>> 2012/10/10 Tamar Fraenkel
>>>
>>> Hi!
>>> I am re-posting this, now that I have more data and still an *unbalanced
>>> ring*:
>>>
>>> 3 nodes,
>>> RF=3, RCL=WCL=QUORUM
>>>
>>> Address   DC       Rack  Status  State   Load      Owns    Token
>>>                                                            113427455640312821154458202477256070485
>>> x.x.x.x   us-east  1c    Up      Normal  24.02 GB  33.33%  0
>>> y.y.y.y   us-east  1c    Up      Normal  33.45 GB  33.33%  56713727820156410577229101238628035242
>>> z.z.z.z   us-east  1c    Up      Normal  29.85 GB  33.33%  113427455640312821154458202477256070485
>>>
>>> repair runs weekly.
>>> I don't run nodetool compact, as I read that this may cause the regular
>>> minor compactions not to run, and then I will have to run compact
>>> manually. Is that right?
>>>
>>> Any idea if this means something is wrong, and if so, how to solve it?
>>>
>>> Thanks,
>>> *Tamar Fraenkel*
>>>
>>> On Tue, Mar 27, 2012 at 9:12 AM, Tamar Fraenkel
>>> wrote:
>>>
>>> Thanks, I will wait and see as data accumulates.
>>>
>>> Thanks,
>>> *Tamar Fraenkel*
>>>
>>> On Tue, Mar 27, 2012 at 9:00 AM, R. Verlangen wrote:
>>>
>>> Cassandra is built to store tons and tons of data.
>>> In my opinion, roughly 6 MB per node is not enough data to allow it to
>>> become a fully balanced cluster.
>>>
>>> 2012/3/27 Tamar Fraenkel
>>>
>>> This morning I have
>>>
>>> nodetool ring -h localhost
>>> Address        DC       Rack  Status  State   Load     Owns    Token
>>>                                                                113427455640312821154458202477256070485
>>> 10.34.158.33   us-east  1c    Up      Normal  5.78 MB  33.33%  0
>>> 10.38.175.131  us-east  1c    Up      Normal  7.23 MB  33.33%  56713727820156410577229101238628035242
>>> 10.116.83.10   us-east  1c    Up      Normal  5.02 MB  33.33%  113427455640312821154458202477256070485
>>>
>>> Version is 1.0.8.
>>>
>>> *Tamar Fraenkel*
>>>
>>> On Tue, Mar 27, 2012 at 4:05 AM, Maki Watanabe
>>> wrote:
>>>
>>> What version are you using?
>>> Anyway, try nodetool repair & compact.
>>>
>>> maki
>>>
>>> 2012/3/26 Tamar Fraenkel
>>>
>>> Hi!
>>> I created an Amazon ring using the DataStax image and started filling the db.
>>> The cluster seems unbalanced.
>>>
>>> nodetool ring returns:
>>> Address        DC       Rack  Status  State   Load       Owns    Token
>>>                                                                  113427455640312821154458202477256070485
>>> 10.34.158.33   us-east  1c    Up      Normal  514.29 KB  33.33%  0
>>> 10.38.175.131  us-east  1c    Up      Normal  1.5 MB     33.33%  56713727820156410577229101238628035242
>>> 10.116.83.10   us-east  1c    Up      Normal  1.5 MB     33.33%  113427455640312821154458202477256070485
>>>
>>> [default@tok] describe;
>>> Keyspace: tok:
>>>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>>>   Durable Writes: true
>>>   Options:
>>>   [replication_factor:2]
>>>
>>> [default@tok] describe cluster;
>>> Cluster Information:
>>>   Snitch: org.apache.cassandra.locator.Ec2Snitch
>>>   Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>   Schema versions:
>>>     4687d620-7664-11e1-0000-1bcb936807ff: [10.38.175.131,
>>>     10.34.158.33, 10.116.83.10]
>>>
>>> Any idea what is the cause?
>>> I am running similar code on local ring and it is balanced.
>>>
>>> How can I fix this?
>>>
>>> Thanks,
>>> *Tamar Fraenkel*
>>>
>>> --
>>> With kind regards,
>>>
>>> Robin Verlangen
>>> www.robinverlangen.nl
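
For anyone following Jim's clarification about minor compaction quoted above, here is a minimal Python sketch of size-tiered bucketing. It is a simplification for illustration, not Cassandra's actual code: the function names, the sstable sizes, and the parameter values (a 0.5x to 1.5x size window around the bucket average, and at least 4 sstables per bucket before a minor compaction is considered) are assumptions.

```python
def bucket_sstables(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group sstable sizes into buckets of similar size (simplified sketch)."""
    buckets = []  # each bucket is a list of sstable sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # A size joins a bucket only if it is close to the bucket's average.
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough, so start a new one.
            buckets.append([size])
    return buckets


def compactable_buckets(sizes, min_threshold=4):
    """Return only the buckets that hold enough sstables to be compacted."""
    return [b for b in bucket_sstables(sizes) if len(b) >= min_threshold]


# Hypothetical sizes in MB: one 35 GB sstable left by a major compaction,
# plus four freshly flushed 64 MB sstables.
sizes_mb = [35_000, 64, 64, 64, 64]
candidates = compactable_buckets(sizes_mb)
print(candidates)
```

Under these assumptions the four small sstables form a bucket and can compact together, while the big one sits alone in its own bucket and is never picked. It stays idle until enough other sstables grow to a comparable size, or until another major compaction is run.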