From: "Hiller, Dean"
To: "user@cassandra.apache.org", Wei Zhu
Date: Fri, 8 Mar 2013 11:50:06 -0700
Subject: Re: Size Tiered -> Leveled Compaction

+1 (I would love to know this info).

Dean

From: Wei Zhu
Reply-To: "user@cassandra.apache.org", Wei Zhu
Date: Friday, March 8, 2013 11:11 AM
To: "user@cassandra.apache.org"
Subject: Re: Size Tiered -> Leveled Compaction

I have been wondering the same thing.

We started with the default 5M, and the compaction after repair took too long on a 200G node, so we increased the size to 10M, somewhat arbitrarily, since there is not much documentation around it. Our tech ops team still thinks there are too many files in one directory. To meet their guideline (I don't remember the exact number, but something in the range of 50K files), we will need to increase the size to around 50M. I think the latency of opening one file is not affected much by the number of files in one directory on a modern file system, but "ls" and other operations suffer.

Anyway, I asked about the side effects of bigger SSTables on IRC, and someone mentioned that during a read, C* reads the whole SSTable from disk in order to access the row, which causes more disk IO than with smaller SSTables. I don't know enough about the internals of Cassandra to say whether that is the case. If it is (with a question mark), is the SSTable or the row kept in memory? I hope someone can confirm the theory here. Otherwise I will have to dig into the source code to find out.

Another concern is repair: does it stream the whole SSTable or only part of it when a mismatch is detected? I have seen both claimed; can someone please confirm that as well?

The last thing is the effectiveness of parallel LCS on 1.2. It takes quite some time for compaction to finish after repair with LCS on 1.1.X. Both CPU and disk utilization are low during the compaction, which means LCS doesn't fully utilize the available resources.
It would make life easier if the issue is addressed in 1.2.

Bottom line is that there is not much documentation, guidance, or success stories around LCS, although it sounds beautiful on paper.

Thanks.
-Wei

________________________________
From: Alain RODRIGUEZ
To: user@cassandra.apache.org
Cc: Wei Zhu
Sent: Friday, March 8, 2013 1:25 AM
Subject: Re: Size Tiered -> Leveled Compaction

I'm still wondering how to choose the size of the SSTable under LCS. The default is 5MB, people used to configure it to 10MB, and now you configure it at 128MB. What are the benefits or drawbacks of a very small size (let's say 5MB) vs. a big size (like 128MB)?

Alain

2013/3/8 Al Tobey

We saw exactly the same thing as Wei Zhu: more than 100k tables in a directory causing all kinds of issues. We're running 128MiB SSTables with LCS and have disabled compaction throttling. 128MiB was chosen to get file counts under control and reduce the number of files C* has to manage and search. I just looked, and a ~250GiB node is using about 10,000 files, which is quite manageable. This configuration is running smoothly in production under mixed read/write load.

We're on RAID0 across 6 15k drives per machine. When we migrated data to this cluster we were pushing well over 26k/s+ inserts with CL_QUORUM. With compaction throttling enabled at any rate it just couldn't keep up. With throttling off, it runs smoothly and does not appear to have an impact on our applications, so we always leave it off, even in EC2. An 8GiB heap is too small for this config on 1.1. YMMV.

-Al Tobey

On Thu, Feb 14, 2013 at 12:51 PM, Wei Zhu wrote:

I haven't tried to switch compaction strategy. We started with LCS.

For us, after massive data imports (5000 writes/second for 6 days), the first repair is painful since there is quite some data inconsistency. For 150G nodes, repair brought in about 30G and created thousands of pending compactions. It took almost a day to clear those.
Just be prepared: LCS is really slow in 1.1.X. System performance degrades during that time since reads could go to more SSTables; we see 20 SSTable lookups for one read. (We tried everything we could and couldn't speed it up. I think it's single threaded, and it's not recommended to turn on multithreaded compaction. We even tried that; it didn't help.) There is parallel LCS in 1.2, which is supposed to alleviate the pain. We haven't upgraded yet; hope it works :)

http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2

Since our cluster is not write intensive (only 100 writes/second), I don't see any pending compactions during regular operation.

One thing worth mentioning is the size of the SSTable. The default is 5M, which is kind of small for a 200G (all in one CF) data set, and we are on SSD. It results in more than 150K files in one directory. (200G/5M = 40K SSTables, and each SSTable creates 4 files on disk.) You might want to watch that and decide on the SSTable size.

By the way, there is no concept of major compaction for LCS. Just for fun, you can look at a file called $CFName.json in your data directory; it tells you the SSTable distribution among the different levels.

-Wei

________________________________
From: Charles Brophy
To: user@cassandra.apache.org
Sent: Thursday, February 14, 2013 8:29 AM
Subject: Re: Size Tiered -> Leveled Compaction

I second these questions: we've been looking into changing some of our CFs to use leveled compaction as well. If anybody here has the wisdom to answer them, it would be a wonderful help.

Thanks
Charles

On Wed, Feb 13, 2013 at 7:50 AM, Mike wrote:

Hello,

I'm investigating the transition of some of our column families from Size Tiered -> Leveled Compaction. I believe we have some high-read-load column families that would benefit tremendously.

I've stood up a test DB node to investigate the transition.
I successfully altered the column family, and I immediately noticed a large number (1000+) of pending compaction tasks become available, but no compactions get executed.

I tried running "nodetool upgradesstables" on the column family, and the compaction tasks don't move.

I also notice no changes to the size and distribution of the existing SSTables.

I then ran a major compaction on the column family. All pending compaction tasks got run, and the SSTables have a distribution that I would expect from LeveledCompaction (lots and lots of 10MB files).

Couple of questions:

1) Is a major compaction required to transition from size-tiered to leveled compaction?
2) Are major compactions as much of a concern for LeveledCompaction as they are for Size Tiered?

All the documentation I found concerning transitioning from Size Tiered to Leveled compaction discusses the alter table CQL command, but I haven't found much on what else needs to be done after the schema change.

I did these tests with Cassandra 1.1.9.

Thanks,
-Mike
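
For anyone weighing SSTable sizes, the back-of-the-envelope file count from Wei's message above (200G/5M = 40K SSTables, 4 files each) can be sketched roughly as follows. This is only an estimate under stated assumptions: the helper name estimate_sstable_files is made up for illustration, the default of 4 files per SSTable is the figure quoted in this thread and may differ between Cassandra versions, and the calculation ignores temporary files and anything else that lives in the data directory.

```python
# Rough estimate only; not a Cassandra API. Names here are hypothetical.
MiB = 1024 ** 2
GiB = 1024 ** 3


def estimate_sstable_files(data_bytes, sstable_bytes, files_per_sstable=4):
    """Estimate on-disk file count for one column family under LCS.

    files_per_sstable=4 is the figure quoted in the thread (an assumption;
    the real number of component files depends on the Cassandra version).
    """
    if data_bytes < 0 or sstable_bytes <= 0 or files_per_sstable <= 0:
        raise ValueError("sizes and files_per_sstable must be positive")
    # Integer ceiling division: a partially filled SSTable still counts.
    sstables = (data_bytes + sstable_bytes - 1) // sstable_bytes
    return sstables * files_per_sstable


if __name__ == "__main__":
    # Compare the SSTable sizes mentioned in the thread for a 200 GiB node.
    for size_mib in (5, 10, 50, 128):
        files = estimate_sstable_files(200 * GiB, size_mib * MiB)
        print(f"{size_mib:>4} MiB SSTables -> ~{files} files")
```

With binary units, a 200 GiB node at 5 MiB comes out at about 164K files, in line with the "more than 150K" Wei reports. A 250 GiB node at 128 MiB comes out at 8,000, the same order of magnitude as the roughly 10,000 files Al mentions.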