From: Vitalii Tymchyshyn
Date: Wed, 21 Mar 2012 19:09:38 +0200
To: user@cassandra.apache.org
CC: A J
Subject: Re: Max # of CFs
Message-ID: <4F6A0B52.2020400@gmail.com>

There is a forced flusher that kicks in when your heap becomes full. Look
for log lines from GCInspector.

There is a bug that prevents a memtable from being flushed when it contains
only full-key delete mutations; see
https://issues.apache.org/jira/browse/CASSANDRA-3741
For me it happened when we started to move to a new schema, so the old
column families started to receive delete-only operations. One indication
is that GCInspector can't flush anything but the system keyspace.

21.03.12 17:29, A J wrote:
> I have increased index_interval. Will let you know if I see a difference.
>
> My theory is that memtables are not getting flushed. If I manually
> flush them, the heap consumption goes down drastically.
>
> I think when memtable_total_space_in_mb is exceeded not enough
> memtables are getting flushed. There are 5000 memtables (one for each
> CF) but each memtable in itself is small. So flushing of one or two
> memtables by Cassandra is not helping.
>
> Question: How many memtables are flushed when
> memtable_total_space_in_mb is exceeded?
> Any way to flush all memtables when the threshold is reached?
>
> Thanks.
>
> On Wed, Mar 21, 2012 at 8:56 AM, Vitalii Tymchyshyn wrote:
>> Hello.
>>
>> There is also a primary row index. Its space can be controlled with
>> the index_interval setting. I don't know if you can look up its memory
>> usage somewhere. If I were you, I'd take the jmap tool and examine a heap
>> histogram first, a heap dump second.
>>
>> Best regards, Vitalii Tymchyshyn
>>
>> 20.03.12 18:12, A J wrote:
>>
>>> I have both row cache and column cache disabled for all my CFs.
>>>
>>> cfstats says "Bloom Filter Space Used: 1760" per CF. Assuming it is in
>>> bytes, that is a total of about 9MB of bloom filter space for 5K CFs,
>>> which is not a lot.
>>>
>>> On Tue, Mar 20, 2012 at 11:09 AM, Vitalii Tymchyshyn wrote:
>>>> Hello.
>>>>
>>>> From my experience it's unwise to make many column families for the
>>>> same keys because you will have bloom filters and row indexes
>>>> multiplied. If you have 5000, you should expect your heap
>>>> requirements multiplied by the same factor.
>>>> Also check your cache sizes. Default AFAIR is 100000 keys per column
>>>> family.
>>>>
>>>> 20.03.12 16:05, A J wrote:
>>>>
>>>>> ok, the last thread says that from 1.0 onwards, thousands of CFs
>>>>> should not be a problem.
>>>>>
>>>>> But I am finding that all the allocated heap memory is getting consumed.
>>>>> I started with 8GB heap and then on reading
>>>>>
>>>>> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
>>>>> realized that a minimum of 1MB per memtable is used by the
>>>>> per-memtable arena allocator.
>>>>> So with 5K CFs, 5GB will be used just by arena allocators.
>>>>>
>>>>> But even on increasing the heap to 16GB, I am finding that all the
>>>>> heap is getting consumed. Is there a different formula for heap
>>>>> calculation when you have thousands of CFs?
>>>>> Any other configuration that I need to change?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On Mon, Mar 19, 2012 at 10:35 AM, Alain RODRIGUEZ wrote:
>>>>>> This subject was already discussed; this may help you:
>>>>>>
>>>>>> http://markmail.org/message/6dybhww56bxvufzf#query:+page:1+mid:6dybhww56bxvufzf+state:results
>>>>>>
>>>>>> If you still have questions after reading this thread or some others
>>>>>> about the same topic, do not hesitate to ask again,
>>>>>>
>>>>>> Alain
>>>>>>
>>>>>> 2012/3/19 A J
>>>>>>> How many Column Families are one too many for Cassandra?
>>>>>>> I created a db with 5000 CFs (I can go into the reasons later) but the
>>>>>>> latency seems to be very erratic now. Not sure if it is because of the
>>>>>>> number of CFs.
>>>>>>>
>>>>>>> Thanks.
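
For reference, the per-CF arithmetic quoted in this thread can be checked with a small sketch. It uses only the two figures mentioned above (a 1MB minimum arena per memtable and "Bloom Filter Space Used: 1760" per CF, assumed to be bytes). The function name and parameters are hypothetical, and the sketch assumes one active memtable per CF and identical bloom filter sizes. It ignores row index samples, caches and everything else on the heap, so treat the result as a rough lower bound and not a sizing formula.

```python
# Back-of-the-envelope per-CF heap overhead, using figures quoted in the thread.
# Assumptions (not measured): one active memtable per CF, 1 MB minimum arena
# per memtable, and the same bloom filter size for every CF.

MIB = 1024 * 1024


def per_cf_overhead_bytes(cf_count,
                          arena_bytes_per_memtable=MIB,
                          bloom_filter_bytes_per_cf=1760):
    """Return the estimated arena, bloom filter and combined overhead in bytes."""
    arena = cf_count * arena_bytes_per_memtable
    bloom = cf_count * bloom_filter_bytes_per_cf
    return {"arena": arena, "bloom": bloom, "total": arena + bloom}


est = per_cf_overhead_bytes(5000)
print("arena allocators: %.2f GiB" % (est["arena"] / (1024.0 * MIB)))
print("bloom filters:    %.2f MB" % (est["bloom"] / 1e6))
```

With 5000 CFs the arena term dominates by far, which matches the "about 5GB" and "about 9MB" figures above. The heap use beyond that would have to come from something this sketch does not model, such as memtables that are never flushed.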