
From: Jonathan Ellis
Date: Sun, 17 Jul 2011 15:31:56 -0500
Subject: Re: Cassandra OOM on repair.
To: user@cassandra.apache.org

Can't think of any.

On Sun, Jul 17, 2011 at 1:27 PM, Andrey Stepachev wrote:
> Looks like the problem is in this code:
>     public IndexSummary(long expectedKeys)
>     {
>         long expectedEntries = expectedKeys / DatabaseDescriptor.getIndexInterval();
>         if (expectedEntries > Integer.MAX_VALUE)
>             // TODO: that's a _lot_ of keys, or a very low interval
>             throw new RuntimeException("Cannot use index_interval of " + DatabaseDescriptor.getIndexInterval() + " with " + expectedKeys + " (expected) keys.");
>         indexPositions = new ArrayList((int)expectedEntries);
>     }
> I have too many keys and too small an index interval.
> To fix this, I can:
> 1) reduce the number of keys - rewrite the app and sacrifice balance
> 2) increase index_interval - hurts other column families
> A question:
> Are there any drawbacks to using a different indexInterval for each column
> family in a keyspace? (suppose I write a patch)
>
> 2011/7/15 Andrey Stepachev
>>
>> Looks like key indexes eat all the memory:
>> http://paste.kde.org/97213/
>>
>> 2011/7/15 Andrey Stepachev
>>>
>>> UPDATE:
>>> I found that
>>> a) with at least 10G, Cassandra survives.
>>> b) I have ~1000 sstables
>>> c) CompactionManager uses PrecompactedRows instead of LazilyCompactedRow
>>> So, I have a question:
>>> a) if a row is bigger than 64 MB before compaction, why is it compacted
>>> in memory?
>>> b) if it is smaller, what eats so much memory?
>>>
>>> 2011/7/15 Andrey Stepachev
>>>>
>>>> Hi all.
>>>> Cassandra constantly OOMs on repair or compaction. Increasing memory
>>>> doesn't help (6G).
>>>> I can give it more, but I think that this is not a regular situation.
>>>> Cluster has 4 nodes. RF=3.
>>>> Cassandra version 0.8.1
>>>> Ring looks like this:
>>>> Address         DC          Rack   Status State   Load       Owns    Token
>>>>                                                                   127605887595351923798765477786913079296
>>>> xxx.xxx.xxx.66  datacenter1 rack1  Up     Normal  176.96 GB  25.00%  0
>>>> xxx.xxx.xxx.69  datacenter1 rack1  Up     Normal  178.19 GB  25.00%  42535295865117307932921825928971026432
>>>> xxx.xxx.xxx.67  datacenter1 rack1  Up     Normal  178.26 GB  25.00%  85070591730234615865843651857942052864
>>>> xxx.xxx.xxx.68  datacenter1 rack1  Up     Normal  175.2 GB   25.00%  127605887595351923798765477786913079296
>>>> About schema:
>>>> I have big rows (>100k, up to several millions). But as far as I know,
>>>> that is normal for Cassandra.
>>>> Everything works relatively well until I start long-running
>>>> pre-production tests. I load data and after a while (~4 hours) the
>>>> cluster begins to time out and then some nodes die with OOM.
>>>> My app retries sending, so after a short period all nodes go down.
>>>> Very nasty.
>>>> But now I can OOM nodes by simply calling nodetool repair.
>>>> In the logs http://paste.kde.org/96811/ it is clear how the heap
>>>> rocket-jumps to the upper limit.
>>>> cfstats shows: http://paste.kde.org/96817/
>>>> config is: http://paste.kde.org/96823/
>>>> A question is: does anybody know what this means? Why does Cassandra
>>>> try to load something big into memory at once?
>>>> A.
>>
>
>

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
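
For readers following the index_interval discussion: the quoted IndexSummary constructor sizes its list as expectedKeys / index_interval, so the number of in-memory index entries shrinks in direct proportion to the interval. The sketch below only reproduces that arithmetic outside Cassandra. The class name IndexSummaryEstimate, its method names, the key count, the intervals tried, and the bytes-per-entry figure are all hypothetical illustrations, not Cassandra APIs or measured values.

```java
/**
 * Back-of-the-envelope sizing for the index summary arithmetic quoted in the thread.
 * Everything here is a hypothetical sketch; none of it is Cassandra code.
 */
final class IndexSummaryEstimate {
    private IndexSummaryEstimate() {}

    /** Same integer division as the quoted constructor: one sampled entry per indexInterval keys. */
    static long expectedEntries(long expectedKeys, int indexInterval) {
        if (indexInterval <= 0) {
            throw new IllegalArgumentException("indexInterval must be positive");
        }
        return expectedKeys / indexInterval;
    }

    /** The quoted constructor throws when the entry count exceeds Integer.MAX_VALUE. */
    static boolean fitsInArrayList(long entries) {
        return entries <= Integer.MAX_VALUE;
    }

    /** Rough heap estimate; assumedBytesPerEntry is a guess the caller must supply. */
    static long estimatedBytes(long entries, long assumedBytesPerEntry) {
        return Math.multiplyExact(entries, assumedBytesPerEntry);
    }

    public static void main(String[] args) {
        long keys = 1_000_000_000L;      // hypothetical key count
        long assumedBytesPerEntry = 64L; // hypothetical per-entry cost, not measured
        for (int interval : new int[] {128, 512, 2048}) {
            long entries = expectedEntries(keys, interval);
            System.out.printf("interval=%d entries=%d fits=%b approxBytes=%d%n",
                    interval, entries, fitsInArrayList(entries),
                    estimatedBytes(entries, assumedBytesPerEntry));
        }
    }
}
```

Raising the interval by some factor cuts the entry count by the same factor, which is why option 2 in the thread relieves heap pressure. The usual cost is a longer scan between sampled index entries on each read.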