From: Javier Sotelo
To: user@cassandra.apache.org
Date: Fri, 8 Jun 2012 09:06:20 -0700
Subject: Re: Cassandra 1.1.1 Fails to Start

Different node, same hardware, now gets the stack overflow error, but I found the part of the stack trace that is more interesting:

        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
        at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
        at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators.size(Iterators.java:129)
        at com.google.common.collect.Sets$3.size(Sets.java:670)
        at com.google.common.collect.Iterables.size(Iterables.java:80)
        at org.apache.cassandra.db.DataTracker.buildIntervalTree(DataTracker.java:557)
        at org.apache.cassandra.db.compaction.CompactionController.<init>(CompactionController.java:79)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:105)
        at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)

Is it time for a JIRA ticket?
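
Editorial sketch: the trace above repeats the same few iterator frames, which is the shape that deeply nested lazy collection views tend to produce. Whether that is the actual root cause is not established in this thread. The Python below is only an analogy, not Cassandra or Guava code, and the names UnionView, view_size, nest, and flat_size are hypothetical. It shows how a recursive walk over nested views can exhaust the call stack while an iterative walk over the same structure does not.

```python
import sys


class UnionView:
    """Hypothetical lazy view over two collections (not a Guava class)."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def size(self):
        # Each nested view adds more call frames before any element is counted.
        return view_size(self.left) + view_size(self.right)


def view_size(collection):
    """Size of a plain list or of a (possibly nested) UnionView."""
    if isinstance(collection, UnionView):
        return collection.size()
    return len(collection)


def nest(layers):
    """Wrap a new view around the previous one, `layers` times."""
    view = []
    for i in range(layers):
        view = UnionView(view, [i])
    return view


def flat_size(collection):
    """Same count, using an explicit stack instead of recursion."""
    total, pending = 0, [collection]
    while pending:
        current = pending.pop()
        if isinstance(current, UnionView):
            pending.extend((current.left, current.right))
        else:
            total += len(current)
    return total


if __name__ == "__main__":
    print(view_size(nest(10)))  # shallow nesting is fine
    deep = nest(sys.getrecursionlimit() * 2)
    try:
        view_size(deep)
    except RecursionError:
        print("recursive walk overflowed; iterative count:", flat_size(deep))
```

If the same pattern applies here, the depth of the nesting rather than the amount of data would be what matters, but that is an assumption to be checked against the Cassandra source.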


On Thu, Jun 7, 2012 at 7:03 AM, Javier Sotelo <javier.a.sotelo@gmail.com> wrote:
nodetool ring showed a load of 34.89GB. I am upgrading from 1.1.0. There is one small keyspace with no compression, about 250MB. The rest is taken by the second keyspace, which uses leveled compaction and Snappy compression.

The blade is an Intel(R) Xeon(R) CPU E5620 @ 2.40GHz with 6GB of RAM.


On Thu, Jun 7, 2012 at 2:52 AM, aaron morton <aaron@thelastpickle.com> wrote:
How much data do you have on the node ?
Was this a previously running system that was upgraded ?

> with disk_access_mode mmap_index_only and mmap I see OOM map failed error on SSTableBatchOpen thread
Do you have the stack trace from the log ?

> ERROR [CompactionExecutor:6] 2012-06-06 20:24:19,772 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.StackOverflowError
>         at com.google.common.collect.Sets$1.iterator(Sets.java:578)
>         at com.google.common.collect.Sets$1.iterator(Sets.java:578)
>         at com.google.common.collect.Sets$1.iterator(Sets.java:578)
Was there more to this stack trace ?
What were the log messages before this error ?


> INFO [main] 2012-06-06 20:17:10,267 AbstractCassandraDaemon.java (line 122) Heap size: 1525415936/1525415936
The JVM only has 1.5 GB of RAM, which is at the lower limit. If you have some data to load, I would not be surprised if it failed to start.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
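
Editorial sketch: the "1.5 G" figure in the reply above comes from the byte counts in the quoted "Heap size" log line. The Python below converts them to MiB and GiB. The helper name heap_sizes_mib and the regular expression are assumptions made for illustration, and the sketch does not say what each of the two numbers means inside Cassandra.

```python
import re

# Matches the "Heap size: <bytes>/<bytes>" field as it appears in the quoted log line.
HEAP_LINE = re.compile(r"Heap size: (\d+)/(\d+)")


def heap_sizes_mib(log_line):
    """Return the two byte counts from a 'Heap size' log line, converted to MiB."""
    match = HEAP_LINE.search(log_line)
    if match is None:
        raise ValueError("no 'Heap size: a/b' field in line")
    return tuple(int(value) / 2**20 for value in match.groups())


line = ("INFO [main] 2012-06-06 20:17:10,267 AbstractCassandraDaemon.java "
        "(line 122) Heap size: 1525415936/1525415936")

first, second = heap_sizes_mib(line)
print(f"{first:.2f} MiB / {second:.2f} MiB")  # 1454.75 MiB / 1454.75 MiB
print(f"{second / 1024:.2f} GiB")             # 1.42 GiB
```

In decimal units the same value is about 1.53 GB, which matches the rounded figure in the reply.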

On 7/06/2012, at 8:41 AM, Javier Sotelo wrote:

> Hi All,
>
> On SuSe Linux blade with 6GB of RAM.
>
> with disk_access_mode mmap_index_only and mmap I see OOM map failed error on SSTableBatchOpen thread. cat /proc/<pid>/maps shows a peak of 53521 right before it dies. vm.max_map_count = 1966080 and /proc/<pid>/limits shows unlimited locked memory.
>
> with disk_access_mode standard, the node does start up but I see the repeated error:
> ERROR [CompactionExecutor:6] 2012-06-06 20:24:19,772 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.StackOverflowError
>         at com.google.common.collect.Sets$1.iterator(Sets.java:578)
>         at com.google.common.collect.Sets$1.iterator(Sets.java:578)
>         at com.google.common.collect.Sets$1.iterator(Sets.java:578)
> ...
>
> I'm not sure the second error is related to the first. I prefer to run with full mmap but I have run out of ideas. Is there anything else I can do to debug this?
>
> Here's startup settings from debug log:
> INFO [main] 2012-06-06 20:17:10,267 AbstractCassandraDaemon.java (line 121) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_31
> INFO [main] 2012-06-06 20:17:10,267 AbstractCassandraDaemon.java (line 122) Heap size: 1525415936/1525415936
> ...
> INFO [main] 2012-06-06 20:17:10,946 CLibrary.java (line 111) JNA mlockall successful
> ...
> INFO [main] 2012-06-06 20:17:11,055 DatabaseDescriptor.java (line 191) DiskAccessMode is standard, indexAccessMode is standard
> INFO [main] 2012-06-06 20:17:11,213 DatabaseDescriptor.java (line 247) Global memtable threshold is enabled at 484MB
> INFO [main] 2012-06-06 20:17:11,499 CacheService.java (line 96) Initializing key cache with capacity of 72 MBs.
> INFO [main] 2012-06-06 20:17:11,509 CacheService.java (line 107) Scheduling key cache save to each 14400 seconds (going to save all keys).
> INFO [main] 2012-06-06 20:17:11,510 CacheService.java (line 121) Initializing row cache with capacity of 0 MBs and provider org.apache.cassandra.cache.SerializingCacheProvider
> INFO [main] 2012-06-06 20:17:11,513 CacheService.java (line 133) Scheduling row cache save to each 0 seconds (going to save all keys).
>
> Thanks In Advance,
> Javier
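
Editorial sketch: the original message compares a peak mapping count taken from /proc/<pid>/maps with vm.max_map_count. The Python below makes that comparison explicit. It assumes the 53521 figure is a count of lines in the maps file (one line per mapped region). The helper names and the two sample maps lines are made up for illustration.

```python
def count_mappings(maps_text):
    """Count mapped regions, assuming one non-empty line per region."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def map_usage(mapping_count, max_map_count):
    """Fraction of the vm.max_map_count limit that is in use."""
    return mapping_count / max_map_count


# Two made-up lines in the layout of /proc/<pid>/maps, only to exercise the counter.
sample = (
    "00400000-0040b000 r-xp 00000000 08:01 1234 /usr/bin/example\n"
    "7f0000000000-7f0000021000 rw-p 00000000 00:00 0\n"
)
print(count_mappings(sample))                # 2

# Figures reported in the message: peak of 53521 mappings, limit of 1966080.
print(f"{map_usage(53521, 1966080):.2%}")    # 2.72%
```

Under that assumption the reported peak is under 3% of the limit, so the mapping-count limit on its own does not obviously explain the "map failed" error.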


