From: Mohammed Guller
To: user@cassandra.apache.org
Subject: Cassandra terminates with OutOfMemory (OOM) error
Date: Fri, 21 Jun 2013 17:49:14 +0000

We have a 3-node Cassandra cluster on AWS. The nodes are running Cassandra 1.2.2 and have 8 GB of memory each. We didn't change any of the default heap or GC settings, so each node allocates 1.8 GB of heap space. The rows are wide; each row stores around 260,000 columns. We are reading the data using Astyanax. If our application tries to read 80,000 columns each from 10 or more rows at the same time, some of the nodes run out of heap space and terminate with an OOM error.
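For context, our read path looks roughly like the sketch below. The column family name, row keys, and serializers are placeholders (the real column names are composites, as the stack traces below show), but the 80,000-column slice limit matches what we request per row:

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.model.ColumnList;
    import com.netflix.astyanax.serializers.StringSerializer;
    import com.netflix.astyanax.util.RangeBuilder;

    public class WideRowRead {
        // Placeholder CF definition; the real comparator is a composite type.
        private static final ColumnFamily<String, String> CF =
                ColumnFamily.newColumnFamily("wide_rows",
                        StringSerializer.get(),   // row key serializer
                        StringSerializer.get());  // column name serializer

        // One 80,000-column slice per row, issued back to back for all rows.
        static void readRows(Keyspace keyspace, Iterable<String> rowKeys)
                throws ConnectionException {
            for (String rowKey : rowKeys) {
                ColumnList<String> columns = keyspace
                        .prepareQuery(CF)
                        .getKey(rowKey)
                        .withColumnRange(new RangeBuilder().setLimit(80000).build())
                        .execute()
                        .getResult();
                // process the returned columns...
            }
        }
    }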
Here is the error message:

java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:50)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.split(AbstractCompositeType.java:126)
        at org.apache.cassandra.db.filter.ColumnCounter$GroupByPrefix.count(ColumnCounter.java:96)
        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:164)
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1363)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1220)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1132)
        at org.apache.cassandra.db.Table.getRow(Table.java:355)
        at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

ERROR 02:14:05,351 Exception in thread Thread[Thrift:6,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.lang.Long.toString(Long.java:269)
        at java.lang.Long.toString(Long.java:764)
        at org.apache.cassandra.dht.Murmur3Partitioner$1.toString(Murmur3Partitioner.java:171)
        at org.apache.cassandra.service.StorageService.describeRing(StorageService.java:1068)
        at org.apache.cassandra.thrift.CassandraServer.describe_ring(CassandraServer.java:1192)
        at org.apache.cassandra.thrift.Cassandra$Processor$describe_ring.getResult(Cassandra.java:3766)
        at org.apache.cassandra.thrift.Cassandra$Processor$describe_ring.getResult(Cassandra.java:3754)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

The data in each column is less than 50 bytes. After adding all the column overheads (column name + metadata), it should not be more than 100 bytes per column. So reading 80,000 columns from each of 10 rows means that we are reading 80,000 * 10 * 100 bytes = 80 MB of data. That is large, but not large enough to fill the 1.8 GB heap, so I wonder why the heap is getting full. If the data request is too big to serve in a reasonable amount of time, I would expect Cassandra to return a TimedOutException instead of terminating.

One easy solution is to increase the heap size. However, that means Cassandra can still crash if someone reads 100 rows. I wonder if there is some other Cassandra setting that I can tweak to prevent the OOM error?
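If paging turns out to be the recommended workaround, I assume we could restructure the reads along these lines, using Astyanax's auto-pagination (this continues the sketch above and reuses its CF definition; the 1,000-column page size is just an illustrative number, not something we have tuned):

    import com.netflix.astyanax.query.RowQuery;

    // Same class and CF as in the earlier sketch.
    static void readRowPaged(Keyspace keyspace, String rowKey)
            throws ConnectionException {
        RowQuery<String, String> query = keyspace
                .prepareQuery(CF)
                .getKey(rowKey)
                .autoPaginate(true)
                .withColumnRange(new RangeBuilder().setLimit(1000).build());

        ColumnList<String> page;
        // Each execute() fetches the next page of at most 1,000 columns;
        // an empty page means the row is exhausted.
        while (!(page = query.execute().getResult()).isEmpty()) {
            // process this page of columns...
        }
    }

That would keep roughly one page of columns in memory per row per request, but it is an application-side change rather than the server-side setting I am asking about.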
Thanks,
Mohammed
