From: 金剑 <jinjian.1@gmail.com>
Date: Wed, 13 Mar 2013 11:17:40 +0800
Subject: Re: Cassandra OOM, many deletedColumn
To: user@cassandra.apache.org

Thanks for your reply. We will try both of your recommendations. The machine has 8 GB of memory and the JVM heap is 2 GB. In the heap dump, DeletedColumn objects rooted from the readStage thread use 1.4 GB. Do you think we need to increase the JVM heap size?

The configuration for the index column family is:

create column family purge
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 1.0
  and gc_grace = 1800
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy';


Best Regards!

Jian Jin



2013/3/9 aaron morton <aaron@thelastpickle.com>
You need to provide some details of the machine and the JVM configuration. But let's say you need 4 GB to 8 GB for the JVM heap.

If you have many deleted columns, I would say you have a *lot* of garbage in each row. Consider reducing gc_grace seconds so the columns are purged more frequently. Note, however, that columns are only purged when all fragments of the row are part of the minor compaction.
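
To make the purge rule above concrete, here is a minimal Python sketch of it. It is a simplified model under stated assumptions, not Cassandra's actual code: the names `Tombstone` and `purgeable` and the SSTable labels are hypothetical, and the model only encodes the two conditions mentioned above (gc_grace has elapsed, and every SSTable holding a fragment of the row takes part in the compaction).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tombstone:
    """Hypothetical stand-in for a deleted column marker."""
    row_key: str
    column: str
    deleted_at: int  # seconds since epoch when the delete was issued


def purgeable(tombstone, now, gc_grace_seconds,
              sstables_with_row, sstables_in_compaction):
    """Return True if this simplified model would drop the tombstone.

    Both conditions must hold:
    1. gc_grace has elapsed since the delete.
    2. Every SSTable containing a fragment of the row is being compacted.
    """
    grace_expired = now - tombstone.deleted_at > gc_grace_seconds
    whole_row_compacted = set(sstables_with_row) <= set(sstables_in_compaction)
    return grace_expired and whole_row_compacted


t = Tombstone("user42", "img/0001", deleted_at=0)

# gc_grace (1800 s) has elapsed and both fragments are compacted together.
print(purgeable(t, 3600, 1800, {"a", "b"}, {"a", "b"}))

# A third fragment sits in an SSTable outside the compaction: kept.
print(purgeable(t, 3600, 1800, {"a", "b", "c"}, {"a", "b"}))

# gc_grace has not elapsed yet: kept.
print(purgeable(t, 600, 1800, {"a", "b"}, {"a", "b"}))
```

In this model, lowering gc_grace only helps with the first condition. A row whose fragments are spread over SSTables that are rarely compacted together keeps its tombstones regardless.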

If you have a mixed write / delete workload, consider using the Leveled compaction strategy: http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 6/03/2013, at 10:37 PM, Jason Wee <peichieh@gmail.com> wrote:
Hmm, did you manage to take a look using nodetool tpstats? That may give you a further indication.
Jason


On Thu, Mar 7, 2013 at 1:56 PM, 金剑 <jinjian.1@gmail.com> wrote:
Hi,

My version is 1.1.7.

Our use case: we have an index column family that records how many resources are stored for a user. The number might vary from tens to millions.

We provide a feature that lets users delete resources by prefix.

We found that some Cassandra nodes run out of memory (OOM) after some period. The cluster is a cross-datacenter ring.

1. Exceptions in the Cassandra log:

ERROR [Thread-5810] 2013-02-04 05:38:13,882 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-5810,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
    at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(ThreadPoolExecutor.java:758)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-5819] 2013-02-04 05:38:13,888 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-5819,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-36] 2013-02-04 05:38:13,898 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-36,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-3990] 2013-02-04 05:38:13,902 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-3990,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [ACCEPT-/10.139.50.62] AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ACCEPT-/10.139.50.62,5,main]
java.lang.RuntimeException: java.nio.channels.ClosedChannelException
    at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:710)
Caused by: java.nio.channels.ClosedChannelException
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:137)
    at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
    at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:699)
 INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 374) Timed out replaying hints to /23.20.84.240; aborting further deliveries
 INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 392) Finished hinted handoff of 0 rows to endpoint
 INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 296) Started hinted handoff for token: 3

2. The heap dump shows many DeletedColumn objects, rooted from the readStage thread.
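
One possible reading of this finding, offered as an assumption rather than a confirmed diagnosis: a slice read over a row that was mostly deleted by prefix has to walk past every deleted column before it reaches live data, and it holds those markers in memory while it does so. The Python sketch below is a toy model of that behavior, not Cassandra's read path. The function `slice_read`, the row layout, and the column names are all hypothetical.

```python
def slice_read(row, live_limit):
    """Collect columns in order until `live_limit` live columns are found.

    `row` is a list of (name, is_tombstone) pairs sorted by name.
    Tombstones do not count toward the limit, but in this model they are
    still kept in the result, so they occupy memory during the read.
    """
    collected = []
    live = 0
    for name, is_tombstone in row:
        if live >= live_limit:
            break
        collected.append((name, is_tombstone))
        if not is_tombstone:
            live += 1
    return collected


# Hypothetical row: 100,000 columns deleted under prefix "a/",
# followed by 10 live columns under prefix "b/".
row = [(f"a/{i:06d}", True) for i in range(100_000)]
row += [(f"b/{i:06d}", False) for i in range(10)]

result = slice_read(row, live_limit=10)
tombstones_held = sum(1 for _, dead in result if dead)
print(len(result), tombstones_held)  # prints "100010 100000"
```

In this model a request for 10 live columns holds 100,000 deleted markers at once. Several such reads running concurrently could plausibly fill a 2 GB heap, which would match the suggestion above to purge deleted columns sooner.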


Please help: where might the problem be?

Best Regards!

Jian Jin



