From: Arindam Barua
To: "user@cassandra.apache.org"
Subject: Heap almost full
Date: Wed, 16 Oct 2013 00:21:14 +0000

During performance testing on our 4-node Cassandra 1.1.5 cluster, we are seeing warning logs about the heap being almost full [1]. I'm trying to figure out why, and how to prevent it.

The tests are being run on a Cassandra ring consisting of 4 dedicated boxes with 32 GB of RAM each. The heap size is set to 8 GB as recommended.

All the other relevant settings I know of are the defaults:
- memtable_total_space_in_mb is not set in the yaml, so it should default to 1/3 of the heap size.
- The key cache should be 100 MB at most. I checked the key cache via nodetool info the day after the tests were run, and it reported 4.5 MB in use.
- The row cache is not being used.
- I summed the bloom filter usage reported by nodetool cfstats across all the CFs, and it was under 50 MB.
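
To make the arithmetic behind these settings explicit, here is a rough sketch that adds up the limits listed above and compares them with the heap fraction from the warning in [1]. It assumes "MB" means MiB and that the 8 GB heap is 8 * 1024 MB. The numbers are the configured limits quoted in this message, not measurements, and the variable names are made up for illustration.

```python
# Rough heap budget from the settings listed above (all figures in MB).
# Assumption: the 8 GB heap is treated as 8 * 1024 MB.
HEAP_MB = 8 * 1024

# Upper bounds quoted in the message; these are limits, not measured usage.
budget_mb = {
    "memtables (1/3 of heap, stated default)": HEAP_MB / 3,
    "key cache (stated maximum)": 100,
    "row cache (not used)": 0,
    "bloom filters (stated upper bound)": 50,
}

accounted_mb = sum(budget_mb.values())

# Heap fraction reported by GCInspector in the warning [1].
HEAP_FRACTION_FULL = 0.8287082580489245
reported_mb = HEAP_FRACTION_FULL * HEAP_MB

# Heap usage that the listed settings do not account for.
unaccounted_mb = reported_mb - accounted_mb

for name, mb in budget_mb.items():
    print(f"{name:42s} {mb:8.0f} MB")
print(f"{'accounted for by listed settings':42s} {accounted_mb:8.0f} MB")
print(f"{'reported in use at warning time':42s} {reported_mb:8.0f} MB")
print(f"{'not accounted for':42s} {unaccounted_mb:8.0f} MB")
```

Under these assumptions the listed limits add up to roughly 2.8 GB, while the warning corresponds to roughly 6.6 GB in use, so the settings above leave several GB unexplained.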

The resident size of the Cassandra process according to top is 8.4g even now. I took a heap histogram using jmap, but I'm not sure how to interpret the results usefully [2].

Performance test details:
- The test is write-only, and writes a relatively large amount of data to one CF.
- There is some other traffic, constantly on, that writes smaller amounts of data to many CFs and does some reads.

The total number of CFs is 114, but quite a few of them are not used.

Thanks,
Arindam

[1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1: WARN GCInspector.java (line 145) Heap is 0.8287082580489245 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically

[2] Object Histogram:

 num   #instances     #bytes  Class description
--------------------------------------------------------------------------
   1:      152855   86035312  int[]
   2:       13395   45388008  long[]
   3:       49517    9712000  java.lang.Object[]
   4:      120094    8415560  char[]
   5:      145106    6965088  java.nio.HeapByteBuffer
   6:       40525    5891040  * ConstMethodKlass
   7:      231258    5550192  java.lang.Long
   8:       40525    5521592  * MethodKlass
   9:      134574    5382960  java.math.BigInteger
  10:       36692    4403040  java.net.SocksSocketImpl
  11:        3741    4385048  * ConstantPoolKlass
  12:       63875    3538128  * SymbolKlass
  13:      104048    3329536  java.lang.String
  14:      132636    3183264  org.apache.cassandra.db.DecoratedKey
  15:       97466    3118912  java.util.concurrent.ConcurrentHashMap$HashEntry
  16:       97216    3110912  com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node
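
One way to start reading the histogram in [2] is to total the #bytes column. The sketch below parses the 16 rows pasted above and sums them. The `parse_rows` helper is a made-up name for illustration, and it assumes each row has the shape `num: instances bytes class`, as in the paste. Only the rows quoted in this message are included, so the total says nothing about classes further down the list.

```python
# The 16 histogram rows pasted in [2]: "num: #instances #bytes class".
HISTOGRAM = """\
1: 152855 86035312 int[]
2: 13395 45388008 long[]
3: 49517 9712000 java.lang.Object[]
4: 120094 8415560 char[]
5: 145106 6965088 java.nio.HeapByteBuffer
6: 40525 5891040 * ConstMethodKlass
7: 231258 5550192 java.lang.Long
8: 40525 5521592 * MethodKlass
9: 134574 5382960 java.math.BigInteger
10: 36692 4403040 java.net.SocksSocketImpl
11: 3741 4385048 * ConstantPoolKlass
12: 63875 3538128 * SymbolKlass
13: 104048 3329536 java.lang.String
14: 132636 3183264 org.apache.cassandra.db.DecoratedKey
15: 97466 3118912 java.util.concurrent.ConcurrentHashMap$HashEntry
16: 97216 3110912 com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node
"""


def parse_rows(text):
    """Return (class_name, instances, bytes) tuples for each histogram row."""
    rows = []
    for line in text.splitlines():
        # Split into at most 4 fields so class names containing spaces survive.
        parts = line.split(None, 3)
        if len(parts) < 4 or not parts[0].rstrip(":").isdigit():
            continue  # skip headers, separators, and blank lines
        rows.append((parts[3].strip(), int(parts[1]), int(parts[2])))
    return rows


rows = parse_rows(HISTOGRAM)
total_bytes = sum(nbytes for _, _, nbytes in rows)

# Print each class with its share of the listed total, largest first.
for name, instances, nbytes in sorted(rows, key=lambda r: r[2], reverse=True):
    print(f"{nbytes / total_bytes:6.1%}  {nbytes:>10d}  {name}")
print(f"total of listed rows: {total_bytes} bytes")
```

The listed rows total 203930592 bytes, about 204 MB, with int[] and long[] making up most of it. That is well below the 8.4g resident size reported by top.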

 
