From: Andras Szerdahelyi
To: "user@cassandra.apache.org"
Subject: 33million hinted handoffs from nowhere
Date: Thu, 14 Mar 2013 13:25:51 +0000

Hi list,

I am seeing seemingly uncontrollable and unexplained growth of my HintedHandoff CF on a single node. Unexplained, because no hinted handoffs are being logged on the node; uncontrollable, because cfstats shows 33 million inserts and the sstables have grown to over 10 GB, all within an hour of uptime.


I have done the following to try and reproduce this:

- shut down my cluster
- on all nodes: remove sstables from the HintsColumnFamily data dir
- on all nodes: remove commit logs
- start all nodes but the one that's showing this problem
- nothing is writing to any of the nodes. There are no hinted handoffs going on anywhere
- bring back the node in question last
- a few seconds after boot:

                Column Family: HintsColumnFamily
                SSTable count: 1
                Space used (live): 44946532
                Space used (total): 44946532
                Number of Keys (estimate): 256
                Memtable Columns Count: 17840
                Memtable Data Size: 17569909
                Memtable Switch Count: 2
                Read Count: 0
                Read Latency: NaN ms.
                Write Count: 184836
                Write Latency: 0.668 ms.
                Pending Tasks: 0
                Bloom Filter False Postives: 0
                Bloom Filter False Ratio: 0.00000
                Bloom Filter Space Used: 16
                Compacted row minimum size: 20924301
                Compacted row maximum size: 25109160
                Compacted row mean size: 25109160
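
To track how fast these counters grow between samples, the cfstats text can be parsed into a dictionary. This is a minimal sketch, assuming the output has the "Name: value" layout shown above and has already been captured into a string. The function name parse_cfstats_block and the SAMPLE string are hypothetical and not part of Cassandra.

```python
def parse_cfstats_block(text, column_family):
    """Return {metric name: raw value string} for one column family.

    Assumes cfstats-style text where each line is "Name: value" and each
    section starts with a "Column Family: <name>" line (an assumption
    based on the output quoted in this message).
    """
    stats = {}
    in_block = False
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # blank line or a line without "Name: value"
        key, value = key.strip(), value.strip()
        if key == "Column Family":
            # Entering a new section; collect only the one requested.
            in_block = (value == column_family)
            continue
        if in_block:
            stats[key] = value
    return stats


# A few lines copied from the output above, used as sample input.
SAMPLE = """
    Column Family: HintsColumnFamily
    SSTable count: 1
    Space used (live): 44946532
    Write Count: 184836
    Write Latency: 0.668 ms.
"""

stats = parse_cfstats_block(SAMPLE, "HintsColumnFamily")
write_count = int(stats["Write Count"])
space_live = int(stats["Space used (live)"])
```

Taking two such samples a known number of seconds apart and dividing the difference in write_count by that interval gives a rough write rate for the hints column family.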



