From: Ken Hancock
Date: Wed, 10 Jun 2015 14:31:44 -0400
Subject: Re: Hundreds of sstables after every Repair
To: user@cassandra.apache.org

Perhaps running sstable2json on some of the small sstables would shed some light. I was going to suggest the anticompaction feature of C* 2.1 (which I'm not familiar with), but you're on 2.0.
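Something along these lines, say, just as a rough sketch (the data directory, keyspace/CF names and generation number below are placeholders for whatever 2.0 wrote on your nodes):

    # dump one of the tiny sstables to JSON to see what rows/columns it actually holds
    sstable2json /var/lib/cassandra/data/<keyspace>/<cf>/<keyspace>-<cf>-jb-1234-Data.db | head -100

    # or, if memory serves, -e just enumerates the partition keys it contains
    sstable2json /var/lib/cassandra/data/<keyspace>/<cf>/<keyspace>-<cf>-jb-1234-Data.db -e

If the tiny sstables all hold a handful of columns for the same wide rows, that would tell you something about what repair is streaming.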
On Tue, Jun 9, 2015 at 11:11 AM, Anuj Wadehra wrote:

> We were facing dropped mutations earlier and we increased flush writers.
> Now there are no dropped mutations in tpstats. To repair the damaged
> vnodes / inconsistent data we executed repair -pr on all nodes. Still, we
> see the same problem.
>
> When we analyze the repair logs we see two strange things:
>
> 1. "Out of sync" ranges for CFs which are not being actively
> written/updated while the repair is going on. When we repaired all data
> with repair -pr on all nodes, why is there still out-of-sync data?
>
> 2. For some CFs, the repair logs show that all ranges are consistent, yet
> we still get so many sstables created during repair. When everything is
> in sync, why does repair create tiny sstables to repair data?
>
> Thanks
> Anuj Wadehra
>
> Sent from Yahoo Mail on Android
>
> ------------------------------
> *From*: "Ken Hancock"
> *Date*: Tue, 9 Jun, 2015 at 8:24 pm
> *Subject*: Re: Hundreds of sstables after every Repair
>
> I think this came up recently in another thread. If you're getting large
> numbers of SSTables after repairs, that means that your nodes are
> diverging from the keys that they're supposed to be holding. Likely
> you're dropping mutations. Do a nodetool tpstats on each of your nodes
> and look at the dropped mutation counters. If you're seeing dropped
> messages, my money is on a non-zero FlushWriter "All time blocked" stat,
> which is causing mutations to be dropped.
>
>
> On Tue, Jun 9, 2015 at 10:35 AM, Anuj Wadehra wrote:
>
>> Any suggestions or comments on this one?
>>
>> Thanks
>> Anuj Wadehra
>>
>> Sent from Yahoo Mail on Android
>>
>> ------------------------------
>> *From*: "Anuj Wadehra"
>> *Date*: Sun, 7 Jun, 2015 at 1:54 am
>> *Subject*: Hundreds of sstables after every Repair
>>
>> Hi,
>>
>> We are using 2.0.3 and vnodes. After every repair -pr operation, 50+
>> tiny sstables (<10 KB) get created, and these sstables never get
>> compacted due to the coldness issue. I have raised
>> https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but
>> I have been told to upgrade. Until we upgrade to the latest 2.0.x we are
>> stuck. Upgrades take time, testing and planning in production systems :(
>>
>> I have observed that even if vnodes are NOT damaged, hundreds of tiny
>> sstables are created during repair for a wide-row CF. This is beyond my
>> understanding. If everything is consistent, and for the entire repair
>> process Cassandra is saying "Endpoints /x.x.x.x and /x.x.x.y are
>> consistent for <CF>", what is the need to create sstables?
>>
>> Is there any alternative to regular major compaction to deal with this
>> situation?
>>
>>
>> Thanks
>> Anuj Wadehra
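To make the tpstats check above concrete, this is roughly what I'd run on each node (a rough sketch; the pool and message names are from 2.0-era output, so adjust the patterns if yours differ):

    # a non-zero "All time blocked" for FlushWriter, or non-zero dropped
    # MUTATION counts, suggest the write path is falling behind
    nodetool tpstats | grep -i -E 'FlushWriter|MUTATION'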