From: "Frisch, Michael"
To: "user@cassandra.apache.org"
Subject: RE: Secondary indexes don't go away after metadata change
Date: Tue, 6 Mar 2012 21:32:37 +0000

Sure enough, it does. Looking back in the logs from when the node was first coming online, I can see it applying migrations and submitting index builds on indexes that are deleted in the newest version of the schema. This may be a silly question, but shouldn't it just apply the most recent version of the schema on a new node? Is there a reason to apply the migrations?

- Mike

From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Tuesday, March 06, 2012 4:14 AM
To: user@cassandra.apache.org
Subject: Re: Secondary indexes don't go away after metadata change

When the new node comes online, the history of schema changes is streamed to it. I've not looked at the code, but it could be that schema migrations are creating indexes that are then deleted from the schema but not from the DB itself.

Does that fit your scenario? When the new node comes online, does it log migrations being applied and then indexes being created?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 10:56 AM, Frisch, Michael wrote:

Thank you very much for your response. It is true that the older, previously existing nodes are not snapshotting the indexes that I had removed. I'll go ahead and just delete those SSTables from the data directory. They may still be around because they were created back when we used 0.8.

The more troubling issue is with adding new nodes to the cluster, though. The new node built indexes for column families that have had all indexes dropped weeks or months in the past. It will also snapshot the index SSTables that it created. The index files are non-empty as well; some are hundreds of megabytes.

All nodes have the same schema, and none list themselves as having the rows indexed. I cannot drop the indexes via the CLI either, because it says that they don't exist. It's quite perplexing.

- Mike

From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Monday, March 05, 2012 3:58 AM
To: user@cassandra.apache.org
Subject: Re: Secondary indexes don't go away after metadata change

The secondary index CFs are marked as no longer required / marked as compacted. Under 1.x they would then be deleted reasonably quickly, and definitely deleted after a restart.

Is there a zero length .Compacted file there?

> Also, when adding a new node to the ring the new node will build indexes for the ones that supposedly don't exist any longer. Is this supposed to happen? Would this have happened if I had deleted the old SSTables from the previously existing nodes?

Check you have a consistent schema using describe cluster in the CLI. And check the schema is what you think it is using show schema.

Another trick is to do a snapshot. Only the files in use are included in the snapshot.

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 2:53 AM, Frisch, Michael wrote:

I have a few column families that I decided to get rid of the secondary indexes on. I see that there aren't any new index SSTables being created, but all of the old ones remain (some from as far back as September). Is it safe to just delete them when the node is offline? Should I run clean-up or scrub?

Also, when adding a new node to the ring the new node will build indexes for the ones that supposedly don't exist any longer. Is this supposed to happen? Would this have happened if I had deleted the old SSTables from the previously existing nodes?

The nodes in question have either been upgraded from v0.8.1 => v1.0.2 (scrubbed at this time) => v1.0.6 or from v1.0.2 => v1.0.6. The secondary index was dropped when the nodes were version 1.0.6. The new node added was also 1.0.6.

- Mike
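
The snapshot trick from the thread ("only the files in use are included in the snapshot") could be turned into a quick check along these lines. This is only a sketch: the function name and both directory arguments are hypothetical, and it assumes a snapshot directory holds entries with the same file names as the live SSTable files. Files flushed after the snapshot was taken would also show up in the result, so the output is a list of candidates to inspect, not a list of files that are safe to delete.

```python
from pathlib import Path


def files_not_in_snapshot(data_dir, snapshot_dir):
    """Return sorted names of regular files in data_dir that are absent from snapshot_dir.

    Assumption (not confirmed by the thread): snapshot entries keep the
    same file names as the live files they were taken from.
    """
    data_dir = Path(data_dir)
    snapshot_dir = Path(snapshot_dir)
    # Names captured by the snapshot, i.e. files the node considered in use.
    in_snapshot = {p.name for p in snapshot_dir.iterdir() if p.is_file()}
    # Names currently present in the data directory (subdirectories are skipped).
    present = {p.name for p in data_dir.iterdir() if p.is_file()}
    return sorted(present - in_snapshot)
```

Calling it with a keyspace data directory and one snapshot directory beneath it returns the leftover file names for manual review.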

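
Aaron's question about a zero length .Compacted file can be checked with a few lines of Python as well. The sketch below is hedged: the function name is hypothetical, and matching on a file name that ends in "Compacted" is an assumption taken from the wording in the thread, not from the Cassandra source.

```python
from pathlib import Path


def zero_length_compacted_markers(data_dir):
    """Return sorted names of empty files in data_dir whose name ends with 'Compacted'.

    Assumption: the marker files mentioned in the thread can be recognised
    by that name suffix and by having a size of zero bytes.
    """
    return sorted(
        p.name
        for p in Path(data_dir).iterdir()
        if p.is_file() and p.name.endswith("Compacted") and p.stat().st_size == 0
    )
```

An empty result for a directory that still holds old index SSTables would suggest those files were never marked as compacted.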