From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Secondary indexes don't go away after metadata change
Date: Tue, 6 Mar 2012 22:14:11 +1300

When the new node comes online, the history of schema changes is streamed to it. I've not looked at the code, but it could be that the schema migrations are creating indexes that are then deleted from the schema but not from the DB itself.

Does that fit your scenario? When the new node comes online, does it log migrations being applied and then indexes being created?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 10:56 AM, Frisch, Michael wrote:

> Thank you very much for your response. It is true that the older, previously existing nodes are not snapshotting the indexes that I had removed. I'll go ahead and just delete those SSTables from the data directory. They may be around still because they were created back when we used 0.8.
>
> The more troubling issue is with adding new nodes to the cluster though. It built indexes for column families that have had all indexes dropped weeks or months in the past. It also will snapshot the index SSTables that it created. The index files are non-empty as well; some are hundreds of megabytes.
>
> All nodes have the same schema, and none list themselves as having the rows indexed. I cannot drop the indexes via the CLI either because it says that they don't exist. It's quite perplexing.
>
> - Mike
>
> From: aaron morton [mailto:aaron@thelastpickle.com]
> Sent: Monday, March 05, 2012 3:58 AM
> To: user@cassandra.apache.org
> Subject: Re: Secondary indexes don't go away after metadata change
>
> The secondary index CFs are marked as no longer required / marked as compacted. Under 1.x they would then be deleted reasonably quickly, and definitely deleted after a restart.
>
> Is there a zero length .Compacted file there?
>
> > Also, when adding a new node to the ring the new node will build indexes for the ones that supposedly don't exist any longer. Is this supposed to happen? Would this have happened if I had deleted the old SSTables from the previously existing nodes?
>
> Check you have a consistent schema using describe cluster in the CLI. And check the schema is what you think it is using show schema.
>
> Another trick is to do a snapshot. Only the files in use are included in the snapshot.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 2/03/2012, at 2:53 AM, Frisch, Michael wrote:
>
> > I have a few column families that I decided to get rid of the secondary indexes on. I see that there aren't any new index SSTables being created, but all of the old ones remain (some from as far back as September). Is it safe to just delete them when the node is offline? Should I run clean-up or scrub?
> >
> > Also, when adding a new node to the ring the new node will build indexes for the ones that supposedly don't exist any longer. Is this supposed to happen? Would this have happened if I had deleted the old SSTables from the previously existing nodes?
> >
> > The nodes in question have either been upgraded from v0.8.1 => v1.0.2 (scrubbed at this time) => v1.0.6 or from v1.0.2 => v1.0.6. The secondary index was dropped when the nodes were version 1.0.6. The new node added was also 1.0.6.
> >
> > - Mike
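
The snapshot trick and the zero-length .Compacted check mentioned in this thread can be roughly scripted. What follows is only a sketch under assumptions, not anything taken from Cassandra itself: both function names are made up for illustration, the two directories are whatever paths you pass in (the snapshot location and on-disk file naming vary by version, so check your own data directory first), and the match on names ending in "Compacted" is a guess based on the wording above.

```python
from pathlib import Path


def files_missing_from_snapshot(data_dir, snapshot_dir):
    """Names of regular files in data_dir with no same-named file in snapshot_dir.

    Hypothetical helper: since a snapshot only includes files in use,
    anything listed here is a candidate leftover to inspect by hand.
    """
    live = {p.name for p in Path(data_dir).iterdir() if p.is_file()}
    snapped = {p.name for p in Path(snapshot_dir).iterdir() if p.is_file()}
    return sorted(live - snapped)


def zero_length_compacted_markers(data_dir):
    """Names of empty files in data_dir whose name ends with 'Compacted'.

    Hypothetical helper: the suffix match is an assumption about how
    the marker files are named on disk.
    """
    return sorted(
        p.name
        for p in Path(data_dir).iterdir()
        if p.is_file()
        and p.name.endswith("Compacted")
        and p.stat().st_size == 0
    )
```

Both functions only list file names and never delete anything, so the output can be reviewed before removing files while the node is offline.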
