From: aaron morton
To: user@cassandra.apache.org
Subject: Re: IndexOutOfBoundsException during repair, streaming
Date: Thu, 4 Apr 2013 06:38:34 +0530

> We deleted and recreated those CFs before moving into
> production mode.

We have a wiener.

The comparator is applying the current schema to the byte value read from disk (schema on read), and that byte value describes a value with more than 2 components. It's then trying to apply the current schema so it can type cast the bytes for comparison, but the current schema only defines 2 component types.

Something must have gone wrong in the "deleted" part of your statement above. We do not store schema with data, so this is a problem of changing the schema in an incompatible way with existing data.
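
To make that failure mode concrete, here is a minimal Java sketch of the idea. It is not Cassandra's actual code. It assumes a simplified composite layout (2-byte length, value bytes, one end-of-component byte per component) and uses plain byte-wise comparison as a stand-in for the real typed comparators. The names CompositeSketch, encode, nextComponent and twoComponentSchema are hypothetical and exist only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model of comparing composite column names.
final class CompositeSketch {

    // Assumed layout per component: 2-byte length, value bytes, 1 end-of-component byte.
    static ByteBuffer encode(byte[]... components) {
        int size = 0;
        for (byte[] c : components) {
            size += 2 + c.length + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] c : components) {
            out.putShort((short) c.length);
            out.put(c);
            out.put((byte) 0);
        }
        out.flip();
        return out;
    }

    // Reads one length-prefixed component and skips its end-of-component byte.
    static ByteBuffer nextComponent(ByteBuffer bb) {
        int length = bb.getShort() & 0xFFFF;
        ByteBuffer value = bb.slice();
        value.limit(length);
        bb.position(bb.position() + length + 1);
        return value;
    }

    // The "current schema": two component types, standing in for (Integer, UTF8).
    static List<Comparator<ByteBuffer>> twoComponentSchema() {
        Comparator<ByteBuffer> bytewise = Comparator.naturalOrder();
        return Arrays.asList(bytewise, bytewise);
    }

    // Walks both names component by component, taking the i-th comparator from the schema.
    static int compare(List<Comparator<ByteBuffer>> schema, ByteBuffer o1, ByteBuffer o2) {
        ByteBuffer bb1 = o1.duplicate();
        ByteBuffer bb2 = o2.duplicate();
        int i = 0;
        while (bb1.hasRemaining() && bb2.hasRemaining()) {
            // Throws IndexOutOfBoundsException once i reaches schema.size().
            Comparator<ByteBuffer> comparator = schema.get(i);
            int cmp = comparator.compare(nextComponent(bb1), nextComponent(bb2));
            if (cmp != 0) {
                return cmp;
            }
            i++;
        }
        return Integer.compare(bb1.remaining(), bb2.remaining());
    }

    public static void main(String[] args) {
        byte[] id = ByteBuffer.allocate(4).putInt(42).array();
        byte[] a = "a".getBytes(StandardCharsets.UTF_8);
        byte[] b = "b".getBytes(StandardCharsets.UTF_8);

        // Names written under the current two-component schema compare fine.
        System.out.println(compare(twoComponentSchema(), encode(id, a), encode(id, b)));

        // Names written under an older three-component schema, equal in the first two parts.
        try {
            compare(twoComponentSchema(), encode(id, a, a), encode(id, a, b));
        } catch (IndexOutOfBoundsException e) {
            System.out.println("third component has no comparator: " + e.getClass().getSimpleName());
        }
    }
}
```

The data carries no record of the schema it was written with, so the loop only finds out there is a third component when it asks the two-entry list for a third comparator.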

nodetool scrub is probably your best bet. I've not checked that it handles this specific problem, but in general it will drop rows from SSTables that cannot be read or have some other problem. Best thing to do is snapshot and copy the data from one prod node to a QA box and run some tests.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 2:11 AM, Dane Miller <dane@optimalsocial.com> wrote:

> On Mon, Apr 1, 2013 at 10:19 PM, aaron morton <aaron@thelastpickle.com> wrote:
>> ERROR [Thread-232] 2013-04-01 22:22:21,760 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-232,5,main]
>> java.lang.IndexOutOfBoundsException: index (2) must be less than size (2)
>>         at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
>>         at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
>>         at com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:81)
>>         at org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:96)
>>
>> Something odd in the schema world perhaps.
>>
>> Has the schema changed recently?
>
> No, not recently. But during development we experimented with other
> comparator types for those CFs. More info below.
>
>> Do you have more than one schema in the cluster? (describe cluster in
>> cassandra-cli)

> I don't think so, that command in cassandra-cli shows just a single
> schema in the cluster:
>
> Cluster Information:
>    Snitch: org.apache.cassandra.locator.Ec2Snitch
>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>    Schema versions:
>         126b31ad-3660-3831-9d4f-c6763c9acc97: [ ...ip list... ]


> This error happened on a CF where the composite is: (Integer, UTF8)
>
> I'm a bit stumped about how we could get an index == 2 in that code
> pathway. See here:
> https://github.com/apache/cassandra/blob/cassandra-1.2.1/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java
>
> ...start at line 63 in compare.

> My Java is terrible, but all of our CompositeTypes are composites of
> only two types. Thus the counter 'i' should never get up to 2 (it is
> used to index the individual comparator types within the
> composite), unless the values of the first two components in each of
> the column names being compared are equal, which should be impossible.

> During development we experimented with other comparator types for
> those CFs. We deleted and recreated those CFs before moving into
> production mode. Is there a chance Cassandra 'remembers' these old
> types from a deleted CF that shares a name with an existing CF? Could
> that be causing improper parsing of comparator column names?

> That we enter this code pathway from a section that seems to want to
> clean up tombstones makes me think this is a possibility, that there
> is a tombstone somewhere whose composite column name is causing
> issues.

> Dane
