From user-return-36863-apmail-cassandra-user-archive=cassandra.apache.org@cassandra.apache.org Mon Sep 30 07:37:53 2013 Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E5CDA109C5 for ; Mon, 30 Sep 2013 07:37:53 +0000 (UTC) Received: (qmail 62542 invoked by uid 500); 30 Sep 2013 07:37:39 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 62391 invoked by uid 500); 30 Sep 2013 07:37:32 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 62308 invoked by uid 99); 30 Sep 2013 07:37:28 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 30 Sep 2013 07:37:28 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: local policy includes SPF record at spf.trusted-forwarder.org) Received: from [209.85.220.45] (HELO mail-pa0-f45.google.com) (209.85.220.45) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 30 Sep 2013 07:37:23 +0000 Received: by mail-pa0-f45.google.com with SMTP id rd3so5503010pab.32 for ; Mon, 30 Sep 2013 00:37:03 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:content-type:message-id:mime-version :subject:date:references:to:in-reply-to; bh=FVlQUHEf0kIYaiuicwm4jFXexntaT6+xv7zwT48+gGQ=; b=gGsGw6QfTIIcwD81uUYSNzhd65G35du6fbI9/6V/7G5UlMauw2HEtK8yytMwCKkn2C wQI2nRaKk8YnIVWVPqT9cGbeuxz/HhySHjESy+Rl4g8oUS38y8dTQZgOKYLctHW0DrOL KUuHwv1RokCk4ne/IUmYT7DhYKTPkaN7TIUOVE2C2mIQ1kUEBj+KIk0DvdhgvU1IWjnc nz+sd5W6xTDxVV7UbilT9rrqhiSraj4Kvfh6fN2osj+i/CoLhu+DWKP3OpoSqHgiK12b OQ0eA9mO2mGjBSdyMuwZgMggXcbhtdGa7VGPbmONo9aKZROq3hUURG0Q0VvYX6oEsu49 iM5A== X-Gm-Message-State: ALoCoQnSDxEmEyAOm3Xanc4fpHOKwhK+Zm8j7xx2l2rhLgKqBm0aR3ujxh/fkaTYH2UucmMpV1Ng X-Received: by 10.68.198.68 with SMTP id ja4mr22298238pbc.24.1380526623197; Mon, 30 Sep 2013 00:37:03 -0700 (PDT) Received: from [172.16.1.18] ([203.86.207.101]) by mx.google.com with ESMTPSA id y5sm25034379pbs.18.1969.12.31.16.00.00 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 30 Sep 2013 00:37:02 -0700 (PDT) From: Aaron Morton Content-Type: multipart/alternative; boundary="Apple-Mail=_A08F5E1B-BF84-4BBB-AA4E-B3BAFD5097ED" Message-Id: <74827C78-CEC1-4FD9-91B9-55322D15A6E0@thelastpickle.com> Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\)) Subject: Re: 2.0.1 counter replicate on write error Date: Mon, 30 Sep 2013 20:36:58 +1300 References: <01f201cebb6f$5a5050a0$0ef0f1e0$@struq.com> To: user@cassandra.apache.org In-Reply-To: <01f201cebb6f$5a5050a0$0ef0f1e0$@struq.com> X-Mailer: Apple Mail (2.1510) X-Virus-Checked: Checked by ClamAV on apache.org --Apple-Mail=_A08F5E1B-BF84-4BBB-AA4E-B3BAFD5097ED Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=windows-1252 > ERROR [ReplicateOnWriteStage:19] 2013-09-27 10:17:14,778 = CassandraDaemon.java (line 185) Exception in thread = Thread[ReplicateOnWriteStage:19,5,main] > java.lang.AssertionError: DecoratedKey(-1754949563326053382, = a414b0c07f0547f8a75410555716ced6) !=3D = DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in = /disk3/cassandra/data/struqrealtime/counters/struqrealtime-counters-jb-831= 953-Data.db > at = org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNa= mesIterator.java:114) > at = org.apache.cassandra.db.columniterator.SSTableNamesIterator.(SSTable= NamesIterator.java:62) When reading from an SSTable the position returned from the -Index.db / = KEYS cache pointed to a row in the -Data.db component that was for a = different row.=20 > DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) Was what we were searching for > DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) Is what was found in the data component.=20 The first part is the Token (M3 hash) the second is the key. It looks = like a collision, but it could also be a bug somewhere else.=20 Code in SSTableReader.getPosition() points to = https://issues.apache.org/jira/browse/CASSANDRA-4687 and adds an = assertion that is only trigger if TRACE logging is running. Can you add = to the 4687 ticket and update the thread ?=20 Cheers ----------------- Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com On 27/09/2013, at 10:50 PM, Christopher Wirt = wrote: > Hello, > =20 > I=92ve started to see a slightly worrying error appear in our logs = occasionally. We=92re writing at 400qps per machine and I only see this = appear every 5-10minutes. > =20 > Seems to have started when I switched us to using the hsha thrift = server this morning. We=92ve been running 2.0.1 ran off the sync thrift = server since yesterday without seeing this error. But might not be = related. > =20 > There are some machines in another DC still running 1.2.10. > =20 > Anyone seen this before? Have any insight? > =20 > ERROR [ReplicateOnWriteStage:19] 2013-09-27 10:17:14,778 = CassandraDaemon.java (line 185) Exception in thread = Thread[ReplicateOnWriteStage:19,5,main] > java.lang.AssertionError: DecoratedKey(-1754949563326053382, = a414b0c07f0547f8a75410555716ced6) !=3D = DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in = /disk3/cassandra/data/struqrealtime/counters/struqrealtime-counters-jb-831= 953-Data.db > at = org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNa= mesIterator.java:114) > at = org.apache.cassandra.db.columniterator.SSTableNamesIterator.(SSTable= NamesIterator.java:62) > at = org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(N= amesQueryFilter.java:87) > at = org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryF= ilter.java:62) > at = org.apache.cassandra.db.CollationController.collectAllData(CollationContro= ller.java:249) > at = org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationCo= ntroller.java:53) > at = org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyS= tore.java:1468) > at = org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStor= e.java:1294) > at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332) > at = org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCom= mand.java:55) > at = org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMut= ation.java:100) > at = org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.jav= a:1107) > at = org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StoragePro= xy.java:1897) > at = java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:= 1145) > at = java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java= :615) > at java.lang.Thread.run(Thread.java:724) --Apple-Mail=_A08F5E1B-BF84-4BBB-AA4E-B3BAFD5097ED Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=windows-1252
ERROR = [ReplicateOnWriteStage:19] 2013-09-27 10:17:14,778 CassandraDaemon.java = (line 185) Exception in thread = Thread[ReplicateOnWriteStage:19,5,main]
java.lang.AssertionError: = DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) !=3D = DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in = /disk3/cassandra/data/struqrealtime/counters/struqrealtime-counters-jb-831= 953-Data.db
DecoratedKey(-1754949563326053382, = a414b0c07f0547f8a75410555716ced6)
Is what = was found in the data component. 

The = first part is the Token (M3 hash) the second is the key. It looks like a = collision, but it could also be a bug somewhere = else. 

Code in SSTableReader.getPosition() = points to https://issu= es.apache.org/jira/browse/CASSANDRA-4687 and adds an assertion = that is only trigger if TRACE logging is running. Can you add to the = 4687 ticket and update the thread = ? 

Cheers

http://www.thelastpickle.com

On 27/09/2013, at 10:50 PM, Christopher Wirt <chris.wirt@struq.com> = wrote:

Hello,
I=92ve started to = see a slightly worrying error appear in our logs occasionally. We=92re = writing at 400qps per machine and I only see this appear every = 5-10minutes.
Seems to have = started when I switched us to using the hsha thrift server this morning. = We=92ve been running 2.0.1 ran off the sync thrift server since = yesterday without seeing this error.  But might not be = related.
There are some = machines in another DC still running 1.2.10.
 
 
ERROR = [ReplicateOnWriteStage:19] 2013-09-27 10:17:14,778 CassandraDaemon.java = (line 185) Exception in thread = Thread[ReplicateOnWriteStage:19,5,main]
java.lang.AssertionError: = DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) !=3D = DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in = /disk3/cassandra/data/struqrealtime/counters/struqrealtime-counters-jb-831= 953-Data.db
        at = org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCom= mand.java:55)