From: Keith Wright <kwright@nanigans.com>
To: user@cassandra.apache.org
Date: Mon, 5 Aug 2013 10:29:02 -0500
Subject: Re: org.apache.cassandra.io.sstable.CorruptSSTableException

Thanks for the feedback. This node actually shut down halfway through bootstrapping the first time, which likely led to this data corruption. We restarted the JVM and it appeared stable until this issue. We decided to stop Cassandra, wipe the node, and restart so that it can bootstrap again to ensure all data is "clean".

From: Ben Coverston <ben.coverston@datastax.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, August 5, 2013 11:23 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: org.apache.cassandra.io.sstable.CorruptSSTableException

Also check your system log for IO errors. Scrub may eliminate the error, but even if it does work you should still run repair. This type of corruption usually happens because of a failed or failing disk/memory.
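
As a rough illustration of the scrub-then-repair sequence suggested here, the sketch below derives the keyspace and column family from the data file path reported in the original message and prints the matching nodetool invocations. It assumes the 1.2-style file name layout <keyspace>-<cf>-<version>-<generation>-<component>; the helper names sstable_identity and recovery_commands are made up for this example, and the commands are only printed, not run.

```python
import os


def sstable_identity(path):
    """Split an SSTable file name into its parts.

    Assumption: the name follows <keyspace>-<cf>-<version>-<generation>-<component>,
    as in the path from the stack trace. Temporary files or other layouts are not handled.
    """
    name = os.path.basename(path)
    keyspace, cf, version, generation, component = name.split("-", 4)
    return {
        "keyspace": keyspace,
        "cf": cf,
        "version": version,
        "generation": int(generation),
        "component": component,
    }


def recovery_commands(path):
    """Build the scrub and repair commands for the column family that owns the file."""
    ident = sstable_identity(path)
    target = [ident["keyspace"], ident["cf"]]
    return [
        ["nodetool", "scrub"] + target,   # rewrite SSTables, skipping unreadable rows
        ["nodetool", "repair"] + target,  # re-sync anything scrub had to drop
    ]


if __name__ == "__main__":
    path = "/data/3/cassandra/data/users/global_user/users-global_user-ib-1550-Data.db"
    for cmd in recovery_commands(path):
        print(" ".join(cmd))
```

Limiting both commands to the affected keyspace and column family keeps the work proportional to the damage; whether that is enough depends on what the system log shows about the underlying disk.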


On Mon, Aug 5, 2013 at 8:44 AM, Jason Wee <peichieh@gmail.com> wrote:
You can try nodetool scrub. If it does not work, try repair, then cleanup. Had this issue a few weeks back, but our version is 1.0.x.


On Mon, Aug 5, 2013 at 8:12 AM, Keith Wright <kwright@nanigans.com> wrote:
Re-sending hoping to get some help. Any ideas would be much appreciated!

From: Keith Wright <kwright@nanigans.com>
Date: Friday, August 2, 2013 3:01 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: org.apache.cassandra.io.sstable.CorruptSSTableException

Hi all,

We just added a node to our cluster (1.2.4, vnodes) and it appears to be running well, except that the new node is not making any progress compacting one of the CFs. The exception below is generated. My assumption is that the only way to handle this is to stop the node, delete the file in question, restart, and run repair.

Thoughts?

org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 1249463589142530 starting at 5604968 would be larger than file /data/3/cassandra/data/users/global_user/users-global_user-ib-1550-Data.db length 14017479
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:168)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:177)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:152)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:139)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:36)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer$1.runMayThrow(ParallelCompactionIterable.java:288)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: dataSize of 1249463589142530 starting at 5604968 would be larger than file /data/3/cassandra/data/users/global_user/users-global_user-ib-1550-Data.db length 14017479
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:123)
        ... 9 more
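
The message in the trace reads like a simple bounds check: a row's declared size, added to its starting offset, must not run past the end of the data file. A minimal sketch of that kind of check, using the numbers from the trace (the function name check_row_bounds is hypothetical, and this is an illustration, not Cassandra's actual code):

```python
def check_row_bounds(row_start, data_size, file_length):
    """Return the row's end offset, or raise if it would fall outside the file.

    All three values are byte counts. This mirrors the wording of the
    exception message only; the real check may differ in detail.
    """
    row_end = row_start + data_size
    if data_size < 0 or row_end > file_length:
        raise ValueError(
            "dataSize of %d starting at %d would be larger than file length %d"
            % (data_size, row_start, file_length)
        )
    return row_end


if __name__ == "__main__":
    try:
        # Values copied from the stack trace above.
        check_row_bounds(5604968, 1249463589142530, 14017479)
    except ValueError as err:
        print(err)
```

With these inputs the declared size is on the order of a petabyte against a file of about 14 MB, which suggests the length field itself was read from corrupted bytes rather than the file merely being cut short.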





--
Ben Coverston
DataStax -- The Apache Cassandra Company