From: Colin Kuo
Date: Mon, 9 Jun 2014 18:29:46 +0800
Subject: Re: Advice on how to handle corruption in system/hints
To: user@cassandra.apache.org
Cc: Jeffery Griffith, Pierre Belanger

Hi Francois,

We're facing the same issue as you. Our approach is to:

1. scrub the corrupted data file
2. repair that column family

Deleting the corrupted files right away is not recommended while the C* instance is running.
This kind of corruption can happen after a bad disk or a power outage.

Thanks,

Colin

Colin Kuo
about.me/ColinKuo

On Mon, Jun 9, 2014 at 6:11 AM, Francois Richard <frichard@yahoo-inc.com> wrote:

> Hi everyone,
>
> We are running some Cassandra clusters (usually a cluster of 5 nodes
> with a replication factor of 3), and at least once per day we see some
> corruption related to a specific sstable in system/hints. (We are using
> Cassandra version 1.2.16 on RHEL 6.5.)
>
> Here is an example of such an exception:
>
> ERROR [CompactionExecutor:1694] 2014-06-08 21:37:33,267 CassandraDaemon.java (line 191) Exception in thread Thread[CompactionExecutor:1694,1,main]
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
>         at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
>         at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
>         at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
>         at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:145)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>         at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
>         at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
>         at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:123)
>         ... 23 more
>  INFO [HintedHandoff:35] 2014-06-08 21:37:33,267 HintedHandOffManager.java (line 296) Started hinted handoff for host: 502a48cd-171b-4e83-a9ad-67f32437353a with IP: /10.210.239.190
> ERROR [HintedHandoff:33] 2014-06-08 21:37:33,267 CassandraDaemon.java (line 191) Exception in thread Thread[HintedHandoff:33,1,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
>         at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:441)
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>         at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:90)
>         at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:508)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>         at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:437)
>         ... 6 more
> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
>         at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
>         at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
>         at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
>         at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:145)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>         at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
>         at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
>         at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>
> Our current filesystem configuration for Cassandra (nothing fancy …):
>
> /dev/sda6  /home/y/var/cassandra/commitlog  ext4  defaults,commit=20,noatime,nobarrier,nodiratime  0 0
> /dev/sda7  /home/y/var/cassandra/data       ext4  defaults,commit=20,data=writeback,noatime,nobarrier,nodiratime  0 0
>
> The workaround we have right now is the following:
>
> 1- Delete the “guilty” sstable, in this case:
>    /home/y/var/cassandra/data/system/hints/system-hints-ic-281*
> 2- Issue a major compaction for system/hints: nodetool compact system hints
> 3- Repeat for all the sstables producing this issue.
>
> My biggest worry here is around the following message:
>
> org.apache.cassandra.io.sstable.CorruptSSTableException:
> java.io.IOException: dataSize of *8224262783474088549* starting at
> 502360510 would be larger than file
> /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
> *504590769*
>
> Any clues on why this is happening?
>
> Thanks,
>
> FR
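One possible clue about the oversized dataSize in the message above, offered as a sketch rather than a diagnosis: assuming the value was read as a big-endian 8-byte long (the byte order Java's DataInput.readLong uses), reinterpreting the number as raw bytes shows what the reader actually found at that offset. The helper name long_as_bytes is hypothetical.

```python
def long_as_bytes(value: int) -> bytes:
    """Return the 8 big-endian bytes of a signed 64-bit integer."""
    return value.to_bytes(8, byteorder="big", signed=True)


# Values copied from the exception text in the original message.
data_size = 8224262783474088549
file_length = 504590769

raw = long_as_bytes(data_size)
print(raw)                      # the bytes that were interpreted as a length
print(data_size > file_length)  # the condition the exception reports
```

The eight bytes come out as printable ASCII (b'r"smtp:e'), which would be consistent with the reader landing on row payload instead of a genuine length field, for example after a truncated or misordered write. That interpretation is an assumption; the bytes alone do not identify the root cause.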
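The power-outage remark in the reply can be related to the mount options quoted in the original message. A minimal sketch, assuming fstab-style lines as they appear there; the function name risky_options and the RISKY table are hypothetical, and the notes paraphrase general ext4 behaviour, not anything confirmed for this cluster.

```python
# Mount options that weaken crash consistency on ext4 (general behaviour).
RISKY = {
    "nobarrier": "write barriers off; a volatile disk cache can lose or reorder writes on power loss",
    "data=writeback": "data is not ordered against journal commits; files may hold stale blocks after a crash",
}


def risky_options(fstab_line: str) -> list:
    """Return the options on an fstab-style line that appear in RISKY."""
    fields = fstab_line.split()
    if len(fields) < 4:
        return []
    # The fourth field holds the comma-separated mount options.
    return [opt for opt in fields[3].split(",") if opt in RISKY]


# Lines copied from the original message.
lines = [
    "/dev/sda6 /home/y/var/cassandra/commitlog ext4 defaults,commit=20,noatime,nobarrier,nodiratime 0 0",
    "/dev/sda7 /home/y/var/cassandra/data ext4 defaults,commit=20,data=writeback,noatime,nobarrier,nodiratime 0 0",
]

for line in lines:
    mount_point = line.split()[1]
    for opt in risky_options(line):
        print(f"{mount_point}: {opt} ({RISKY[opt]})")
```

If the nodes do lose power or reset unexpectedly, these options would be a reasonable place to look first; if they never do, the options are probably unrelated to the daily corruption.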