From: hive13 Wong
To: user@cassandra.apache.org
Date: Tue, 1 Jun 2010 21:46:59 +0800
Subject: Re: Skipping corrupted rows when doing compaction

Thanks, Jonathan.

I'm using 0.6.1.
Another thing: I get lots of zero-sized tmp files in the data directory.
When I restart Cassandra, those tmp files are deleted, and then new empty tmp files gradually appear again, while system.log still shows lots of UTFDataFormatException errors.

Will using 0.6.2 with DiskAccessMode=standard skip the corrupted rows?

On Tue, Jun 1, 2010 at 9:08 PM, Jonathan Ellis wrote:
> If you're on a version earlier than 0.6.1, you might be running into
> https://issues.apache.org/jira/browse/CASSANDRA-866. Upgrading will
> fix it; you don't need to reload data.
>
> It's also worth trying 0.6.2 and DiskAccessMode=standard, in case
> you've found another similar bug.
>
> On Tue, Jun 1, 2010 at 7:37 AM, hive13 Wong wrote:
> > Hi,
> > Is there a way to skip corrupted rows when doing compaction?
> > We are currently deploying 2 nodes with replicationfactor=2, but one node
> > reports lots of exceptions like java.io.UTFDataFormatException: malformed
> > input around byte 72. My guess is that some, but not all, of the data in
> > the SSTable is corrupted, because I can still read data out of the related
> > CF except for some keys.
> > It's OK for us to throw away a small portion of the data to get the nodes
> > working normally.
> > If there is no way to skip corrupted rows, can I just wipe all the data
> > on the corrupted node and then add it back to the cluster?
> > Will it automatically migrate data from the other node?
> > Thanks.
> > Ivan
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
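A minimal sketch for quantifying the two symptoms described in this message: zero-sized tmp files in the data directory and UTFDataFormatException entries in system.log. It is not a Cassandra tool. The function names are hypothetical, and it assumes that temporary files can be recognized by the substring "tmp" in the file name, which may not match every Cassandra version.

```python
import os


def find_empty_tmp_files(data_dir):
    """Return sorted paths of zero-byte files whose name contains 'tmp'.

    Assumption: temporary SSTable files carry 'tmp' in their file name.
    """
    hits = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            if "tmp" in name and os.path.getsize(path) == 0:
                hits.append(path)
    return sorted(hits)


def count_exception_lines(log_path, exception="UTFDataFormatException"):
    """Count the log lines that mention the given exception name."""
    count = 0
    # errors="replace" keeps the scan going if the log has undecodable bytes.
    with open(log_path, "r", errors="replace") as fh:
        for line in fh:
            if exception in line:
                count += 1
    return count
```

Running both functions before and after a restart or an upgrade gives a rough way to see whether the empty files and the exceptions keep accumulating. The paths to pass in depend on the local installation.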