Subject: Re: Strange corrupt sstable
From: Daniel Doubleday
Date: Mon, 2 May 2011 14:39:40 +0200
To: user@cassandra.apache.org

Just for the record:

The problem had nothing to do with bad memory. After some more digging it turned out that, due to a bug, we wrote invalid UTF-8 sequences as row keys. In 0.6 the key tokens are constructed from string-decoded bytes. This does not happen anymore in 0.7 files. So what apparently happened during compaction was:

1. read the sstable and generate rows in string-based order
2. write the new file based on that order
3. read the compacted file based on raw-bytes order -> crash

That bug never made it to production so we are fine.

On Apr 29, 2011, at 10:32 AM, Daniel Doubleday wrote:

> Bad == Broken
>
> That means you cannot rely on 1 == 1. In such a scenario everything can happen, including data loss.
> That's why you want ECC mem on production servers. Our cheapo dev boxes don't.
>
> On Apr 28, 2011, at 7:46 PM, mcasandra wrote:
>
>> What do you mean by Bad memory? Is it less heap size, OOM issues or something
>> else? What happens in such scenario, is there a data loss?
>>
>> Sorry for many questions just trying to understand since data is critical
>> after all :)
>>
>> --
>> View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Strange-corrupt-sstable-tp6314052p6314218.html
>> Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.
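
For anyone who hits the same thing, here is a rough sketch of the three compaction steps from the top of this mail. It is not Cassandra code. The function names are made up, MD5 is only a stand-in for whatever the partitioner does, and I am assuming that the 0.6 path decodes the key to a string (replacing malformed bytes with U+FFFD) before hashing while the 0.7 path hashes the raw bytes.

```python
import hashlib


def token_from_string(key: bytes) -> bytes:
    """Assumed 0.6-style token: decode the key to text first, then hash."""
    # Malformed UTF-8 is replaced with U+FFFD, so the original bytes are lost.
    as_text = key.decode("utf-8", errors="replace")
    return hashlib.md5(as_text.encode("utf-8")).digest()


def token_from_bytes(key: bytes) -> bytes:
    """Assumed 0.7-style token: hash the raw key bytes untouched."""
    return hashlib.md5(key).digest()


def first_out_of_order(keys, token_fn):
    """Index of the first key whose token sorts before its predecessor's, else None."""
    tokens = [token_fn(k) for k in keys]
    for i in range(1, len(tokens)):
        if tokens[i] < tokens[i - 1]:
            return i
    return None


valid_key = "row-1".encode("utf-8")
invalid_key = b"row-\xff\xfe"  # not a valid UTF-8 sequence

# Steps 1 and 2: order the rows by the string-derived token and "write" them.
keys = [valid_key, invalid_key, b"row-2", b"row-\xc0"]
written = sorted(keys, key=token_from_string)

# Step 3: a reader that expects raw-byte token order checks the same file.
print(first_out_of_order(written, token_from_string))  # consistent with the writer
print(first_out_of_order(written, token_from_bytes))   # an index here is the "crash"
```

For valid UTF-8 keys both token functions agree, so the file looks sorted to either reader. Only the invalid keys get a different token depending on which path computed it, and that is where the written order and the expected order can disagree.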