Date: Mon, 3 Feb 2014 09:51:20 +0100
Subject: Re: Data tombstoned during bulk loading 1.2.10 -> 2.0.3
From: "olek.stasiak@gmail.com"
To: Emaillist for cass users

Hi All,

We've seen a very similar effect after upgrading from 1.1.7 to 2.0 (via 1.2.10). Probably after upgradesstables (but that's only a guess, because we noticed the problem a few weeks later), some rows became tombstoned. They simply disappear from query results. After investigating, I noticed that they are still reachable via sstable2json. Example output for a "non-existent" row:

{"key": "6e6e37716c6d665f6f61695f6463","metadata": {"deletionInfo": {"markedForDeleteAt":2201170739199,"localDeletionTime":0}},"columns": [["DATA","3c6f61695f64633a64(...)",1357677928108]]}
]

If I understand correctly, the row is marked as deleted with a timestamp in the far future, but it is still on disk. Also, localDeletionTime is set to 0, which may mean that this is some kind of internal bug rather than the result of a client error.

So my questions are:
Can upgradesstables really do something like that?
How can we find the cause of such strange Cassandra behaviour?
Is there any way to recover such strangely marked nodes?

This problem affects about 500K of the 14M rows in our database, so the percentage is quite large.

best regards
Aleksander

2013-12-12 Robert Coli :
> On Wed, Dec 11, 2013 at 6:27 AM, Mathijs Vogelzang
> wrote:
>>
>> When I use sstable2json on the sstable on the destination cluster, it has
>> "metadata": {"deletionInfo":
>> {"markedForDeleteAt":1796952039620607,"localDeletionTime":0}}, whereas
>> it doesn't have that in the source sstable.
>> (Yes, this is a timestamp far into the future. All our hosts are
>> properly synced through ntp).
>
>
> This seems like a bug in sstableloader, I would report it on JIRA.
>
>>
>> Naturally, copying the data again doesn't work to fix it, as the
>> tombstone is far in the future. Apart from not having this happen at
>> all, how can it be fixed?
>
>
> Briefly, you'll want to purge that tombstone and then reload the data with a
> reasonable timestamp.
>
> Dealing with rows with data (and tombstones) in the far future is described
> in detail here :
>
> http://thelastpickle.com/blog/2011/12/15/Anatomy-of-a-Cassandra-Partition.html
>
> =Rob
>
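
To get a list of the affected rows, one option is to scan the sstable2json output for row-level tombstones stamped later than the current time. The following is only a minimal sketch: it assumes the dump is a JSON array of row objects shaped like the example in this thread, and that tombstone and column timestamps use the same unit (milliseconds in Aleksander's example, apparently microseconds in Mathijs's). The names `future_tombstones` and `NOW_MS` are hypothetical, not part of any Cassandra tool.

```python
import json


def future_tombstones(rows, now):
    """Yield (key, marked_at, newest_column_ts) for every row whose
    row-level tombstone is stamped later than `now`.

    `now` must be in the same unit as the timestamps in the dump
    (an assumption; Cassandra does not enforce a unit)."""
    for row in rows:
        info = row.get("metadata", {}).get("deletionInfo")
        if not info:
            continue  # no row-level tombstone on this row
        marked_at = info["markedForDeleteAt"]
        if marked_at <= now:
            continue  # ordinary tombstone, not in the future
        # Each column is [name, value, timestamp, ...] in this dump format.
        col_ts = [col[2] for col in row.get("columns", []) if len(col) >= 3]
        yield row["key"], marked_at, max(col_ts, default=None)


# The example row from this thread; the column value is truncated
# in the original message and is kept truncated here.
sample = json.loads("""[
{"key": "6e6e37716c6d665f6f61695f6463","metadata": {"deletionInfo": {"markedForDeleteAt":2201170739199,"localDeletionTime":0}},"columns": [["DATA","3c6f61695f64633a64(...)",1357677928108]]}
]""")

# Roughly the time this message was sent, in milliseconds since the epoch.
NOW_MS = 1391417480721

for key, marked_at, newest in future_tombstones(sample, NOW_MS):
    # Row keys are hex-encoded in the dump; this one decodes as ASCII.
    print(bytes.fromhex(key).decode("ascii"), marked_at, newest)
```

For a real dump, the rows would come from `json.load()` on the file written by sstable2json instead of the inline sample. The check only identifies candidates; purging the tombstone and reloading the data, as Rob describes, is a separate step.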