From: Eric Stevens
Date: Wed, 08 Mar 2017 18:45:14 +0000
Subject: Re: Is it possible to recover a deleted-in-future record?
To: user@cassandra.apache.org, "anujw_2003@yahoo.co.in"

Those future tombstones are going to continue to cause problems on those partitions. If you're still writing to those partitions, you might be losing data in the meantime. It's going to be hard to get the tombstones out of the way so that new writes can begin to happen there (newly written data will be occluded by the existing tombstones). Manual cleanup might be required here, such as sstablefilter, or sstable2json -> clean up the data -> json2sstable. This could get really hairy.

Another option, depending on the kind of tombstones they were (e.g. cell level): my deleting compactor [1] might be able to clean them up on the live cluster via user-defined compaction, if you wrote a convictor for this purpose. But that tool has a gap: it doesn't yet properly recognize cluster- and/or partition-level tombstones (there's an open PR that provides a partial implementation, but I'm not sure it would get you what you need). You can see my talk about that [2].

One careful caveat, though: the deleting compactor was written to _avoid_ tombstones; it hasn't been well tested against data that already contains tombstones. So although time is critical for you here, to avoid ongoing corruption of your data while those bad tombstones remain in the way, I would still fully encourage you to validate whether it satisfies your use case before relying on it.

[1] https://github.com/protectwise/cassandra-util
[2] https://www.youtube.com/watch?v=BhGkSnBZgJA

On Wed, Mar 8, 2017 at 6:06 AM Arvydas Jonusonis <arvydas.jonusonis@gmail.com> wrote:

> That's a good point - a snapshot is certainly in order ASAP, if not
> already done.
>
> One more thing I'd add about "data has to be consolidated from all the
> nodes" (from #2 below):
>
> - EITHER run the sstable2json ops on each node
> - OR if size permits, copy the relevant sstables (containing the
>   desired keys, from the output of nodetool getsstables) locally or onto
>   a new single-node instance, start that instance, and run the commands
>   there
>
> If restoring the sstables from a snapshot, you'll need to do the latter
> anyway.
>
> Arvydas
>
> On Wed, Mar 8, 2017 at 1:55 PM, Anuj Wadehra <anujw_2003@yahoo.co.in>
> wrote:
>
> DISCLAIMER: This is only my personal opinion. Evaluate the situation
> carefully, and if you find the suggestions below useful, follow them at
> your own risk.
>
> If I have understood the problem correctly, the malicious deletes would
> actually lead to deletion of data. I am not sure how everything is
> normal after the deletes?
>
> If the data is critical, you could:
>
> 1. Take a database snapshot immediately so that you don't lose
> information if the delete entries in the sstables are compacted together
> with the original data.
>
> 2. Transfer the snapshot to a suitable place and run a utility such as
> sstable2json to get the keys impacted by the deletes and the original
> data for those keys. The data has to be consolidated from all the nodes.
>
> 3. Devise a strategy to restore the deleted data.
>
> Thanks
> Anuj
>
> On Tue, Mar 7, 2017 at 8:44 AM, Michael Fong wrote:
>
> Hi, all,
>
> We recently encountered an issue in production where some records were
> mysteriously deleted with a timestamp 100+ years from now. Everything is
> normal as of now, and how the deletion happened and the accuracy of the
> system timestamp at that moment are unknown. We were wondering if there
> is a general way to recover the mysteriously-deleted data when the
> timestamp metadata is screwed up.
>
> Thanks in advance,
>
> Regards,
>
> Michael Fong
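[Editor's note] The sstable2json -> clean up the data -> json2sstable step suggested above could be sketched roughly as below. This is a hypothetical illustration only: it assumes the Cassandra 2.x sstable2json dump layout, where each partition is `{"key": ..., "cells": [...]}` and a cell tombstone is a list whose fourth element is the marker `"d"` with the write timestamp (microseconds since epoch) in the third position. The one-day cutoff is an arbitrary choice for "in the future"; verify the exact JSON shape against your Cassandra version before using anything like this.

```python
import time

# Anything written more than a day ahead of "now" is treated as one of the
# bogus future-dated tombstones. Cassandra write times are microseconds
# since the epoch in sstable2json output (assumption; check your version).
FUTURE_CUTOFF_US = int((time.time() + 86400) * 1_000_000)

def is_future_tombstone(cell):
    """True for a cell tombstone whose write timestamp is in the future.

    Assumed 2.x sstable2json cell layout:
      [name, value, timestamp]                              live cell
      [name, value, timestamp, "d", local_deletion_time]    cell tombstone
    """
    return len(cell) >= 4 and cell[3] == "d" and cell[2] > FUTURE_CUTOFF_US

def strip_future_tombstones(partitions):
    """Drop future-dated cell tombstones from a sstable2json-style dump.

    `partitions` is the parsed JSON list produced by sstable2json; the
    cleaned structure can then be re-serialized and fed to json2sstable.
    """
    for part in partitions:
        part["cells"] = [c for c in part.get("cells", [])
                         if not is_future_tombstone(c)]
    return partitions
```

Note that this only handles cell-level tombstones; partition- and range-level deletions live in other parts of the dump and would need their own handling.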