From: Anuj Wadehra
To: user@cassandra.apache.org
Date: Fri, 2 Dec 2016 18:19:15 +0000 (UTC)
Subject: Re: Single cluster node restore

Hi Petr,

If data corruption means accidental data deletion via Cassandra commands, you have to restore the entire cluster from the latest snapshots. This may lose data, because there may be valid updates that were made after the snapshot was taken but before the deletion. Restoring a single node from a snapshot won't help, as Cassandra has already replicated the accidental deletes to all replicas.
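
To make the point above concrete, here is a toy last-write-wins model in Python. It is only a sketch, not Cassandra's actual reconciliation code: the Cell class, the reconcile function and the timestamps are all made up for illustration, and a delete is modelled as a cell whose value is None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None stands for a tombstone (delete marker)
    timestamp: int        # write time; the newer write wins


def reconcile(a: Cell, b: Cell) -> Cell:
    """Simplified last-write-wins merge of two replica copies of one cell."""
    return a if a.timestamp >= b.timestamp else b


# Hypothetical history: snapshot at t=10, valid update at t=20,
# accidental delete at t=30.
snapshot = Cell("v1", 10)
update = Cell("v2", 20)
delete = Cell(None, 30)

# Restore only node A from the snapshot; nodes B and C still hold the delete.
node_a, node_b, node_c = snapshot, delete, delete
after_repair = reconcile(reconcile(node_a, node_b), node_c)

# Restore every node from the snapshot instead.
after_cluster_restore = reconcile(reconcile(snapshot, snapshot), snapshot)
```

With only node A restored, the merged result is still the delete, because its timestamp is newer than anything in the snapshot. With every node restored, the row comes back as v1, and the valid update v2 written after the snapshot is lost.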

If data corruption means accidental deletion of some SSTable files from the file system of a node, repair would fix it.

If data corruption means unreadable data due to hardware issues etc., you have two options after replacing the disk: bootstrap the node, or restore a snapshot on the single affected node. If you have a lot of data per node, e.g. 300 GB, you may want to restore from a snapshot and then run repair. Restoring a snapshot on a single node is faster than streaming all the data via bootstrap.
If the node is not recoverable and must be replaced, you should be able to either auto-bootstrap it or restore it from a snapshot with auto-bootstrap set to false. I haven't replaced a dead node from a snapshot myself, but there should not be any issues, as token ranges don't change when you replace a node.
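
A rough Python sketch of the "restore from snapshot, then repair" path on a single node follows. It assumes the usual on-disk layout, where a snapshot lives under the table directory in snapshots/<tag>/. The function names, the paths and the tag are hypothetical, and the nodetool commands are only built as lists here, not executed.

```python
import shutil
from pathlib import Path
from typing import List


def restore_table_from_snapshot(table_dir: Path, tag: str) -> List[Path]:
    """Copy every file of snapshot `tag` back into the live table directory."""
    snapshot_dir = table_dir / "snapshots" / tag
    if not snapshot_dir.is_dir():
        raise FileNotFoundError(f"no snapshot {tag!r} under {table_dir}")
    restored = []
    for src in sorted(snapshot_dir.iterdir()):
        if src.is_file():
            dst = table_dir / src.name
            shutil.copy2(src, dst)  # copy and keep file metadata
            restored.append(dst)
    return restored


def follow_up_commands(keyspace: str, table: str) -> List[List[str]]:
    """nodetool calls to run after the copy: load the SSTables, then repair."""
    return [
        ["nodetool", "refresh", keyspace, table],
        ["nodetool", "repair", keyspace, table],
    ]
```

Each command list can be passed to subprocess.run on the affected node. The repair step is what brings in the writes that arrived after the snapshot was taken.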



Thanks
Anuj

On Tue, 29 Nov, 2016 at 11:08 PM, Petr Malik <pmalik@tesora.com> wrote:


Hi.

I have a question about Cassandra backup-restore strategies.

As far as I understand, Cassandra has been designed to survive hardware failures by relying on data replication.


It seems like people still want backup/restore for the case when somebody accidentally deletes data or the data gets otherwise corrupted.

In that case, restoring all keyspace/table snapshots on all nodes should bring it back.


I am asking because I often read directions on restoring a single node in a cluster. I am just wondering under what circumstances this could be done safely.


Please correct me if I am wrong, but restoring just a single node does not really roll back the data, as the newer (corrupt) data will be served by other replicas and eventually propagated to the restored node. Right?

In fact, by doing so one may end up reintroducing deleted data...


Also, since Cassandra distributes the data throughout the cluster, it is not clear on which node any particular (corrupt) data resides and hence which to restore.


I guess this is a long way of asking whether there is an advantage in trying to restore just a single node in a Cassandra cluster as opposed to, say, replacing the dead node and letting Cassandra handle the replication.


Thanks.
