Date: Thu, 2 Nov 2017 09:16:00 +0000 (UTC)
From: "Amit Kabra (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-19106) Backup self validation for its correctness.


    [ https://issues.apache.org/jira/browse/HBASE-19106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235440#comment-16235440 ]

Amit Kabra commented on HBASE-19106:
------------------------------------

Thanks for adding the release version, but this is a very important feature IMO.

Because HBase is so dynamic, with compactions, splits, merges, flushes, etc. happening all the time, scenarios can arise in production where a backup misses rows, cells, etc., or where some part of it (HFiles, manifests, etc.) gets corrupted or deleted due to cluster issues. It would be great to know that the data we just backed up is restorable and that the backup was taken correctly. Backups are needed only in critical scenarios, which is exactly why validating them is important IMO.

> Backup self validation for its correctness.
> -------------------------------------------
>
>                 Key: HBASE-19106
>                 URL: https://issues.apache.org/jira/browse/HBASE-19106
>             Project: HBase
>          Issue Type: Improvement
>          Components: backup&restore
>            Reporter: Amit Kabra
>            Priority: Major
>             Fix For: 2.1.0
>
>
> Backups are critical, and if they don't work when we need them at restore time, then they are not useful. We should run a sanity test for each backup job to confirm that it is restorable and hence can be trusted.
> A self-validation feature can be added to backups for this: whenever a backup is run, once it finishes it triggers a validation job that does a sample restoration of the backed-up data and makes sure that it compares well with the actual data.
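
To make the proposed check concrete, here is a minimal sketch of the "sample restore and compare" step. It is not HBase code and uses no HBase API: tables are modeled as plain Python dicts (row key -> {column: value}), and the names `sample_row_keys` and `validate_backup` are hypothetical, chosen only for illustration. It also assumes the "actual data" is a consistent view taken at backup time, since a live table keeps changing after the backup finishes.

```python
import random
from typing import Dict, List, Mapping

# Hypothetical model: a table maps row key -> {column: value}.
Row = Mapping[str, bytes]
Table = Mapping[str, Row]


def sample_row_keys(table: Table, sample_size: int, seed: int = 0) -> List[str]:
    """Return a reproducible random sample of row keys, in sorted order."""
    keys = sorted(table)
    if sample_size >= len(keys):
        return keys
    return sorted(random.Random(seed).sample(keys, sample_size))


def validate_backup(source: Table, restored: Table,
                    sample_size: int, seed: int = 0) -> Dict[str, str]:
    """Compare sampled source rows with the restored copy.

    Returns {row_key: reason} for every sampled row that is missing
    from the restored data or whose cells differ.
    """
    problems: Dict[str, str] = {}
    # Sample from the source so rows the backup missed can be detected.
    for key in sample_row_keys(source, sample_size, seed):
        if key not in restored:
            problems[key] = "missing row"
        elif dict(restored[key]) != dict(source[key]):
            problems[key] = "cell mismatch"
    return problems


# Example: the restored copy lost row r3 and has a corrupted cell in r2.
source = {"r1": {"cf:a": b"1"}, "r2": {"cf:a": b"2"}, "r3": {"cf:a": b"3"}}
restored = {"r1": {"cf:a": b"1"}, "r2": {"cf:a": b"x"}}
print(validate_backup(source, restored, sample_size=3))
# prints {'r2': 'cell mismatch', 'r3': 'missing row'}
```

An empty result means every sampled row matched. A sample cannot prove the whole backup is correct, so the sample size trades validation cost against the chance of catching a missed row or a corrupted file.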