hbase-dev mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.
Date Thu, 02 Nov 2017 12:47:03 GMT
That's quite a good argument :) -- there's a difference between 
occasional verification for building confidence and full data 
verification for every, single, backup (which is how HBASE-19106 read to 
me). I still think the latter (thus, 19106, verbatim in its ask) would 
be unwieldy; however, the ability to do it ad-hoc as you describe has 
merit.

Also makes me wonder how reusable VerifyReplication is at its core 
(I mean, it's more or less the same thing under the hood, right?).

Let's continue to hash out what we think the scope of a data 
verification "feature" should be and then get that put up on 19106. This 
is good.

On 11/1/17 11:32 PM, Andrew Purtell wrote:
> Potential adopters will absolutely want to construct for themselves a verifiable live
exercise. Tooling that lets you do that against a snapshot would be the way to go, I think.
Once you do that exercise, probably a few times, you can trust the backup solution enough
for restore into production, where verification may or may not be possible.
> A user who claims they'd rather not verify their backup solution works on account of
performance concerns shouldn't be taken seriously. (Not that you would (smile))
>> On Nov 1, 2017, at 7:55 PM, Josh Elser <elserj@apache.org> wrote:
>>> On 11/1/17 8:22 PM, Sean Busbey wrote:
>>> On Wed, Nov 1, 2017 at 7:08 PM, Vladimir Rodionov
>>> <vladrodionov@gmail.com> wrote:
>>>> There is no way to validate correctness of backup in a general case.
>>>> You can restore backup into temp table, but then what? Read rows one-by-one
>>>> from temp table and look them up
>>>> in a primary table? Won't work, because rows can be deleted or modified
>>>> since the last backup was done.
>>> This is why we have snapshots, no?
>> True, we could try to take a snapshot exactly when the backup was taken (likely
still difficult to coordinate on an active system), but in what reality would we actually
want to do this? Most users I see are so concerned about the cost of running compactions (which
actually make performance better!) that they wouldn't dedicate a non-negligible portion of their
computing power and available space to re-instantiate their data (at least once) just to make sure
a copy worked correctly.
>> We have WALs, HFiles, and some metadata we'd export in a backup right? Why not intrinsically
perform some validation that things like headers, trailers, etc still exist on the files we
exported (e.g. open file, read header, seek to end, verify trailer, etc). I feel like that's
a much more tenable solution that isn't going to have a ridiculous burden like restoring tables
of modest and above size.
>> This smells like it's really asking to verify a distcp rather than verifying backups. There
is certainly something we can do to give a reasonable level of confidence that doesn't involve
reconstituting the whole thing.
