hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.
Date Thu, 02 Nov 2017 03:33:54 GMT
On Wed, Nov 1, 2017 at 5:08 PM, Vladimir Rodionov <vladrodionov@gmail.com>
wrote:

> There is no way to validate correctness of backup in a general case.
>
> You can restore backup into temp table, but then what? Read rows one-by-one
> from temp table and look them up in a primary table? Won't work, because
> rows can be deleted or modified since the last backup was done.
>
>
Replication has a verify table tool (VerifyReplication).

You can ask a cluster not to delete rows.

You can read at a specific timestamp.

Or you could create backups during an extended ITBLL. When ITBLL completes,
verify it on the src cluster. Create a table from the incremental backups.
Verify the restored table too.

Etc.
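
To make the read-at-a-timestamp idea concrete, here is a minimal sketch. It is a toy in-memory model, not HBase client code: the `Table` layout and the `read_as_of` and `verify_restore` helpers are hypothetical names made up for illustration. It assumes deletes are kept as markers (the "ask a cluster not to delete rows" case), so the primary can still be read as of the backup time and compared against the restored table.

```python
from typing import Dict, List, Optional, Tuple

# Toy model (assumption): each row key maps to a list of (timestamp, value)
# versions. A value of None is a delete marker; nothing is physically removed.
Table = Dict[str, List[Tuple[int, Optional[str]]]]


def read_as_of(table: Table, ts: int) -> Dict[str, str]:
    """Return the row -> value view that was visible at timestamp ts (inclusive)."""
    view: Dict[str, str] = {}
    for row, versions in table.items():
        visible = [(t, v) for t, v in versions if t <= ts]
        if not visible:
            continue  # row did not exist yet at ts
        _, value = max(visible, key=lambda tv: tv[0])
        if value is not None:  # skip rows whose newest visible version is a delete
            view[row] = value
    return view


def verify_restore(primary: Table, restored: Dict[str, str], backup_ts: int) -> Dict[str, List[str]]:
    """Compare a restored table against the primary as of the backup timestamp."""
    expected = read_as_of(primary, backup_ts)
    return {
        "missing": sorted(r for r in expected if r not in restored),
        "extra": sorted(r for r in restored if r not in expected),
        "changed": sorted(
            r for r in expected if r in restored and restored[r] != expected[r]
        ),
    }


primary: Table = {
    "r1": [(10, "a"), (30, "a2")],  # modified after the backup
    "r2": [(15, "b"), (25, None)],  # deleted after the backup
    "r3": [(40, "c")],              # created after the backup
}
restored = {"r1": "a", "r2": "b"}   # what a backup taken at ts=20 should restore

report = verify_restore(primary, restored, backup_ts=20)
print(report)
```

Rows modified, deleted, or created after the backup do not show up as mismatches here, because the comparison is made against the primary as of the backup timestamp rather than against its current state.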

St.Ack




> Your results most of the time will be approximate: validation completed,
> found 99.5% of rows. Will this satisfy the user?
>
> Off-topic here. I hope the feature requester will explain in the
> corresponding JIRA what type of *validation* they perform and expect.
>
>
>
>
> On Wed, Nov 1, 2017 at 4:59 PM, Apekshit Sharma <appy@cloudera.com> wrote:
>
> > As for HBASE-19106, when someone says that it's fundamental, I think they
> > mean that some kind of validation that backup is correct is necessary,
> > and I concur.
> > Saying that something wasn't in initial feature list is hardly a
> > justification! It's not like the idea was known when initial list was
> > planned and was decided not to be done. It's new. And new things can be
> > important!
> >
> >
> >
> >
> > On Wed, Nov 1, 2017 at 4:34 PM, Apekshit Sharma <appy@cloudera.com> wrote:
> >
> > > Came here just to track anything related to Distributed Log Replay,
> > > which I am trying to purge. But looks like it's another discussion
> > > thread about hbase-backup.
> > > Am coming here with limited knowledge about the feature (did a review
> > > initially once, lost track after). But then, looks like discussion is
> > > not about technical aspects of feature, but trust in it.
> > >
> > > Something which can help get trust in B&R, or otherwise, is an accurate
> > > summary of it as of now. Basically
> > > 1) What features are there in 2.0
> > > 2) What features are being targeted for 2.1 onwards
> > > 3) What testing has been done so far. Not just names...details. E.g.
> > > ITBLL w/ 50 node cluster and x,y,z fault tolerances.
> > > 4) What tests are planned before 2.0. I think a good basis to judge that
> > > would be, will that testing convince Elliot/Andrew to use that feature
> > > in their internal clusters.
> > > 5) List of existing bugs
> > >
> > > Once it's there, hopefully everyone agrees that list in (1) is enough
> > > and items in (2) are non-critical for basic B&R.
> > > 3 and 4 are most important.
> > > Missing anything in (5) will be counter-productive.
> > > I'd appreciate if the summary is followed by opinions, and not mixed
> > > together.
> > >
> > > Just a suggestion which can help you get right attention.
> > > Thanks.
> > >
> > > -- Appy
> > >
> > >
> > >
> > > On Wed, Nov 1, 2017 at 3:33 PM, Vladimir Rodionov <vladrodionov@gmail.com>
> > > wrote:
> > >
> > >> >> HBASE-19106 at least is a fundamental
> > >>
> > >> This new feature was requested 9 days ago (between alpha 3 and alpha 4
> > >> releases). It has never been on a list of features we had agreed to
> > >> implement for 2.0 release.
> > >> When backup started almost 2 years ago, we described what features and
> > >> capabilities will be implemented. We have had discussions before and I
> > >> do not remember any complaints from community that we lack important
> > >> functionalities.
> > >>
> > >> You can not point to it as a blocker for 2.0 release, Stack.
> > >>
> > >> Testing at scale (lack of) - the only real issue I see in B&R now. The
> > >> question: can it justify your willingness to postpone feature till next
> > >> 2.x release, Stack?
> > >>
> > >> All blockers are resolved, including pending HBASE-17852 patch. All
> > >> functionality for 2.0 has been implemented. Scalability and performance
> > >> improvements patch is in the works and expected to be ready next week.
> > >> In any case, this is improvement - not a new feature.
> > >>
> > >> We have been testing B&R in our internal QA clusters for months. Others
> > >> (SF) have done testing as well. I am pretty confident in implementation.
> > >>
> > >>
> > >>
> > >> On Wed, Nov 1, 2017 at 3:15 PM, Josh Elser <elserj@apache.org> wrote:
> > >>
> > >> > On 11/1/17 5:52 PM, Stack wrote:
> > >> >
> > >> >> On Wed, Nov 1, 2017 at 12:25 PM, Vladimir Rodionov
> > >> >> <vladrodionov@gmail.com> wrote:
> > >> >>
> > >> >>> 1. HBASE-19104 - 19109
> > >> >>>
> > >> >>> None of them are basic, Stack. These requests came from SF after
> > >> >>> discussion we had with them recently.
> > >> >>> No single comments is because I was out of country last week.
> > >> >>>
> > >> >>> 2. Backup tables are not system ones, they belong to a separate
> > >> >>> namespace - "backup"
> > >> >>>
> > >> >>> 3. We make no assumptions on assignment order of these tables.
> > >> >>>
> > >> >>> As for real scale testing and documentation, we still have time
> > >> >>> before 2.0GA. Can't be blocker IMO
> > >> >>>
> > >> >>
> > >> >> First off, wrong response.
> > >> >>
> > >> >> Better would have been pointers to a description of the feature as it
> > >> >> stands in branch-2 (a list of JIRAs is insufficient), what is to be
> > >> >> done still, and evidence of heavy testing in particular at scale (as
> > >> >> Josh reminds us, we agreed to last time backup-in-hbase2 was broached)
> > >> >> ending with list of what will be done between here and beta-1 to
> > >> >> assuage any concerns that backup is incomplete. As to the issues
> > >> >> filed, IMO, HBASE-19106 at least is a fundamental. W/o it, how do you
> > >> >> even know backup works at anything above toy scale?
> > >> >>
> > >> >> Pardon my mistake on 'system' tables. I'd made the statement 9 days
> > >> >> ago up in HBASE-17852 trying to figure what was going on in the issue
> > >> >> and it stood unchallenged (Josh did let me know later that you were
> > >> >> traveling).
> > >> >>
> > >> >> I'm not up for waiting till GA before we decide what is in the
> > >> >> release. This DISCUSSION is about deciding now, before beta-1, what's
> > >> >> in and what's out. Backup would be a great to have but it is currently
> > >> >> on the chopping block. I've tried to spend time figuring what is there
> > >> >> and where it stands but I always end up stymied (e.g. see HBASE-17852;
> > >> >> see how it starts out; see the patch attached w/ no description of
> > >> >> what it comprises or the approach decided upon; and so on). Maybe it's
> > >> >> me, but hey, unfortunately, it's me who is the RM.
> > >> >>
> > >> >
> > >> > As much as it pains me, I can't argue with the lack of confidence via
> > >> > testing. While it feels like an eternity ago since we posited on B&R's
> > >> > scale/correctness testing, it's only been 1.5 months. In reality,
> > >> > getting to this was delayed by some of the (really good!) FT fixes that
> > >> > Vlad has made.
> > >> >
> > >> > We set the bar for the feature and we missed it; there's no arguing
> > >> > that. Yes, it stinks. I see two paths forward: 1) come up with its own
> > >> > release to let those downstream use it now (risks notwithstanding) or
> > >> > 2) shoot for HBase 2.1.0. The latter is how we've approached this in
> > >> > the past. Building the test needs to happen regardless of the release
> > >> > vehicle.
> > >> >
> > >> > New issues/feature-requests are always going to come in as people
> > >> > experiment with it. I hope to avoid getting bogged down in this -- I
> > >> > sincerely doubt that there is any single answer to what is "required"
> > >> > for an initial backup and restore implementation. I feel like anything
> > >> > more will turn into a battle of opinions. When we bring up the feature
> > >> > again, we should make a concerted effort to say "this is the state of
> > >> > the feature, with the design choices made, and this is the result of
> > >> > our testing for correctness." Hopefully much of this is already
> > >> > contained in documentation and just needs to be collected/curated.
> > >> >
> > >> > - Josh
> > >> >
