hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Online snapshots progress.
Date Fri, 14 Dec 2012 17:37:57 GMT
Thanks for the update, Jon.

bq. if splits or balancing occur while snapshotting, the region moves
cause the final snapshot verification step to abort

The split or balancing happened during the snapshot verification step, right?
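
For anyone following along, the failure mode quoted above can be modeled
roughly as below. This is only a sketch in Python under stated assumptions,
not the HBase code: verify_snapshot and the region names are hypothetical,
and it assumes verification simply compares the regions the snapshot
captured against the regions the table has when verification runs.

```python
# Hypothetical sketch, not HBase code: models why a change in the table's
# region set (split or move) can make a final verification step abort.

def verify_snapshot(snapshotted_regions, current_regions):
    """Return True only if the snapshot covers exactly the current regions."""
    snapshotted = set(snapshotted_regions)
    current = set(current_regions)
    missing = current - snapshotted  # regions online now but not in the snapshot
    stale = snapshotted - current    # regions in the snapshot that no longer exist
    return not missing and not stale


# Assumed example: region r3 splits into r3a and r3b before verification.
regions_in_snapshot = ["r1", "r2", "r3"]
regions_after_split = ["r1", "r2", "r3a", "r3b"]

unchanged_ok = verify_snapshot(regions_in_snapshot, regions_in_snapshot)
split_ok = verify_snapshot(regions_in_snapshot, regions_after_split)
```

With an unchanged region set the check passes; after the split it fails,
because the verifier cannot tell whether the snapshot covers every region.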

On Fri, Dec 14, 2012 at 9:17 AM, Jonathan Hsieh <jon@cloudera.com> wrote:

> Hey folks,
> I've been testing and finding bugs on a branch of online snapshots for the
> past few days. The good news is that taking an online snapshot seems to be
> fairly robust -- I've been taking online-snapshots as quickly as possible
> on a 5 node cluster being battered by a performance eval random write run.
> As expected we ran into some hiccups. In my last run of the
> PE/online-snapshotting, it looks like 88/100 snapshots succeeded. This is
> ok, some failures are actually expected (the first cut only claims better
> consistency than 'copytable' and 'only-on-a-sunny-day' semantics). From a
> quick look at what caused the failed cases, if splits or balancing
> occur while snapshotting, the region moves cause the final snapshot
> verification step to abort because we look for the new regions and don't
> know if we have all regions.  We've also found some problems with splits of
> hfilelinks (HBASE-7339), and we've encountered occasional failed-hang
> clone attempts (HBASE-7352), and an occasional ZK related slow abort.  As
> they are found and characterized,  I've been filing them under HBASE-6055
> (offline-snapshots) or HBASE-7290 (online-snapshots).
> I'm going to switch from bug fixing mode back to patch polishing mode today
> to get some of this committed to the snapshot dev branch.  Here's how I
> hope to deal with them moving forward.
> I'll be polishing the pieces I've been testing (there are about 5-7 patches
> in-flight currently) and putting updated pieces up for review.  There is
> non-trivial overhead maintaining this many patches "in the future".   Since
> this is a dev-branch, I'm going to ask that reviews of these initial big
> dev-branch patches focus on understandability and that your +1's would let
> us punt to follow-on jiras and TODOs more frequently than if you were
> reviewing for trunk.  The sooner we get the skeleton in, the easier
> collaboration becomes with other folks working on and testing the same
> branch.  Ideally, getting the large pieces in would allow follow-ons to be
> easier to review and tackle.  The promise here, of course, is that many of
> these follow-on jiras, bugs (deadlocks, hangs), and testing evidence will be
> blockers before merging offline snapshots to trunk and merging online
> snapshots to trunk.
> Sound good?
> We initially had one snapshot branch (offline snapshots) but I'm
> proposing having two: the offline-snapshot branch and the online-snapshot
> branch.  Jesse's been the master of the offline branch and pushing
> dev-branch patches to that branch (
> https://github.com/jyates/hbase/tree/snapshots).  I'd like to soon begin
> pushing dev-branch *reviewed commits* for online-snapshots to another
> branch. For those following, here's an explanation of how I'm working.
> * The latest for-review patches will always be in review boards.
> * Branch committed portions (reviewed and +1'ed for the branch patches) for
> online snapshots will live here
> https://github.com/jmhsieh/hbase/tree/snapshots.  My branch will
> periodically be force pushed to deal with rebases onto constantly updating
> trunk, and to include offline-branch committed  patches.
> * The latest working and consolidated online-snapshot branch (commits
> correspond to HBASE jiras) will live at
> https://github.com/jmhsieh/hbase/tree/snapshots-work .  This branch is
> subject to frequent forced pushes.  It is a cleanup step done to prep
> patches for reviews, and to match what the eventual commit structure would
> look like.   It also contains some patches that may be abandoned or reordered.
> * Rough incremental in-progress branches live here,
> https://github.com/jmhsieh/hbase/tree/snapshot-work-1213  (change 1213 with
> the latest date to see where I am).  These rough branches have many small
> commits that focus on functionality and need to be rebased to "sprinkle"
> edits into the appropriate JIRA-corresponding patches.   These branches
> will rarely if ever be force pushed.  These are what I test from,
> and probably are suitable for others to use for testing.  I periodically
> merge this with snapshots-work, mostly as proof that what I have for
> review is the same as what I've been testing.
> Jon.
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
