hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: [DISCUSSION] Merge of the hbase-11339 mob branch into master.
Date Sat, 23 May 2015 21:51:21 GMT
I was responding to this comment from Jon's email:

> Another suggestion was a tool to check that mob references had
> corresponding mob data.  We currently include a mr-based sweeper job
> that could be used to perform this verification.  We can add this tool and
> testing for the tool.

So for those of us not intimately familiar with the MOB work, is there a
tool that checks for MOB integrity, or a tool that can be adapted for that
purpose, and does it require MR or not? More generally: Can MOB integrity
checks be added or folded into HBCK? I think you can see what my concerns
are but if they are unclear please let me know and I will clarify them
further.

On Sat, May 23, 2015 at 11:15 AM, Matteo Bertozzi <theo.bertozzi@gmail.com>
wrote:

> as far as I know MOB does not depend anymore on MR
> the old MR sweeper tool is still around, and you can use it to compact
> manually
> but it is not called by the normal RS compaction code.
>
> also, the MOB code is more or less isolated.
> if your family is not using MOB you still have your old code path.
> so, I'd say that if we don't break compatibility and
> the few changes in the core-path, to do the if mobIsEnabled, do not impact
> the perf of the traditional path
> we can probably get the feature in 1.2 as "experimental".
> brave users can experiment with it, report bugs and suggestions
> and then we will mark it as stable in 1.3, 1.4 or whenever is ready.
>
>
> Matteo
>
>
> On Sat, May 23, 2015 at 9:47 AM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> > Maybe we can remove the dependency on a MR runtime for MOB maintenance by
> > reimplementing those parallel tasks using Procedure V2? We wouldn't be
> > looking at MOB for 1.2 but maybe 1.3? I'm also not sure the community as a
> > whole has the necessary bandwidth for perf and stability testing of MOB in
> > the 1.2 timeframe, but 1.3 would be more likely.
> >
> >
> > On Sat, May 23, 2015 at 9:40 AM, Andrew Purtell <apurtell@apache.org>
> > wrote:
> >
> > > Regarding performance testing: Whatever has been done on the MOB branch
> > > will be interesting data points, and potentially encouraging, but porting
> > > to branch-1 will produce a new code base. Earlier results on other code
> > > will not be applicable. We have to start over. Like I said elsewhere, I'm
> > > happy to help with (re)characterizing the perf impact and improvements
> > > produced by the changes.
> > >
> > > What coverage do we have for verifying the integrity of MOB references?
> > > Will the sweep tool detect, alert on, and optionally repair dangling
> > > references? (I could answer this for myself by looking at MOB branch, but
> > > hopefully someone here has an answer at the ready.) I assume we calculate
> > > and store checksums for MOB data itself so we know if values are corrupt.
> > > Does the sweep tool detect MOB value corruption? Can it be repaired? Do we
> > > have a good ops story for why HBCK is no longer sufficient on its own, and
> > > why there's a separate tool with a whole new set of options - and a
> > > requirement for a MR runtime! - for checking MOB data? That last one is a
> > > rhetorical question (smile), the ops story is... unsatisfying. It's like
> > > we've taken a self sufficient HBase and bolted in parts of Hive, so now we
> > > need MR.
> > >
> > >
> > > On Fri, May 22, 2015 at 1:45 PM, Jonathan Hsieh <jon@cloudera.com>
> > > wrote:
> > >
> > >> In another thread Andrew Purtell brought up some concerns about the mob
> > >> feature:
> > >>
> > >> On Fri, May 22, 2015 at 12:40 PM, Andrew Purtell <apurtell@apache.org>
> > >> wrote:
> > >>
> > >> > Another point of clarification, sorry, I hit the send button too early
> > >> > it seems: I don't believe MOB is fully integrated yet, for example the
> > >> > feature is an extension to store that lacks support for encryption (this
> > >> > would technically be a feature regression); and HBCK. I have not been
> > >> > following MOB too closely so could be mistaken. These issues do not
> > >> > preclude a merge of MOB into trunk, but do preclude a merge back of MOB
> > >> > from trunk to branch-1. I would veto the latter until such shortcomings
> > >> > in the implementation that could be described as regressions are
> > >> > addressed. I would also like to see a performance analysis of a range of
> > >> > workloads before and after in as much detail as can be mustered, and
> > >> > would be happy to volunteer to help out with that.
> > >> >
> > >>
> > >> Here's info on the points brought up:
> > >>
> > >> Encryption support shortcoming is being addressed here:
> > >> https://issues.apache.org/jira/browse/HBASE-13693 (closed)
> > >> https://issues.apache.org/jira/browse/HBASE-13720 (in review)
> > >>
> > >> Hbck has actually been run against the integration test rigs while the
> > >> feature has been enabled, but it currently has no explicit unit test or
> > >> simple-to-run integration test.  It currently doesn't report anything
> > >> special about the mob storage area. We can add unit tests that cover hbck
> > >> when the mob path is exercised.
> > >>
> > >> Another suggestion was a tool to check that mob references had
> > >> corresponding mob data.  We currently include a mr-based sweeper job that
> > >> could be used to perform this verification.  We can add this tool and
> > >> testing for the tool.
> > >>
> > >> I've done some performance testing and Jingcheng and his colleagues have
> > >> done significant amounts of performance testing. We currently have a blog
> > >> post in progress that will share the results of this performance testing.
> > >>
> > >> Jon.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On Wed, May 20, 2015 at 7:38 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >>
> > >> > This is a useful feature, Jon.
> > >> >
> > >> > I went over the mega-patch and left some comments on review board.
> > >> >
> > >> > I noticed that hbck was not included in the patch. Neither did I find a
> > >> > sub-task of HBASE-11339 that covers hbck.
> > >> >
> > >> > Do you or Jingcheng plan to add MOB-aware capability for hbck?
> > >> >
> > >> > Cheers
> > >> >
> > >> > On Wed, May 20, 2015 at 9:21 AM, Jonathan Hsieh <jon@cloudera.com>
> > >> > wrote:
> > >> >
> > >> > > Hi folks,
> > >> > >
> > >> > > The Medium Object (MOB) Storage feature (HBASE-11339[1]) is a modified
> > >> > > I/O and compaction path that allows individual moderately sized values
> > >> > > (10k-10MB) to be stored so that write amplification is reduced when
> > >> > > compared to the normal I/O path. At a high level, it provides alternate
> > >> > > flush and compaction mechanisms that segregate large cells into a
> > >> > > separate area where they are not subject to potentially frequent
> > >> > > compaction and splits that can be encountered in the normal I/O path.
> > >> > > A more detailed design doc can be found on the hbase-11339 jira.
> > >> > >
> > >> > > Jingcheng Du has been working on the mob feature for a while, and Anoop,
> > >> > > Ram and I have been shepherding him through the design revisions and
> > >> > > implementation of the feature in the hbase-11339 branch.[2]
> > >> > >
> > >> > > The branch we are proposing to merge into master is compatible with
> > >> > > HBase's core functionality including snapshots, replication and shell
> > >> > > support, behaves well with table alters and bulk loads, and does not
> > >> > > require external MR processes. It has been documented and subjected to
> > >> > > many integration test runs (ITBLL, ITAcidGuarantees, ITIngest),
> > >> > > including fault injection. Performance testing of the feature shows
> > >> > > what can be a 2x-3x throughput improvement for workloads that contain
> > >> > > mobs. These results can be seen on the hbase 2.0 panel discussion
> > >> > > slides from hbasecon (once published).
> > >> > >
> > >> > > Recently there have been some hfile encryption related shortcomings
> > >> > > that we could address in branch or in master.
> > >> > >
> > >> > > Earlier iterations of the feature have been tested in production by
> > >> > > users that Jingcheng has been responsible for.  A version has also been
> > >> > > deployed at users I have been responsible for.  Some of the folks from
> > >> > > Huawei (ashutosh) have also been submitting the recent encryption bug
> > >> > > reports against the hbase-11339 branch so there is some evidence of
> > >> > > usage by them.
> > >> > >
> > >> > > The four of us (Jingcheng, Ram, Anoop and I) are satisfied with the
> > >> > > feature and feel it is a good time to call a merge vote.  I've posted a
> > >> > > megapatch version for folks who want to peruse the code. [3]
> > >> > >
> > >> > > What do you all think?
> > >> > >
> > >> > > Thanks,
> > >> > > Jingcheng, Jon, Ram, and Anoop.
> > >> > >
> > >> > > [1] https://issues.apache.org/jira/browse/HBASE-11339
> > >> > > [2] https://github.com/apache/hbase/tree/hbase-11339
> > >> > > [3] https://reviews.apache.org/r/34475/
> > >> > > --
> > >> > > // Jonathan Hsieh (shay)
> > >> > > // HBase Tech Lead, Software Engineer, Cloudera
> > >> > > // jon@cloudera.com // @jmhsieh
> > >> > >
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> // Jonathan Hsieh (shay)
> > >> // HBase Tech Lead, Software Engineer, Cloudera
> > >> // jon@cloudera.com // @jmhsieh
> > >>
> > >
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
