hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: [DISCUSSION] Merge of the hbase-11339 mob branch into master.
Date Thu, 28 May 2015 22:45:48 GMT
So the sweep tool is completely optional? A deploy won't degrade if the
sweep tool is never run? Then that sounds good.


On Thursday, May 28, 2015, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> bq.So is a MR runtime required for MOB or not? I read maybe, then no, then
> here maybe again. What happens if one does not have a MR runtime and
> therefore can never run the sweeper tool?
> Just to make it clear, MOB no longer has an MR dependency. The V1 version
> had a sweeper tool that depended on MR.  The tool still exists and it
> still depends on MR. It's like an add-on.
>
> The compaction of MOB is now embedded as part of the core feature of
> compaction without having to use MR.
>
> Regards
> Ram
>
> On Thu, May 28, 2015 at 10:20 AM, Andrew Purtell <andrew.purtell@gmail.com>
> wrote:
>
> > I have no concerns about MOB in trunk. Go for it.
> >
> > I do have concerns about a subsequent proposal to put it in 1.2. Those
> > concerns center around stability and performance impacts, and a possible
> > dependency on a MR runtime for what I would consider core function.
> >
> > > > Regarding the tools and integrity checks,
> > > > MOB has a tool based on MR basically for sweeping and compaction apart
> > > > from the compactor that runs in the core (without MR dependency).
> >
> > So is a MR runtime required for MOB or not? I read maybe, then no, then
> > here maybe again. What happens if one does not have a MR runtime and
> > therefore can never run the sweeper tool? An incomplete feature on trunk
> > isn't a problem. Later commits can fill in the gaps and then the sum of
> > MOB commits can go back to branch-1. (Experimental != incomplete, IMHO.)
> >
> > If as you say stability and performance testing have already been done and
> > both look great, then that means *when* this is done again for a branch-1
> > merge candidate, the results will likely also be good. I'd like to help
> > out with this. You won't need to prove it, I will do the legwork for my own
> > concerns.
> >
> >
> > > > On May 27, 2015, at 8:59 PM, ramkrishna vasudevan <
> > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > >
> > > > Chiming in late here,
> > > >
> > > > As Matt suggested earlier, utmost care has been taken to ensure that
> > > > the MOB code does not interfere with the normal flow and that things
> > > > work normally when MOB is not enabled on a family.
> > >
> > > > So the entire flow for MOB can be treated as an experimental feature,
> > > > if need be.  Take the latest case of the guys from Huawei: since they
> > > > have some interest in this feature, they are trying the hbase-11339
> > > > branch to see how MOB works.
> > >
> > > > If we move this to trunk, then chances are even more people will look
> > > > into it, and by the time it comes to 1.3 or 1.4 we will be stable
> > > > enough.
> > >
> > > > Regarding the tools and integrity checks,
> > > > MOB has a tool based on MR basically for sweeping and compaction apart
> > > > from the compactor that runs in the core (without MR dependency).  We
> > > > could always add a feature to the existing tool to do integrity checks
> > > > like Jon suggests.
> > >
> > > > Also, for an experimental feature we could always come up with such a
> > > > tool, but in the case of MOB the interdependency between the MOB and
> > > > actual HFile data is greater, so a standalone tool to check integrity
> > > > on the HFile may not be easy without doing some sort of scan on the
> > > > HFiles and MOB files. (I have not thought that through fully.)
> > >
> > > > I would still think that having this feature as experimental in 1.2
> > > > makes sense.  Just my thoughts on this, also after being part of the
> > > > dev process for this feature, where we tried not to touch the core
> > > > areas affecting non-MOB cases.
> > >
> > > > Some of the perf results produced by Jingcheng's team and Cloudera
> > > > folks substantiate the gain this feature provides.
> > >
> > > Regards
> > > Ram
> > >
> > >
> > >
> > >
> > > >> On Thu, May 28, 2015 at 9:04 AM, Andrew Purtell <apurtell@apache.org>
> > > wrote:
> > >>
> > >> Inline
> > >>
> > > >>> On Wednesday, May 27, 2015, Jonathan Hsieh <jon@cloudera.com> wrote:
> > >>>
> > > >>> On Sat, May 23, 2015 at 9:40 AM, Andrew Purtell <apurtell@apache.org>
> > > >>> wrote:
> > >>>
> > > >>>> Regarding performance testing: Whatever has been done on the MOB
> > > >>>> branch will be interesting data points, and, potentially encouraging,
> > > >>>> but porting to branch-1 will produce a new code base. Earlier results
> > > >>>> on other code will not be applicable. We have to start over. Like I
> > > >>>> said elsewhere, I'm happy to help with (re)characterizing the perf
> > > >>>> impact and improvements produced by the changes.
> > > >>> Thank you for the offer to help -- we'd appreciate it!
> > >> You bet.
> > >>
> > >>
> > > >>> Although most of my integration test and perf test results were done
> > > >>> against trunk (from sept '14 and then later feb '15 -- we've been
> > > >>> doing them roughly every two weeks now), Jingcheng's most recent
> > > >>> performance testing and fault injection testing results were actually
> > > >>> done against a version merged/rebased onto hbase 1.0.0[1].  Though not
> > > >>> on the most recent branch-1, would this be close enough and
> > > >>> sufficient, or would you still want to redo them?
> > >>
> > >>
> > >> Closer, yes.
> > >>
> > > >> I believe a redo on the branch-1 merge proposal would still be
> > > >> important as a confidence builder.
> > >>
> > >>
> > >>>
> > > >>> If we want to redo them when we have a 1.x backport ready to
> > > >>> propose, we'll include the augmented ltt[2] that will make it easy to
> > > >>> exercise the mob feature's performance.
> > >>>
> > >>> [1]
> https://github.com/cloudera/hbase/commits/cdh5-1.0.0_5.4.0?page=2
> > >>> (this is cdh5.4.0's hbase 1.0.0-based hbase)
> > >>> [2] https://issues.apache.org/jira/browse/HBASE-13277
> > >>>
> > >>>
> > > >>>> What coverage do we have for verifying the integrity of MOB
> > > >>>> references? Will the sweep tool detect, alert on, and optionally
> > > >>>> repair dangling references? (I could answer this for myself by
> > > >>>> looking at MOB branch, but hopefully someone here has an answer at
> > > >>>> the ready.) I assume we calculate and store checksums for MOB data
> > > >>>> itself so we know if values are corrupt. Does the sweep tool detect
> > > >>>> MOB value corruption? Can it be repaired? Do we have a good ops story
> > > >>>> for why HBCK is no longer sufficient on its own, there's a separate
> > > >>>> tool with a whole new set of options - and a requirement for a MR
> > > >>>> runtime! - for checking MOB data? That last one is a rhetorical
> > > >>>> question (smile), the ops story is... unsatisfying. It's like we've
> > > >>>> taken a self sufficient HBase and bolted in parts of Hive, so now we
> > > >>>> need MR.
> > >>>>
> > > >>> Our internal compaction detects and alerts at warn level if there is
> > > >>> a missing link [3], and then returns an empty value [4]
> > >>
> > >>
> > >> Ok, thanks
> > >>
> > >>
> > > >>> Mobs are stored in hfiles so we have the same checksumming all other
> > > >>> hfiles have.
> > >>
> > >>
> > >> Ok, thanks
> > >>
> > >>
> > >>>
> > > >>> In the other response, I answered about hbck and how something like
> > > >>> Hfile.main() could be a more appropriate checking tool to address
> > > >>> this situation.
> > >>
> > >>
> > >> Ok. Replied there.
> > >>
> > >>
> > >>>
> > > >>> I'm afraid then much of our complete operational story is
> > > >>> "unsatisfying" even without mob because it still requires MR -- e.g.
> > > >>> copytable, export, import, walplayer, or verifyreplication mr jobs.
> > > >>> While I'll agree that having an external system is undesirable and
> > > >>> unacceptable for what are mandatory internal operations like
> > > >>> compactions, I think requiring mr for a verifymob mr job would be as
> > > >>> acceptable as the verifyreplication job.
> > >>
> > >>
> > > >> I think integrity checks are a different class of tool than all others
> > > >> and we shouldn't mandate the presence of a MR runtime to execute those.
> > > >> OTOH, it's reasonable to provide a standalone tool (if multithreaded)
> > > >> but then also a recommended MR version that can achieve better
> > > >> parallelism.
> > >>
> > >>
> > >>>
> > >>> [3]
> > >>
> >
> https://github.com/apache/hbase/blob/hbase-11339/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L400
> > >>> [4]
> > >>
> >
> https://github.com/apache/hbase/blob/hbase-11339/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobCompactor.java#L224
> > >>>
> > >>>>
> > > >>>>> On Fri, May 22, 2015 at 1:45 PM, Jonathan Hsieh <jon@cloudera.com>
> > > >>>> wrote:
> > >>>>
> > > >>>>> In another thread andrew purtell brought up some concerns about the
> > > >>>>> mob feature:
> > >>>>>
> > > >>>>> On Fri, May 22, 2015 at 12:40 PM, Andrew Purtell <
> > > >>>>> apurtell@apache.org> wrote:
> > >>>>>
> > > >>>>>> Another point of clarification, sorry, I hit the send button too
> > > >>>>>> early it seems: I don't believe MOB is fully integrated yet, for
> > > >>>>>> example the feature is an extension to store that lacks support for
> > > >>>>>> encryption (this would technically be a feature regression); and
> > > >>>>>> HBCK. I have not been following MOB too closely so could be
> > > >>>>>> mistaken. These issues do not preclude a merge of MOB into trunk,
> > > >>>>>> but do preclude a merge back of MOB from trunk to branch-1. I would
> > > >>>>>> veto the latter until such shortcomings in the implementation that
> > > >>>>>> could be described as regressions are addressed. I would also like
> > > >>>>>> to see a performance analysis of a range of workloads before and
> > > >>>>>> after in as much detail as can be mustered, and would be happy to
> > > >>>>>> volunteer to help out with that.
> > >>>>>
> > >>>>> Here's info on the points brought up:
> > >>>>>
> > > >>>>> Encryption support shortcoming is being addressed here:
> > >>>>> https://issues.apache.org/jira/browse/HBASE-13693 (closed)
> > >>>>> https://issues.apache.org/jira/browse/HBASE-13720 (in review)
> > >>>>>
> > > >>>>> Hbck has been actually run against the integration test rigs while
> > > >>>>> the feature has been enabled but currently has no explicit unit test
> > > >>>>> or simple to run integration test.  It currently doesn't report
> > > >>>>> anything special about the mob storage area. We can add unit tests
> > > >>>>> that cover hbck when the mob path is exercised.
> > >>>>>
> > > >>>>> Another suggestion was a tool to check that mob references had
> > > >>>>> corresponding mob data.  We currently include a mr-based sweeper job
> > > >>>>> that could be used to perform this verification.  We can add this
> > > >>>>> tool and testing for the tool.
> > >>>>>
> > > >>>>> I've done some performance testing and Jingcheng and his colleagues
> > > >>>>> have done significant amounts of performance testing. We currently
> > > >>>>> have a blog post in progress that will share the results of this
> > > >>>>> performance testing.
> > >>>>>
> > >>>>> Jon.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > > >>>>> On Wed, May 20, 2015 at 7:38 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >>>>>
> > >>>>>> This is a useful feature, Jon.
> > >>>>>>
> > > >>>>>> I went over the mega-patch and left some comments on review board.
> > > >>>>>>
> > > >>>>>> I noticed that hbck was not included in the patch. Neither did I
> > > >>>>>> find a sub-task of HBASE-11339 that covers hbck.
> > > >>>>>>
> > > >>>>>> Do you or Jingcheng plan to add MOB-aware capability for hbck ?
> > >>>>>>
> > >>>>>> Cheers
> > >>>>>>
> > > >>>>>> On Wed, May 20, 2015 at 9:21 AM, Jonathan Hsieh <jon@cloudera.com>
> > > >>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi folks,
> > >>>>>>>
> > > >>>>>>> The Medium Object (MOB) Storage feature (HBASE-11339[1]) is a
> > > >>>>>>> modified I/O and compaction path that allows individual moderately
> > > >>>>>>> sized values (10k-10MB) to be stored so that write amplification
> > > >>>>>>> is reduced when compared to the normal I/O path.   At a high
> > > >>>>>>> level, it provides alternate flush and compaction mechanisms that
> > > >>>>>>> segregate large cells into a separate area where they are not
> > > >>>>>>> subject to potentially frequent compaction and splits that can be
> > > >>>>>>> encountered in the normal I/O path. A more detailed design doc can
> > > >>>>>>> be found on the hbase-11339 jira.
> > >>>>>>>
> > > >>>>>>> Jingcheng Du has been working on the mob feature for a while and
> > > >>>>>>> Anoop, Ram and I have been shepherding him through the design
> > > >>>>>>> revisions and implementation of the feature in the hbase-11339
> > > >>>>>>> branch.[2]
> > >>>>>>>
> > > >>>>>>> The branch we are proposing to merge into master is compatible
> > > >>>>>>> with HBase's core functionality including snapshots, replication,
> > > >>>>>>> shell support, behaves well with table alters, bulk loads and does
> > > >>>>>>> not require external MR processes. It has been documented, and
> > > >>>>>>> subject to many integration test runs  (ITBLL, ITAcidGuarantees,
> > > >>>>>>> ITIngest) including fault injection. Performance testing of the
> > > >>>>>>> feature shows what can be a 2x-3x throughput improvement for
> > > >>>>>>> workloads that contain mobs. These results can be seen on the
> > > >>>>>>> hbase 2.0 panel discussion slides from hbasecon (once published).
> > >>>>>>>
> > > >>>>>>> Recently there have been some hfile encryption related
> > > >>>>>>> shortcomings that we could address in branch or in master.
> > >>>>>>>
> > > >>>>>>> Earlier iterations of the feature have been tested in production
> > > >>>>>>> by users that Jingcheng has been responsible for.  A version has
> > > >>>>>>> also been deployed at users I have been responsible for.  Some of
> > > >>>>>>> the folks from Huawei (ashutosh) have also been submitting the
> > > >>>>>>> recent encryption bug reports against the hbase-11339 branch so
> > > >>>>>>> there is some evidence of usage by them.
> > >>>>>>>
> > > >>>>>>> The four of us  (Jingcheng, Ram, Anoop and I) are satisfied with
> > > >>>>>>> the feature and feel it is a good time to call a merge vote.  I've
> > > >>>>>>> posted a megapatch version for folks who want to peruse the
> > > >>>>>>> code. [3]
> > >>>>>>>
> > >>>>>>> What do you all think?
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Jingcheng, Jon, Ram, and Anoop.
> > >>>>>>>
> > >>>>>>> [1] https://issues.apache.org/jira/browse/HBASE-11339
> > >>>>>>> [2] https://github.com/apache/hbase/tree/hbase-11339
> > >>>>>>> [3] https://reviews.apache.org/r/34475/
> > >>>>>>> --
> > >>>>>>> // Jonathan Hsieh (shay)
> > >>>>>>> // HBase Tech Lead, Software Engineer, Cloudera
> > > >>>>>>> // jon@cloudera.com // @jmhsieh
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> // Jonathan Hsieh (shay)
> > >>>>> // HBase Tech Lead, Software Engineer, Cloudera
> > > >>>>> // jon@cloudera.com // @jmhsieh
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>>
> > >>>>   - Andy
> > >>>>
> > > >>>> Problems worthy of attack prove their worth by hitting back. - Piet
> > > >>>> Hein (via Tom White)
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> // Jonathan Hsieh (shay)
> > >>> // HBase Tech Lead, Software Engineer, Cloudera
> > > >>> // jon@cloudera.com // @jmhsieh
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >>
> > >>   - Andy
> > >>
> > > >> Problems worthy of attack prove their worth by hitting back. - Piet
> > > >> Hein (via Tom White)
> > >>
> >
>


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
