hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject [DISCUSSION] Merge of the hbase-11339 mob branch into master.
Date Thu, 28 May 2015 03:26:20 GMT

On Wednesday, May 27, 2015, Jonathan Hsieh <jon@cloudera.com> wrote:

> On Sat, May 23, 2015 at 2:51 PM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> > I was responding to this comment from Jon's email:
> >
> > > Another suggestion was a tool to check that mob references had
> > > corresponding mob data.  We currently include a mr-based sweeper job
> > > that could be used to perform this verification.  We can add this tool
> > > and testing for the tool.
> >
> > So for those of us not intimately familiar with the MOB work, is there a
> > tool that checks for MOB integrity, or a tool that can be adapted for that
> > purpose, and does it require MR or not? More generally: Can MOB integrity
> > checks be added or folded into HBCK? I think you can see what my concerns
> > are but if they are unclear please let me know and I will clarify them
> > further.
> >
> The main purpose of hbck is to see if region metadata is consistent in all
> locations.  hbck checks metadata by scanning hbase:meta, reading .regioninfo
> metadata files, and interrogating region servers.  It does not actually
> access the bulk of the data found in hfiles.  It never had the facilities
> to check validity of data and, as designed, it doesn't catch things like
> deleted, missing or truncated hfiles.
>
>
HBCK can check and sideline dangling reference files. I think of MOB files
as "core enough" auxiliary files that need some support. I suppose that,
unlike reference files, their presence or absence won't produce a region open
failure; instead we would see dangling pointers later when trying to service
queries. (Yes?) Will that abort the RS? Pardon the ignorant question,
normally I could check the code but I'm at the airport on a phone.

Good point below on the difference between HFile.main and HBCK and the
limitations of these tools.

On that subject, I should file follow up issues for more check and repair
options for HFiles. We should be able to detect missing or corrupt files of
every variety: HFile, reference, MOB. This may require an expensive scan over
lots of files, but this is like fsck full disk surface scans and those have
similar costs. Providing MR based tools is fine but we should have
multithreaded tools that can stand in if an MR runtime is not available.
Import, Export, VerifyReplication...all of these tools are in a different,
lesser, class than integrity and repair tools, in my opinion. Since MOB
will likely be merged into trunk by then I'll be sure to include it. I
agree it's not fair to ask more of MOB than what we have now for HFile.
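
To make the "multithreaded stand-in" idea a little more concrete, here is a
minimal sketch of the shape such a check could take. It is not taken from the
MOB branch or from HBCK. The class name DanglingRefCheck and the method
findDangling are hypothetical, and it reads a local directory through java.nio
purely as a stand-in for HDFS and the HBase file APIs. An empty file is
treated as suspect, which is only a crude hint of truncation, not a checksum.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: checks that every referenced file name
// exists as a non-empty regular file under a directory, using a thread pool
// instead of an MR job.
public class DanglingRefCheck {

    /** Returns the referenced names that are missing or empty under dir. */
    public static List<String> findDangling(Path dir, List<String> referenced, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one existence/size probe per referenced file.
            List<Future<Boolean>> results = new ArrayList<>();
            for (String name : referenced) {
                Path file = dir.resolve(name);
                Callable<Boolean> probe =
                    () -> Files.isRegularFile(file) && Files.size(file) > 0;
                results.add(pool.submit(probe));
            }
            // Collect failures in the same order the references were given.
            List<String> dangling = new ArrayList<>();
            for (int i = 0; i < referenced.size(); i++) {
                if (!results.get(i).get()) {
                    dangling.add(referenced.get(i));
                }
            }
            return dangling;
        } finally {
            pool.shutdown();
        }
    }
}
```

A real tool would have to gather the referenced names by scanning store files
and would want checksum verification and a sideline/repair option; none of
that is shown here.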


> As such, I don't think integration into hbck is appropriate for mob (or
> checking the validity of any other tag-based feature).


When the tag based features become widely used this will be a distinction
without a difference. All the tools will need to learn about them.


> I do think it makes sense to have a tool more akin to a modified version of
> HFile.main() that can read an hfile and detect a bad link between a mob ref
> and a mob value.  However, I actually don't think there is an out-of-the-box
> tool that we can use today to recover a truncated hfile and I don't think
> there is any tool that checks the validity of any of the other tag-based
> features (please correct me if I am wrong).
>
> An alternative, since this is an infrequently run operation, would be a
> map reduce job that uses the mob scanner filter to check an entire
> table.  The skeleton for this is essentially written already -- the mob
> sweeper could be modified to output mob integrity violations.  Similarly, I
> don't think validation of a full table's tag data was required for other
> tag-based features (please correct me if I am wrong).
>
> > > On Sat, May 23, 2015 at 11:15 AM, Matteo Bertozzi
> > > <theo.bertozzi@gmail.com> wrote:
> >
> > > as far as I know MOB does not depend anymore on MR
> > > the old MR sweeper tool is still around, and you can use it to compact
> > > manually
> > > but it is not called by the normal RS compaction code.
> > >
> > > also, the MOB code is more or less isolated.
> > > if your family is not using MOB you still have your old code path.
> > > so, I'd say that if we don't break compatibility and
> > > > the few changes in the core-path, to do the if mobIsEnabled, do not
> > > > impact the perf of the traditional path
> > > we can probably get the feature in 1.2 as "experimental".
> > > brave users can experiment with it, report bugs and suggestions
> > > and then we will mark it as stable in 1.3, 1.4 or whenever is ready.
> > >
> > >
> > > Matteo
> > >
> > >
> > > On Sat, May 23, 2015 at 9:47 AM, Andrew Purtell <apurtell@apache.org>
> > > wrote:
> > >
> > > > > Maybe we can remove the dependency on a MR runtime for MOB maintenance
> > > > > by reimplementing those parallel tasks using Procedure V2? We wouldn't
> > > > > be looking at MOB for 1.2 but maybe 1.3? I'm also not sure the
> > > > > community as a whole has the necessary bandwidth for perf and stability
> > > > > testing of MOB in the 1.2 timeframe, but 1.3 would be more likely.
> > > >
> > > >
> > > > > On Sat, May 23, 2015 at 9:40 AM, Andrew Purtell <apurtell@apache.org>
> > > > > wrote:
> > > >
> > > > > > Regarding performance testing: Whatever has been done on the MOB
> > > > > > branch will be interesting data points, and, potentially encouraging,
> > > > > > but porting to branch-1 will produce a new code base. Earlier results
> > > > > > on other code will not be applicable. We have to start over. Like I
> > > > > > said elsewhere, I'm happy to help with (re)characterizing the perf
> > > > > > impact and improvements produced by the changes.
> > > > > >
> > > > > > What coverage do we have for verifying the integrity of MOB
> > > > > > references? Will the sweep tool detect, alert on, and optionally
> > > > > > repair dangling references? (I could answer this for myself by
> > > > > > looking at MOB branch, but hopefully someone here has an answer at
> > > > > > the ready.) I assume we calculate and store checksums for MOB data
> > > > > > itself so we know if values are corrupt. Does the sweep tool detect
> > > > > > MOB value corruption? Can it be repaired? Do we have a good ops story
> > > > > > for why HBCK is no longer sufficient on its own, there's a separate
> > > > > > tool with a whole new set of options - and a requirement for a MR
> > > > > > runtime! - for checking MOB data? That last one is a rhetorical
> > > > > > question (smile), the ops story is... unsatisfying. It's like we've
> > > > > > taken a self sufficient HBase and bolted in parts of Hive, so now we
> > > > > > need MR.
> > > > >
> > > > >
> > > > > > On Fri, May 22, 2015 at 1:45 PM, Jonathan Hsieh <jon@cloudera.com>
> > > > > > wrote:
> > > > > >
> > > > > >> In another thread Andrew Purtell brought up some concerns about the
> > > > > >> mob feature:
> > > > > >>
> > > > > >> On Fri, May 22, 2015 at 12:40 PM, Andrew Purtell
> > > > > >> <apurtell@apache.org> wrote:
> > > > >>
> > > > > >> > Another point of clarification, sorry, I hit the send button too
> > > > > >> > early it seems: I don't believe MOB is fully integrated yet, for
> > > > > >> > example the feature is an extension to store that lacks support
> > > > > >> > for encryption (this would technically be a feature regression);
> > > > > >> > and HBCK. I have not been following MOB too closely so could be
> > > > > >> > mistaken. These issues do not preclude a merge of MOB into trunk,
> > > > > >> > but do preclude a merge back of MOB from trunk to branch-1. I
> > > > > >> > would veto the latter until such shortcomings in the
> > > > > >> > implementation that could be described as regressions are
> > > > > >> > addressed. I would also like to see a performance analysis of a
> > > > > >> > range of workloads before and after in as much detail as can be
> > > > > >> > mustered, and would be happy to volunteer to help out with that.
> > > > >> >
> > > > >>
> > > > >> Here's info on the points brought up:
> > > > >>
> > > > > >> Encryption support shortcoming is being addressed here:
> > > > >> https://issues.apache.org/jira/browse/HBASE-13693 (closed)
> > > > >> https://issues.apache.org/jira/browse/HBASE-13720 (in review)
> > > > >>
> > > > > >> Hbck has actually been run against the integration test rigs while
> > > > > >> the feature has been enabled but currently has no explicit unit test
> > > > > >> or simple to run integration test.  It currently doesn't report
> > > > > >> anything special about the mob storage area. We can add unit tests
> > > > > >> that cover hbck when the mob path is exercised.
> > > > > >>
> > > > > >> Another suggestion was a tool to check that mob references had
> > > > > >> corresponding mob data.  We currently include a mr-based sweeper job
> > > > > >> that could be used to perform this verification.  We can add this
> > > > > >> tool and testing for the tool.
> > > > > >>
> > > > > >> I've done some performance testing and Jingcheng and his colleagues
> > > > > >> have done significant amounts of performance testing. We currently
> > > > > >> have a blog post in progress that will share the results of this
> > > > > >> performance testing.
> > > > >> Jon.
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > > >> On Wed, May 20, 2015 at 7:38 PM, Ted Yu <yuzhihong@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > This is a useful feature, Jon.
> > > > > >> >
> > > > > >> > I went over the mega-patch and left some comments on review board.
> > > > > >> >
> > > > > >> > I noticed that hbck was not included in the patch. Neither did I
> > > > > >> > find a sub-task of HBASE-11339 that covers hbck.
> > > > > >> >
> > > > > >> > Do you or Jingcheng plan to add MOB-aware capability for hbck?
> > > > > >> >
> > > > > >> > Cheers
> > > > > >> >
> > > > > >> > On Wed, May 20, 2015 at 9:21 AM, Jonathan Hsieh
> > > > > >> > <jon@cloudera.com> wrote:
> > > > >> >
> > > > >> > > Hi folks,
> > > > >> > >
> > > > > >> > > The Medium Object (MOB) Storage feature (HBASE-11339[1]) is a
> > > > > >> > > modified I/O and compaction path that allows individual
> > > > > >> > > moderately sized values (10k-10MB) to be stored so that write
> > > > > >> > > amplification is reduced when compared to the normal I/O path.
> > > > > >> > > At a high level, it provides alternate flush and compaction
> > > > > >> > > mechanisms that segregate large cells into a separate area
> > > > > >> > > where they are not subject to potentially frequent compaction
> > > > > >> > > and splits that can be encountered in the normal I/O path. A
> > > > > >> > > more detailed design doc can be found on the hbase-11339 jira.
> > > > > >> > >
> > > > > >> > > Jingcheng Du has been working on the mob feature for a while and
> > > > > >> > > Anoop, Ram and I have been shepherding him through the design
> > > > > >> > > revisions and implementation of the feature in the hbase-11339
> > > > > >> > > branch.[2]
> > > > > >> > >
> > > > > >> > > The branch we are proposing to merge into master is compatible
> > > > > >> > > with HBase's core functionality including snapshots,
> > > > > >> > > replication, shell support, behaves well with table alters,
> > > > > >> > > bulk loads and does not require external MR processes. It has
> > > > > >> > > been documented, and subjected to many integration test runs
> > > > > >> > > (ITBLL, ITAcidGuarantees, ITIngest) including fault injection.
> > > > > >> > > Performance testing of the feature shows what can be a 2x-3x
> > > > > >> > > throughput improvement for workloads that contain mobs. These
> > > > > >> > > results can be seen on the hbase 2.0 panel discussion slides
> > > > > >> > > from hbasecon (once published).
> > > > > >> > >
> > > > > >> > > Recently there have been some hfile encryption related
> > > > > >> > > shortcomings that we could address in branch or in master.
> > > > > >> > >
> > > > > >> > > Earlier iterations of the feature have been tested in production
> > > > > >> > > by users that Jingcheng has been responsible for.  A version has
> > > > > >> > > also been deployed at users I have been responsible for.  Some
> > > > > >> > > of the folks from Huawei (ashutosh) have also been submitting
> > > > > >> > > the recent encryption bug reports against the hbase-11339 branch
> > > > > >> > > so there is some evidence of usage by them.
> > > > > >> > >
> > > > > >> > > The four of us (Jingcheng, Ram, Anoop and I) are satisfied with
> > > > > >> > > the feature and feel it is a good time to call a merge vote.
> > > > > >> > > I've posted a megapatch version for folks who want to peruse the
> > > > > >> > > code. [3]
> > > > >> > >
> > > > >> > > What do you all think?
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > > Jingcheng, Jon, Ram, and Anoop.
> > > > >> > >
> > > > >> > > [1] https://issues.apache.org/jira/browse/HBASE-11339
> > > > >> > > [2] https://github.com/apache/hbase/tree/hbase-11339
> > > > >> > > [3] https://reviews.apache.org/r/34475/
> > > > >> > > --
> > > > >> > > // Jonathan Hsieh (shay)
> > > > >> > > // HBase Tech Lead, Software Engineer, Cloudera
> > > > >> > > // jon@cloudera.com // @jmhsieh
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> // Jonathan Hsieh (shay)
> > > > >> // HBase Tech Lead, Software Engineer, Cloudera
> > > > >> // jon@cloudera.com // @jmhsieh
> > > > >>
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
>
> --
> // Jonathan Hsieh (shay)
> // HBase Tech Lead, Software Engineer, Cloudera
> // jon@cloudera.com // @jmhsieh
>


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
