hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: [DISCUSSION] MR jobs started by Master or RS
Date Fri, 23 Sep 2016 19:24:25 GMT
>>  -1 on that backup be in core hbase

Not sure I understand what it means.

1. We are not allowed to use Master to orchestrate the whole process? We
have already brought up all the advantages of using Master and distributed
procedures for backup and restore.


The downside of moving this to a client tool is the lack of fault tolerance:
 1.1 The client won't be allowed to perform any operations that can
potentially affect the cluster, such as disabling splits/merges or the balancer.
 1.2 If the client fails, who will perform the rollback? We are trying to
make the operation atomic.

Security is not clear.

2. We are not allowed to modify the code of existing HBase core classes (what
does core mean anyway)?

3. We are not allowed to create the backup system table (hbase:backup) in the
system space? Only in user space? The table is global.

Point 2 is critical. Although 95% of the code is new, we have of course
touched some existing HBase code.
Point 3 is not that critical; we can of course move the backup system table
into user space.

And finally, will moving backup into an external tool give us a +1 from Stack?

-Vlad





On Fri, Sep 23, 2016 at 11:26 AM, Stack <stack@duboce.net> wrote:

> On Fri, Sep 23, 2016 at 11:22 AM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
>
> > >> + MR is dead
> >
> > Does MR know that? :)
> >
> > Again. With all due respect, stack - still no suggestions what should we
> > use for "bulk data move and transformation" instead of MR?
> >
>
> Use whatever distributed engine suits your fancy -- MR, Spark, distributed
> shell -- just don't have HBase core depend on it, even optionally.
>
>
> > I suggest voting first on "do we need backup in HBase"? In my opinion, some
> > group members are still not sure about that and some will give -1
> > in any case. Just because ...
> >
> >
> We could run a vote, sure. -1 on that backup be in core hbase (+1 on adding
> all the API any such external tool might need to run).
>
> St.Ack
>
>
>
> > -Vlad
> >
> >
> >
> >
> >
> >
> > On Fri, Sep 23, 2016 at 10:57 AM, Stack <stack@duboce.net> wrote:
> >
> > > On Fri, Sep 23, 2016 at 6:46 AM, Matteo Bertozzi <theo.bertozzi@gmail.com> wrote:
> > >
> > > > let me try to go back to my original topic.
> > > > this question was meant to be generic, and provide some rule for future
> > > > code.
> > > >
> > > > from what I can gather, a rule that may satisfy everyone can be:
> > > >  - we don't want any core feature (e.g. compaction/log-split/log-replay)
> > > > over MR, because some cluster may not want or may have an
> > > > external/uncontrolled MR setup.
> > > >
> > >
> > > +1
> > >
> > >
> > > >  - we allow non-core features (e.g. features enabled by a flag) to run MR
> > > > jobs from hbase, because unless you use the feature, MR is not required.
> > > >
> > > >
> > > -1 to hbase core depending on MR or core -- whether behind a flag or not --
> > > ever being able to launch MR jobs.
> > >
> > > + MR is dead. We should be busy working hard to undo it from hbase-server
> > > moving it out to be an optional module (Spark would be its peer).
> > > + Master is a rats nest of state. Matteo, Stephen, and Appy are busy
> > > working hard on moving it up on to a new foundation. Lets not clutter task
> > > harder by piling on more moving parts.
> > >
> > > St.Ack
> > >
> > >
> > > > Matteo
> > > >
> > > >
> > > > On Fri, Sep 23, 2016 at 5:39 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > I suggest you look at Matteo's work for AssignmentManager which is to
> > > > > make Master more stable.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Fri, Sep 23, 2016 at 5:32 AM, 张铎 <palomino219@gmail.com> wrote:
> > > > >
> > > > > > No, not your fault, at least, not this time :)
> > > > > >
> > > > > > Why do I call the code ugly? Can you simply tell me the sequence of
> > > > > > calls when starting up the HMaster? HMaster is also a regionserver, so
> > > > > > it extends HRegionServer, and the initialization of HRegionServer
> > > > > > sometimes needs to make rpc calls to HMaster. A simple change could
> > > > > > cause a probabilistic deadlock or some strange NPEs...
> > > > > >
> > > > > > That's why I'm very nervous when somebody wants to add new features or
> > > > > > add external dependencies to HMaster, especially more work for the
> > > > > > startup process...
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > 2016-09-23 20:02 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:
> > > > > >
> > > > > > > I read through HADOOP-13433
> > > > > > > <https://issues.apache.org/jira/browse/HADOOP-13433> - the cited race
> > > > > > > condition is in jdk.
> > > > > > >
> > > > > > > Suggest pinging the reviewer on JIRA to get it moving.
> > > > > > >
> > > > > > > bq. But the ugly code in HMaster is really a problem...
> > > > > > >
> > > > > > > Can you be specific as to which code is ugly? Is it in the backup /
> > > > > > > restore mega patch?
> > > > > > >
> > > > > > > Cheers
> > > > > > >
> > > > > > > On Thu, Sep 22, 2016 at 10:44 PM, 张铎 <palomino219@gmail.com> wrote:
> > > > > > >
> > > > > > > > If you guys have already implemented the feature in the MR way and
> > > > > > > > the patch is ready for landing on master, I'm a -0 on it as I do not
> > > > > > > > want to block the development progress.
> > > > > > > >
> > > > > > > > But I strongly suggest that we later revisit the design and see if
> > > > > > > > we can separate the logic from HMaster as much as possible. HA is
> > > > > > > > not a big problem if you do not store any metadata locally. But the
> > > > > > > > ugly code in HMaster is really a problem...
> > > > > > > >
> > > > > > > > And for security, I have an issue that has been pending for a long
> > > > > > > > time. Can someone help take a quick look at it? This is what I mean
> > > > > > > > by ugly code... logout and destroy the credentials in a subject when
> > > > > > > > it is still being used, and declared as LimitPrivacy so I can not
> > > > > > > > change the behavior and the only way to fix it is to write another
> > > > > > > > piece of ugly code...
> > > > > > > >
> > > > > > > > https://issues.apache.org/jira/browse/HADOOP-13433
> > > > > > > >
> > > > > > > > 2016-09-23 12:53 GMT+08:00 Vladimir Rodionov <vladrodionov@gmail.com>:
> > > > > > > >
> > > > > > > > > >> If in the future, we find better ways of doing this without using
> > > > > > > > > MR, we can certainly consider that
> > > > > > > > >
> > > > > > > > > Our framework for distributed operations is abstract and allows
> > > > > > > > > different implementations. MR is just one implementation we provide.
> > > > > > > > >
> > > > > > > > > -Vlad
> > > > > > > > >
> > > > > > > > > On Thu, Sep 22, 2016 at 9:38 PM, Devaraj Das <ddas@hortonworks.com> wrote:
> > > > > > > > >
> > > > > > > > > > Guys, first off apologies for bringing in the topic of MR-based
> > > > > > > > > > compactions.. But I was thinking more about the SpliceMachine
> > > > > > > > > > approach of managing compactions in Spark where apparently they
> > > > > > > > > > saw a lot of benefits. Apologies for giving you that sore throat
> > > > > > > > > > Andrew; I really didn't mean to :-)
> > > > > > > > > >
> > > > > > > > > > So on this issue, we have these on the plate:
> > > > > > > > > > 0. Somehow not use MR but something like that
> > > > > > > > > > 1. Run a standalone service other than master
> > > > > > > > > > 2. Shell out from the master
> > > > > > > > > >
> > > > > > > > > > I don't think we have a good answer to (0), and I don't think it's
> > > > > > > > > > even worth the effort of trying to build something when MR is
> > > > > > > > > > already there, and being used by HBase already for some operations.
> > > > > > > > > >
> > > > > > > > > > On (1), we have to deal with a myriad of issues - HA of the server
> > > > > > > > > > not being the least of them all. Security (kerberos authentication,
> > > > > > > > > > another keytab to manage, etc. etc. etc.). IMO, that approach is
> > > > > > > > > > DOA. Instead let's substitute that (1) with the HBase Master. I
> > > > > > > > > > haven't seen any good reason why the HBase master shouldn't launch
> > > > > > > > > > MR jobs if needed. It's not ideal; agreed.
> > > > > > > > > >
> > > > > > > > > > Now before going to (2), let's see what are the benefits of running
> > > > > > > > > > the backup/restore jobs from the master. I think Ted has summarized
> > > > > > > > > > some of the issues that we need to take care of - basically, the
> > > > > > > > > > master can keep track of running jobs, and should it fail, the
> > > > > > > > > > backup master can continue keeping track of it (since the jobId
> > > > > > > > > > would have been recorded in the proc WAL). The master can also do
> > > > > > > > > > cleanup, etc. of failed backup/restore processes. Security is
> > > > > > > > > > another issue - the job needs to run as 'hbase' since it owns the
> > > > > > > > > > data. Having the master launch the job makes it get that privilege.
> > > > > > > > > > In the (2) approach, it's hard to do some of the above management.
> > > > > > > > > >
> > > > > > > > > > Guys, just to reiterate, the patch as such is ready from the overall
> > > > > > > > > > design/arch point of view (maybe code review is still pending from
> > > > > > > > > > Matteo). If in the future, we find better ways of doing this without
> > > > > > > > > > using MR, we can certainly consider that. But IMO don't think we
> > > > > > > > > > should block this patch from getting merged.
> > > > > > > > > >
> > > > > > > > > > ________________________________________
> > > > > > > > > > From: 张铎 <palomino219@gmail.com>
> > > > > > > > > > Sent: Thursday, September 22, 2016 8:32 PM
> > > > > > > > > > To: dev@hbase.apache.org
> > > > > > > > > > Subject: Re: [DISCUSSION] MR jobs started by Master or RS
> > > > > > > > > >
> > > > > > > > > > So what about a standalone service other than master? You can use
> > > > > > > > > > your own procedure store in that service?
> > > > > > > > > >
> > > > > > > > > > 2016-09-23 11:28 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:
> > > > > > > > > >
> > > > > > > > > > > An earlier implementation was client driven.
> > > > > > > > > > >
> > > > > > > > > > > But with that approach, it is hard to resume if there is an error
> > > > > > > > > > > midway. Using Procedure V2 makes the backup / restore more robust.
> > > > > > > > > > >
> > > > > > > > > > > Another consideration is for security. It is hard to enforce
> > > > > > > > > > > security (to be implemented) for client driven actions.
> > > > > > > > > > >
> > > > > > > > > > > Cheers
> > > > > > > > > > >
> > > > > > > > > > > > On Sep 22, 2016, at 8:15 PM, Andrew Purtell <andrew.purtell@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > No, this misses Matteo's finer point, which is "shelling out" from
> > > > > > > > > > > > the master directly to run MR is a first. Why not drive this with a
> > > > > > > > > > > > utility derived from Tool?
> > > > > > > > > > > >
> > > > > > > > > > > > On Sep 22, 2016, at 7:57 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >>>> In our production cluster, it is a common case we just have HDFS and
> > > > > > > > > > > >>>> HBase deployed.
> > > > > > > > > > > >>>> If our Master/RS depend on MR framework (especially some features we
> > > > > > > > > > > >>>> have not used at all), it introduced another cost for maintain. I
> > > > > > > > > > > >>>> don't think it is a good idea.
> > > > > > > > > > > >>
> > > > > > > > > > > >> So, you are not backup users in this case. Many of our customers have
> > > > > > > > > > > >> the full stack deployed and want to see backup as a standard feature.
> > > > > > > > > > > >> Besides this, nothing will happen in your cluster if you don't run
> > > > > > > > > > > >> backups.
> > > > > > > > > > > >>
> > > > > > > > > > > >> This discussion (we do not want to see an M/R dependency) is going
> > > > > > > > > > > >> nowhere. We have already asked, at least twice, for a suggestion of
> > > > > > > > > > > >> another framework (other than M/R) for bulk data copy with
> > > > > > > > > > > >> *conversion*. Still waiting for suggestions.
> > > > > > > > > > > >>
> > > > > > > > > > > >> -Vlad
> > > > > > > > > > > >>
> > > > > > > > > > > >>
> > > > > > > > > > > >>
> > > > > > > > > > > >>
> > > > > > > > > > > >>> On Thu, Sep 22, 2016 at 7:49 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> If MR framework is not deployed in the cluster, hbase still functions
> > > > > > > > > > > >>> normally (post merge).
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> In terms of build time dependency, we have long been depending on
> > > > > > > > > > > >>> mapreduce. Take a look at ExportSnapshot.
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> Cheers
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <heng.chen.1986@gmail.com> wrote:
> > > > > > > > > > > >>>
> > > > > > > > > > > >>>> In our production cluster, it is a common case we just have HDFS and
> > > > > > > > > > > >>>> HBase deployed.
> > > > > > > > > > > >>>> If our Master/RS depend on MR framework (especially some features we
> > > > > > > > > > > >>>> have not used at all), it introduced another cost for maintain. I
> > > > > > > > > > > >>>> don't think it is a good idea.
> > > > > > > > > > > >>>>
> > > > > > > > > > > >>>> 2016-09-23 10:28 GMT+08:00 张铎 <palomino219@gmail.com>:
> > > > > > > > > > > >>>>> To be specific, take our nice Backup/Restore feature as an example.
> > > > > > > > > > > >>>>> If we think this is not a core feature of HBase, then we could make
> > > > > > > > > > > >>>>> it depend on MR, and start a standalone BackupManager instance that
> > > > > > > > > > > >>>>> submits MR jobs to do periodic maintenance. And if we think this is
> > > > > > > > > > > >>>>> a core feature that everyone should use, then we'd better implement
> > > > > > > > > > > >>>>> it without an MR dependency, like DLS.
> > > > > > > > > > > >>>>>
> > > > > > > > > > > >>>>> Thanks.
> > > > > > > > > > > >>>>>
> > > > > > > > > > > >>>>> 2016-09-23 10:11 GMT+08:00 张铎 <palomino219@gmail.com>:
> > > > > > > > > > > >>>>>
> > > > > > > > > > > >>>>>> I'm -1 on letting master or rs launch MR jobs. It is OK that some of
> > > > > > > > > > > >>>>>> our features depend on MR but I think the bottom line is that we
> > > > > > > > > > > >>>>>> should launch the jobs from outside manually or by other services.
> > > > > > > > > > > >>>>>>
> > > > > > > > > > > >>>>>> 2016-09-23 9:47 GMT+08:00 Andrew Purtell <andrew.purtell@gmail.com>:
> > > > > > > > > > > >>>>>>
> > > > > > > > > > > >>>>>>> Ok, got it. Well "shelling out" is on the line I think, so a fair
> > > > > > > > > > > >>>>>>> question.
> > > > > > > > > > > >>>>>>>
> > > > > > > > > > > >>>>>>> Can this be driven by a utility derived from Tool like our other MR
> > > > > > > > > > > >>>>>>> apps? The issue is needing the AccessController to decide if allowed?
> > > > > > > > > > > >>>>>>> But nothing prevents the user from running the job
> > > > > > > > > > > >>>>>>> manually/independently, right?
> > > > > > > > > > > >>>>>>>
> > > > > > > > > > > >>>>>>>> On Sep 22, 2016, at 3:44 PM, Matteo Bertozzi <theo.bertozzi@gmail.com> wrote:
> > > > > > > > > > > >>>>>>>>
> > > > > > > > > > > >>>>>>>> just a remark. my query was not about tools using MR (everyone i
> > > > > > > > > > > >>>>>>>> think is ok with those).
> > > > > > > > > > > >>>>>>>> the topic was about: "are we ok with running MR jobs from Master
> > > > > > > > > > > >>>>>>>> and RSs code?" since this will be the first time we do this
> > > this
> > > > > > > > > > > >>>>>>>>
> > > > > > > > > > > >>>>>>>> Matteo
> > > > > > > > > > > >>>>>>>>
> > > > > > > > > > > >>>>>>>>
> > > > > > > > > > > >>>>>>>>> On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <ddas@hortonworks.com> wrote:
> > > > > > > > > > > >>>>>>>>>
> > > > > > > > > > > >>>>>>>>> Very much agree; for tools like ExportSnapshot / Backup / Restore,
> > > > > > > > > > > >>>>>>>>> it's fine to be dependent on MR. MR is the right framework for such.
> > > > > > > > > > > >>>>>>>>> We should also do compactions using MR (just saying :) )
> > > > > > > > > > > >>>>>>>>> ________________________________________
> > > > > > > > > > > >>>>>>>>> From: Ted Yu <yuzhihong@gmail.com>
> > > > > > > > > > > >>>>>>>>> Sent: Thursday, September 22, 2016 2:00 PM
> > > > > > > > > > > >>>>>>>>> To: dev@hbase.apache.org
> > > > > > > > > > > >>>>>>>>> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
> > > > > > > > > > > >>>>>>>>>
> > > > > > > > > > > >>>>>>>>> I agree - backup / restore is in the same category as import /
> > > > > > > > > > > >>>>>>>>> export.
> > > > > > > > > > > >>>>>>>>>
> > > > > > > > > > > >>>>>>>>> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <andrew.purtell@gmail.com> wrote:
> > > > > > > > > > > >>>>>>>>>
> > > > > > > > > > > >>>>>>>>>> Backup is extra tooling around core in my opinion. Like import or
> > > > > > > > > > > >>>>>>>>>> export. Or the optional MOB tool. It's fine.
> > > > > > > > > > > >>>>>>>>>>
> > > > > > > > > > > >>>>>>>>>>> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <mbertozzi@apache.org> wrote:
> > > > > > > > > > > >>>>>>>>>>>
> > > > > > > > > > > >>>>>>>>>>> What's the latest opinion around running MR jobs from hbase (Master
> > > > > > > > > > > >>>>>>>>>>> or RS)?
> > > > > > > > > > > >>>>>>>>>>>
> > > > > > > > > > > >>>>>>>>>>> I remember in the past that there was discussion about not having
> > > > > > > > > > > >>>>>>>>>>> MR as a direct dependency of hbase.
> > > > > > > > > > > >>>>>>>>>>>
> > > > > > > > > > > >>>>>>>>>>> I think some of the discussion was around MOB, which had a MR job to
> > > > > > > > > > > >>>>>>>>>>> compact that was later transformed into a non-MR job in order to be
> > > > > > > > > > > >>>>>>>>>>> merged. I think we had a similar discussion for log split/replay.
> > > > > > > > > > > >>>>>>>>>>>
> > > > > > > > > > > >>>>>>>>>>> the latest is the new Backup feature (HBASE-7912), that runs a MR
> > > > > > > > > > > >>>>>>>>>>> job from the master to copy data or restore data.
> > > > > > > > > > > >>>>>>>>>>> (backup is also "not really core" as in.. if you don't use backup
> > > > > > > > > > > >>>>>>>>>>> you'll not end up running MR jobs, but this was probably true for
> > > > > > > > > > > >>>>>>>>>>> MOB as in "if you don't enable MOB you don't need MR")
> > > > > > > > > > > >>>>>>>>>>>
> > > > > > > > > > > >>>>>>>>>>> any thoughts? do we have a rule that says "we don't want to have
> > > > > > > > > > > >>>>>>>>>>> hbase run MR jobs, only tools started manually by the user can do
> > > > > > > > > > > >>>>>>>>>>> that". or can we start adding MR calls around without problems?
> > > > > > > > > > > >>>
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
