hbase-dev mailing list archives

From Devaraj Das <d...@hortonworks.com>
Subject Re: [DISCUSSION] MR jobs started by Master or RS
Date Thu, 22 Sep 2016 23:32:59 GMT
Not practical to do those tools without MR, JM. We should be using the right framework for
the use cases at hand. MR fits this really well.
JM, when you say "if we can do without MR, then, why not?", do you have a framework in mind
that performs/scales as well as MR? Curious.
________________________________________
From: Jean-Marc Spaggiari <jean-marc@spaggiari.org>
Sent: Thursday, September 22, 2016 4:29 PM
To: dev
Subject: Re: [DISCUSSION] MR jobs started by Master or RS

Well, I'm just not using those features ;) But was hoping for the MOBs ;)
My point is, if we can do it without MR, then, why not? )

2016-09-22 19:25 GMT-04:00 Vladimir Rodionov <vladrodionov@gmail.com>:

> Forgot WALPlayer :)
>
> -Vlad
>
> On Thu, Sep 22, 2016 at 4:21 PM, Vladimir Rodionov <vladrodionov@gmail.com>
> wrote:
>
> > >> and
> > >> backups too, but don't want to bother having to install and configure YARN
> > >> just for that, as well as removing resources from HBase to give it to
> >
> > Any suggestions on how to do bulk data move with transformation from/to
> > HBase cluster w/o MapReduce?
> >
> > Opposition to M/R does not make sense imo, since we have a lot of tools
> > in HBase which depend on MapReduce:
> >
> > CountRows
> > CountCells
> > Import
> > Export
> > ImportTsv
> > ExportTsv
> > CopyTable
> > VerifyReplication
> > ExportSnapshot
> >
> > and new backup create/restore of course.
> >
> >
> > -Vlad
> >
> >
> >
> >
> > On Thu, Sep 22, 2016 at 4:15 PM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> >> My 2¢: I have a strong preference for NOT having a dependency on MR
> >> anywhere :( I run my HBase cluster without YARN. Just HBase and HDFS. I like
> >> all the features that we built. Would love to be able to use MOBs and
> >> backups too, but don't want to bother having to install and configure YARN
> >> just for that, as well as removing resources from HBase to give it to
> >> yarn....
> >>
> >> JMS
> >>
> >> 2016-09-22 18:44 GMT-04:00 Matteo Bertozzi <theo.bertozzi@gmail.com>:
> >>
> >> > just a remark. my query was not about tools using MR (everyone i think is
> >> > ok with those).
> >> > the topic was about: "are we ok with running MR jobs from Master and RSs
> >> > code?" since this will be the first time we do this
> >> >
> >> > Matteo
> >> >
> >> >
> >> > On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <ddas@hortonworks.com> wrote:
> >> >
> >> > > Very much agree; for tools like ExportSnapshot / Backup / Restore, it's
> >> > > fine to be dependent on MR. MR is the right framework for such. We should
> >> > > also do compactions using MR (just saying :) )
> >> > > ________________________________________
> >> > > From: Ted Yu <yuzhihong@gmail.com>
> >> > > Sent: Thursday, September 22, 2016 2:00 PM
> >> > > To: dev@hbase.apache.org
> >> > > Subject: Re: [DISCUSSION] MR jobs started by Master or RS
> >> > >
> >> > > I agree - backup / restore is in the same category as import / export.
> >> > >
> >> > > On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <andrew.purtell@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Backup is extra tooling around core in my opinion. Like import or export.
> >> > > > Or the optional MOB tool. It's fine.
> >> > > >
> >> > > > > On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <mbertozzi@apache.org>
> >> > > > wrote:
> >> > > > >
> >> > > > > What's the latest opinion around running MR jobs from hbase
> >> > > > > (Master or RS)?
> >> > > > >
> >> > > > > I remember in the past that there was discussion about not having MR
> >> > > > > as a direct dependency of hbase.
> >> > > > >
> >> > > > > I think some of the discussion was around MOB, which had an MR job to
> >> > > > > compact that was later transformed into a non-MR job to be merged.
> >> > > > > I think we had a similar discussion for log split/replay.
> >> > > > >
> >> > > > > the latest is the new Backup feature (HBASE-7912), which runs an MR job
> >> > > > > from the master to copy data or restore data.
> >> > > > > (backup is also "not really core", as in: if you don't use backup
> >> > > > > you'll not end up running MR jobs. but this was probably true for MOB
> >> > > > > too, as in "if you don't enable MOB you don't need MR")
> >> > > > >
> >> > > > > any thoughts? do we have a rule that says "we don't want to have hbase
> >> > > > > run MR jobs, only tools started manually by the user can do that"?
> >> > > > > or can we start adding MR calls around without problems?
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
