hbase-dev mailing list archives

From Andrew Purtell <andrew.purt...@gmail.com>
Subject Re: [DISCUSSION] MR jobs started by Master or RS
Date Fri, 23 Sep 2016 03:15:29 GMT
No, this misses Matteo's finer point, which is that "shelling out" from the master directly to run
MR is a first. Why not drive this with a utility derived from Tool?

On Sep 22, 2016, at 7:57 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:

>>> In our production cluster, it is a common case that we just have HDFS and
>>> HBase deployed.
>>> If our Master/RS depend on the MR framework (especially for some features we
>>> have not used at all), it introduces another maintenance cost. I
>>> don't think it is a good idea.
> 
> So, you are not backup users in this case. Many of our customers have the full
> stack deployed and want to see backup as a standard feature. Besides this,
> nothing will happen in your cluster if you won't be doing backups.
> 
> This discussion (we do not want to see an M/R dependency) is going nowhere. We
> have already asked, at least twice, for suggestions of another framework (other
> than M/R) for bulk data copy with *conversion*. Still waiting for suggestions.
> 
> -Vlad
> 
> 
> 
> 
>> On Thu, Sep 22, 2016 at 7:49 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> 
>> If MR framework is not deployed in the cluster, hbase still functions
>> normally (post merge).
>> 
>> In terms of build time dependency, we have long been depending on
>> mapreduce. Take a look at ExportSnapshot.
>> 
>> Cheers
>> 
>> On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <heng.chen.1986@gmail.com>
>> wrote:
>> 
>>> In our production cluster, it is a common case that we just have HDFS and
>>> HBase deployed.
>>> If our Master/RS depend on the MR framework (especially for some features we
>>> have not used at all), it introduces another maintenance cost. I
>>> don't think it is a good idea.
>>> 
>>> 2016-09-23 10:28 GMT+08:00 张铎 <palomino219@gmail.com>:
>>>> To be specific, for example, our nice Backup/Restore feature, if we think
>>>> this is not a core feature of HBase, then we could make it depend on MR,
>>>> and start a standalone BackupManager instance that submits MR jobs to do
>>>> periodical maintenance job. And if we think this is a core feature that
>>>> everyone should use it, then we'd better implement it without MR
>>>> dependency, like DLS.
>>>> 
>>>> Thanks.
>>>> 
>>>> 2016-09-23 10:11 GMT+08:00 张铎 <palomino219@gmail.com>:
>>>> 
>>>>> I'm -1 on let master or rs launch MR jobs. It is OK that some of our
>>>>> features depend on MR but I think the bottom line is that we should launch
>>>>> the jobs from outside manually or by other services.
>>>>> 
>>>>> 2016-09-23 9:47 GMT+08:00 Andrew Purtell <andrew.purtell@gmail.com>:
>>>>> 
>>>>>> Ok, got it. Well "shelling out" is on the line I think, so a fair
>>>>>> question.
>>>>>> 
>>>>>> Can this be driven by a utility derived from Tool like our other MR apps?
>>>>>> The issue is needing the AccessController to decide if allowed? But nothing
>>>>>> prevents the user from running the job manually/independently, right?
>>>>>> 
>>>>>>> On Sep 22, 2016, at 3:44 PM, Matteo Bertozzi <theo.bertozzi@gmail.com> wrote:
>>>>>>> 
>>>>>>> just a remark. my query was not about tools using MR (everyone i think is
>>>>>>> ok with those).
>>>>>>> the topic was about: "are we ok with running MR jobs from Master and RSs
>>>>>>> code?" since this will be the first time we do this
>>>>>>> 
>>>>>>> Matteo
>>>>>>> 
>>>>>>> 
>>>>>>>> On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <ddas@hortonworks.com> wrote:
>>>>>>>> 
>>>>>>>> Very much agree; for tools like ExportSnapshot / Backup / Restore, it's
>>>>>>>> fine to be dependent on MR. MR is the right framework for such. We should
>>>>>>>> also do compactions using MR (just saying :) )
>>>>>>>> ________________________________________
>>>>>>>> From: Ted Yu <yuzhihong@gmail.com>
>>>>>>>> Sent: Thursday, September 22, 2016 2:00 PM
>>>>>>>> To: dev@hbase.apache.org
>>>>>>>> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
>>>>>>>> 
>>>>>>>> I agree - backup / restore is in the same category as import / export.
>>>>>>>> 
>>>>>>>> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <andrew.purtell@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Backup is extra tooling around core in my opinion. Like import or export.
>>>>>>>>> Or the optional MOB tool. It's fine.
>>>>>>>>> 
>>>>>>>>>> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <mbertozzi@apache.org> wrote:
>>>>>>>>>> 
>>>>>>>>>> What's the latest opinion around running MR jobs from hbase (Master or
>>>>>>>>>> RS)?
>>>>>>>>>> 
>>>>>>>>>> I remember in the past that there was discussion about not having MR as
>>>>>>>>>> a direct dependency of hbase.
>>>>>>>>>> 
>>>>>>>>>> I think some of the discussion was around MOB, which had a MR job to
>>>>>>>>>> compact that was later transformed into a non-MR job in order to be
>>>>>>>>>> merged. I think we had a similar discussion for log split/replay.
>>>>>>>>>> 
>>>>>>>>>> the latest is the new Backup feature (HBASE-7912), that runs a MR job
>>>>>>>>>> from the master to copy data or restore data.
>>>>>>>>>> (backup is also "not really core" as in.. if you don't use backup you'll
>>>>>>>>>> not end up running MR jobs, but this was probably true for MOB as in "if
>>>>>>>>>> you don't enable MOB you don't need MR")
>>>>>>>>>> 
>>>>>>>>>> any thoughts? do we have a rule that says "we don't want to have hbase
>>>>>>>>>> run MR jobs, only tools started manually by the user can do that"? or
>>>>>>>>>> can we start adding MR calls around without problems?
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>> 
>> 
