From: Schubert Zhang
Date: Thu, 23 Feb 2012 18:49:32 +0800
Subject: Re: Optimized Hadoop
To: common-user@hadoop.apache.org
Cc: mapreduce-user@hadoop.apache.org, mapreduce-dev@hadoop.apache.org

@Todd,
Yes, in our first code tag, we intentionally stayed away from the security and user-control features.
This is because in our existing production deployments in the enterprise field, this feature is always turned off. I think this is mainly due to the different business models of Hanborq and others.

But we do plan to become fully compatible with Apache and Cloudera in the future.

As for the worker-pool implementation, it is true that we will continue to improve our solution.

Schubert Zhang

Looking at the code, it seems you only support the default task
executor. Do you have plans to support run-as-user through the linux
task-controller? It's a requirement for secure environments. But, it makes the worker pool model a little tougher since you can't share a JVM cross-user.



On Wed, Feb 22, 2012 at 7:34 PM, Dieter Plaetinck <dieter.plaetinck@intec.ugent.be> wrote:
Great work folks! Very interesting.

PS: did you notice if you google for "hanborq" or HDH it's very hard to find your website, hanborq.com ?

Dieter

On Tue, 21 Feb 2012 02:17:31 +0800
Schubert Zhang <zsongbo@gmail.com> wrote:

> We just updated the slides on these improvements:
>
> http://www.slideshare.net/hanborq/hanborq-optimizations-on-hadoop-mapreduce-20120216a
>
> Updates:
> (1) revised some descriptions to make things clearer and more accurate.
> (2) added some benchmarks to back them up.
>
> On Sat, Feb 18, 2012 at 11:12 PM, Anty <anty.rao@gmail.com> wrote:
>
> >
> >
> > On Fri, Feb 17, 2012 at 3:27 AM, Todd Lipcon <todd@cloudera.com> wrote:
> >
> >> Hey Schubert,
> >>
> >> Looking at the code on github, it looks like your rewritten shuffle is
> >> in fact just a backport of the shuffle from MR2. I didn't look closely
> >>
> >
> > Additionally, the rewritten shuffle in MR2 has some bugs that harm the
> > overall performance. I have already filed a JIRA to report this,
> > with a patch available:
> > MAPREDUCE-3685 <https://issues.apache.org/jira/browse/MAPREDUCE-3685>
> >
> >
> >
> >> - are there any distinguishing factors?
> >> Also, the OOB heartbeat and adaptive heartbeat code seems to be the
> >> same as what's in 1.0?
> >>
> >> -Todd
> >>
> >> On Thu, Feb 16, 2012 at 9:44 AM, Schubert Zhang <zsongbo@gmail.com>
> >> wrote:
> >> > Here is a presentation describing our work:
> >> >
> >> > http://www.slideshare.net/hanborq/hanborq-optimizations-on-hadoop-mapreduce-20120216a
> >> > Your advice is welcome.
> >> > It's just a small step, and we are continuing to make more improvements.
> >> > Thanks for your help.
> >> >
> >> >
> >> >
> >> >
> >> > On Thu, Feb 16, 2012 at 11:01 PM, Anty <anty.rao@gmail.com> wrote:
> >> >>
> >> >> Hi guys,
> >> >> We have just released an optimized Hadoop. If you are interested, please
> >> >> refer to https://github.com/hanborq/hadoop
> >> >>
> >> >> --
> >> >> Best Regards
> >> >> Anty Rao
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera
> >>
> >
> >
> >
> > --
> > Best Regards
> > Anty Rao
> >

