Mailing-List: contact general-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: general@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of shv.hadoop@gmail.com
 designates 209.85.213.176 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <607B1CAA-A48A-406C-B67E-2CAC11C2319C@hortonworks.com>
References: <7B10DC9F-982F-43ED-9A28-F6C7BE47EA35@hortonworks.com>
	<CA+ULb+ttqkZsFWE3q4BXERRaVuPXseP_2_aYL0ubyPUBPPUFNg@mail.gmail.com>
	<363C7E2A-25B6-4828-9ABE-34E9780D476D@hortonworks.com>
	<CA+ULb+tHCT_Uk++fB-8aqAQTUJwvyhmTA_LQ7eZNi00EyJ8AvA@mail.gmail.com>
	<AAEF512E-0CC9-4B94-9871-40D2F5772D07@hortonworks.com>
	<CA+D5cGwSgebECr2k8ZG9tK8bWSY26WfVmcRC2p0MRPchd6dBpw@mail.gmail.com>
	<F7747436-37FE-470F-8F0A-86A3BC292D61@hortonworks.com>
	<4E84E56D.2010503@apache.org>
	<5388141F-213D-4842-8F36-42D37FE622A4@hortonworks.com>
	<4E859730.7020306@apache.org>
	<8F4138FC-449A-413C-9A9C-273B5CC7A647@hortonworks.com>
	<CA+ULb+uTFCuqprGaK8RqVuuXOKmgBcJ7gLDsR6wvuA2iW3boEw@mail.gmail.com>
	<CADY20s4v+7sfssDAPduTLoD28fPwLYzsipmxAzSyx7ip=SFrwg@mail.gmail.com>
	<607B1CAA-A48A-406C-B67E-2CAC11C2319C@hortonworks.com>
Date: Sat, 1 Oct 2011 19:13:59 -0700
Message-ID: 
 <CAKtuutE8qgrqySMnhS-KbYBE0p=UDa1tszKtLno+bsNEa5xzXw@mail.gmail.com>
Subject: Re: Update on hadoop-0.23
From: Konstantin Shvachko <shv.hadoop@gmail.com>
To: general@hadoop.apache.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

I am very glad that the development and testing of 0.23 is going so well.
I see a lot of commits and hundreds of changes going in literally every day.
It is great to see the new technology building!

On the criticism of the 0.22 release.
Arun has a top-down view, and I agree a lot of progress has been
achieved with the framework.
My bottom-up view is that you first need a reliable storage layer. If
the file system loses blocks or, worse, messes up the image and/or
journals, the performance of the framework is your second problem. I
have said that before. Based on my experience it takes time to
stabilize a file system. Anybody seen one that has been stabilized in
less than 2 years?
I do not see the 0.22 release as a wasted effort. And if the progress
with it contributes to the 0.23 rush I am twice as happy.

Thanks,
--Konstantin

On Fri, Sep 30, 2011 at 3:00 PM, Arun C Murthy <acm@hortonworks.com> wrote:
>
> On Sep 30, 2011, at 1:13 PM, Todd Lipcon wrote:
>
>> On Fri, Sep 30, 2011 at 11:44 AM, Roman Shaposhnik <rvs@apache.org> wrote:
>>> I apologize if my level of institutional knowledge of these things is
>>> lacking, but do you have any
>>> benchmarking results between 0.22 and 0.20.2xx? The reason I'm asking
>>> is twofold -- I really
>>> would like to see objective numbers qualifying the viability of
>>> 0.22 from the performance standpoint,
>>> but more importantly I would really like to include the benchmarking
>>> code into Bigtop.
>>
>> 0.22 currently suffers from MAPREDUCE-2266, which, last time I
>> benchmarked it, caused a significant slowdown. iirc a terasort ran
>> something like twice as slow on my test cluster due to this bug.
>> 0.23/MR2 doesn't suffer from this bug.
>>
>
> I don't really know where to start. CHANGES.txt in branch-0.20-security has the full list.
>
> If I remember right, long ago (late 2009) we benchmarked .21 with gridmix and saw >30% prior to abandoning .21.
>
> Since then 0.20.2xx has had innumerable improvements to JobTracker, TaskTracker etc. etc.
> # JobTracker itself is almost thrice as fast as it used to be in 2009.
> # The scheduler is significantly better (>2x locality) and throughput.
> # TaskTracker has had innumerable fixes for dist.cache, task launch, shutdown (MR-2266 and lots of other similar fixes).
> # The MR runtime has fixes for latency on innumerable fronts.
>
> Other regressions:
> # Security
> # Support for multi-tenant clusters.
> # Tonnes of operability fixes (jobhistory, task logs i.e. MR-1100) for running MR clusters.
>
> The one redeeming aspect for .22 is the shuffle based on the work we did for winning Terasort/Petasort in 2009 but 0.23 has even more work there with zero-copy with netty (yaay! no more jetty! Thanks to @cdouglas).
>
>> In terms of bugs -- same question. Is there any publicly available
>> list of, at least, the critical
>> ones that make 0.22 not viable from your point of view?
>
> We marked a lot of them as blockers on .22 and they were discarded by the release master(s). branch-0.20-security/CHANGES.txt is the full list. I really can't spend time enumerating over 4000 commits and > 2000 (?) jiras to that branch at this point.
>
> In my opinion, as someone who has helped develop/run/support very large installs and done this for over 5 1/2 years, a major release with regression on features (security, multi-tenancy) and scalability, performance etc. is distinctly _unviable_.
>
> ----
>
> Again, none of this is meant to say you should invest time on fixing them or releasing 0.22 as it stands - just, please, don't label it in a manner which helps build unreasonable expectations among users about its viability & usability.
>
> thanks,
> Arun
>
>