Date: Sat, 1 Oct 2011 19:13:59 -0700
Subject: Re: Update on hadoop-0.23
From: Konstantin Shvachko
To: general@hadoop.apache.org

I am very glad that the development and testing of 0.23 is going so well.
I see a lot of commits and hundreds of changes going in literally every day.
It is great to see the new technology building!

On the criticism of the 0.22 release.
Arun has a top-down view and I agree a lot of progress has been achieved
with the framework.
My bottom-up view is that you first need a reliable storage layer.
If the file system loses blocks or, worse, messes up the image
and/or journals, the performance of the framework is your second problem.
I have said that before. Based on my experience it takes time to stabilize a
file system. Anybody seen one that has been stabilized in less than 2 years?

I do not see the 0.22 release as a wasted effort. And if the progress with it
contributes to the 0.23 rush I am twice as happy.

Thanks,
--Konstantin

On Fri, Sep 30, 2011 at 3:00 PM, Arun C Murthy wrote:
>
> On Sep 30, 2011, at 1:13 PM, Todd Lipcon wrote:
>
>> On Fri, Sep 30, 2011 at 11:44 AM, Roman Shaposhnik wrote:
>>> I apologize if my level of institutional knowledge of these things is
>>> lacking, but do you have any
>>> benchmarking results between 0.22 and 0.20.2xx?
>>> The reason I'm asking
>>> is twofold -- I really
>>> would like to see objective numbers qualifying the viability of
>>> 0.22 from the performance standpoint,
>>> but more importantly I would really like to include the benchmarking
>>> code into Bigtop.
>>
>> 0.22 currently suffers from MAPREDUCE-2266, which, last time I
>> benchmarked it, caused a significant slowdown. iirc a terasort ran
>> something like twice as slow on my test cluster due to this bug.
>> 0.23/MR2 doesn't suffer from this bug.
>>
>
> I don't really know where to start. CHANGES.txt in branch-0.20-security has the full list.
>
> If I remember right, long ago (late 2009) we benchmarked .21 with gridmix and saw >30% prior to abandoning .21.
>
> Since then 0.20.2xx has had innumerable improvements to JobTracker, TaskTracker etc. etc.
> # JobTracker itself is almost thrice as fast as it used to be in 2009.
> # The scheduler is significantly better (>2x locality) and throughput.
> # TaskTracker has had innumerable fixes for dist.cache, task launch, shutdown (MR-2266 and lots of other similar fixes).
> # The MR runtime has fixes for latency on innumerable fronts.
>
> Other regressions:
> # Security
> # Support for multi-tenant clusters.
> # Tonnes of operability fixes (jobhistory, task logs i.e. MR-1100) for running MR clusters.
>
> The one redeeming aspect for .22 is the shuffle based on the work we did for winning Terasort/Petasort in 2009 but 0.23 has even more work there with zero-copy with netty (yaay! no more jetty! Thanks to @cdouglas).
>
>> In terms of bugs -- same question. Is there any publicly available
>> list of, at least, the critical
>> ones that make 0.22 not viable from your point of view?
>
> We marked a lot of them as blockers on .22 and they were discarded by the release master(s). branch-0.20-security/CHANGES.txt is the full list. I really can't spend time enumerating over 4000 commits and > 2000 (?) jiras to that branch at this point.
>
> In my opinion, as someone who has helped develop/run/support very large installs and done this for over 5 1/2 years, a major release with regression on features (security, multi-tenancy) and scalability, performance etc. is distinctly _unviable_.
>
> ----
>
> Again, none of this is meant to say you should invest time on fixing them or releasing 0.22 as it stands - just, please, don't label it in a manner which helps build unreasonable expectations among users about its viability & usability.
>
> thanks,
> Arun
>
>