From: Abhishek Verma
To: general@hadoop.apache.org
Date: Mon, 2 Aug 2010 13:50:52 -0700
Subject: Re: Data World Record Falls as Computer Scientists Break Terabyte Sort Barrier
In-Reply-To: <17C47A26-3B02-4BEA-B862-7849391A6D74@yahoo-inc.com>

It shows how far behind Hadoop is in terms of performance. Are there people
working on finding the bottlenecks and making it more efficient? Are there
any JIRA issues related to this?

-Abhishek.

On Mon, Aug 2, 2010 at 11:47 AM, Arun C Murthy wrote:

> The UCSD results are very impressive, especially given their hardware
> budget.
>
> I may be wrong, but I'm pretty sure there were no Hadoop based entries this
> year - I know we at Yahoo! didn't enter.
>
> Couple of points:
> # The Indy category is a benchmark to sort fixed length records, not a
> _general_ sort benchmark i.e. Daytona.
> # Our _best_ result missed the deadline by a whisker last year, but we
> eventually did a 100TB sort in 95 mins and a 1000TB (1PB) in 975 mins (16.25
> hrs) - which worked out to be just over 1.0 TB/min, which was nearly twice
> as fast as the record attributed to us.
> (http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html)
>
> Arun
>
>
> On Aug 2, 2010, at 10:34 AM, Abhishek Verma wrote:
>
>> Hi Maxim,
>>
>> Hadoop was not involved. You can find more details here:
>> http://sortbenchmark.org/tritonsort_2010_May_15.pdf
>> and all the records and their information here:
>> http://sortbenchmark.org/
>>
>> -Abhishek.
>>
>> On Mon, Aug 2, 2010 at 9:52 AM, Maxim Veksler wrote:
>>
>>> Hi,
>>>
>>> Does anyone know if Hadoop is involved? And if so, what is the
>>> configuration for such a cluster?
>>>
>>> http://ucsdnews.ucsd.edu/newsrel/science/07-27DataWorld.asp
>>>
>>>
>>> Thank you,
>>> Maxim.