hadoop-general mailing list archives

From Abhishek Verma <vermaabhish...@gmail.com>
Subject Re: Data World Record Falls as Computer Scientists Break Terabyte Sort Barrier
Date Mon, 02 Aug 2010 20:50:52 GMT
It shows how far behind Hadoop is in terms of performance. Are there
people working on finding the bottlenecks and making it more efficient? Are
there any JIRA issues related to this?

-Abhishek.

On Mon, Aug 2, 2010 at 11:47 AM, Arun C Murthy <acm@yahoo-inc.com> wrote:

> The UCSD results are very impressive, especially given their hardware
> budget.
>
> I may be wrong, but I'm pretty sure there were no Hadoop based entries this
> year - I know we at Yahoo! didn't enter.
>
> Couple of points:
> # The Indy category is a benchmark for sorting fixed-length records, not a
> _general_ sort benchmark (i.e. Daytona).
> # Our _best_ result missed the deadline by a whisker last year, but we
> eventually did a 100TB sort in 95 mins and a 1000TB (1PB) sort in 975 mins
> (16.25 hrs), which worked out to just over 1.0 TB/min, nearly twice as
> fast as the record attributed to us. (
> http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html
> )
>
> Arun
>
>
> On Aug 2, 2010, at 10:34 AM, Abhishek Verma wrote:
>
>> Hi Maxim,
>>
>> Hadoop was not involved. You can find more details here :
>> http://sortbenchmark.org/tritonsort_2010_May_15.pdf
>> and all the records and their information here :
>> http://sortbenchmark.org/
>>
>> -Abhishek.
>>
>> On Mon, Aug 2, 2010 at 9:52 AM, Maxim Veksler <maxim@vekslers.org> wrote:
>>
>>> Hi,
>>>
>>> Does anyone know if Hadoop is involved? And if so, what is the
>>> configuration for such a cluster?
>>>
>>> http://ucsdnews.ucsd.edu/newsrel/science/07-27DataWorld.asp
>>>
>>>
>>> Thank you,
>>> Maxim.
>>>
>>>
>
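
The throughput figures quoted in Arun's message can be checked with a few lines of arithmetic. This is only a sketch: it assumes decimal units (1 PB = 1000 TB), and the function and variable names are illustrative and not taken from any Hadoop tool. The earlier record that the message compares against is not stated in the thread, so it is not included here.

```python
def throughput_tb_per_min(terabytes, minutes):
    """Return sort throughput in TB per minute."""
    return terabytes / minutes

# (data size in TB, elapsed minutes), as quoted in the message above
runs = {
    "100 TB sort": (100, 95),
    "1 PB sort": (1000, 975),
}

for name, (tb, mins) in runs.items():
    rate = throughput_tb_per_min(tb, mins)
    hours = mins / 60
    print(f"{name}: {rate:.3f} TB/min over {hours:.2f} hours")
```

Both runs come out slightly above 1 TB/min, and 975 minutes is 16.25 hours, which matches the figures in the message.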
