hadoop-common-user mailing list archives

From Torsten Curdt <tcu...@apache.org>
Subject Re: Yahoo's production webmap is now on Hadoop
Date Tue, 19 Feb 2008 18:25:46 GMT
Wow! Congrats!

On 19.02.2008, at 18:58, Owen O'Malley wrote:

> The link inversion and ranking algorithms for Yahoo Search are now  
> being generated on Hadoop:
>
> http://developer.yahoo.com/blogs/hadoop/2008/02/yahoo-worlds-largest-production-hadoop.html
>
> Some Webmap size data:
>
>     * Number of links between pages in the index: roughly 1 trillion links
>     * Size of output: over 300 TB, compressed!
>     * Number of cores used to run a single Map-Reduce job: over 10,000
>     * Raw disk used in the production cluster: over 5 Petabytes
>

