hadoop-common-user mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Strange behavior - One reduce out of N reduces always fail.
Date Wed, 21 Feb 2007 19:58:16 GMT
Venkat Seeth wrote:
> Hi Andrzej,
>
> A quick question on your suggestion.
>
>>> Configuration:
>>> I have about 128 maps and 8 reduces so I get to
>>> create 8 partitions of my index.
>
>> I think that with this configuration you could increase the number of
>> reduces, to decrease the amount of data each reduce task has to handle.
>> In your current config you run at most 2 reduces per machine.
>
> You suggested to increase the number of reduces. I did
> come up with 8 partitions for my index each containing
> about 10 million documents.
>
> Are you saying I could probably create 32 partitions
> and then later merge into smaller number of
> partitions?
>
> If I have a huge number of partitions, I do not know
> how it'll affect federating search across these large
> number of indexes and merging the results from those
> searches. 
>
> Any thoughts are greatly appreciated.
>   


The only reason I suggested increasing the number of reduces is to get
you past the memory problems. From the search performance point of view,
you should definitely merge the partial indexes afterwards.
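
To make the trade-off concrete, here is a rough back-of-the-envelope sketch of the numbers discussed in this thread. It assumes the partitioner spreads documents evenly across reduces, and it assumes the partial indexes are merged back in equal-sized groups. The function names `docs_per_reduce` and `merge_plan` are hypothetical helpers for illustration only, not part of Hadoop or Lucene.

```python
def docs_per_reduce(total_docs, num_reduces):
    """Approximate documents handled by each reduce task.

    Assumes an even partitioner; rounds up so no document is dropped.
    """
    return -(-total_docs // num_reduces)


def merge_plan(num_partials, num_final):
    """Group partial index ids into num_final merge groups (round-robin).

    Purely illustrative: any grouping works as long as each partial
    index ends up in exactly one final index.
    """
    return [list(range(i, num_partials, num_final)) for i in range(num_final)]


# From the thread: 8 partitions of about 10 million documents each.
TOTAL_DOCS = 8 * 10_000_000

if __name__ == "__main__":
    for reduces in (8, 32):
        print(reduces, "reduces ->", docs_per_reduce(TOTAL_DOCS, reduces), "docs each")
    # Merge 32 partial indexes back into 8 searchable indexes.
    for final_id, partials in enumerate(merge_plan(32, 8)):
        print("final index", final_id, "<- partial indexes", partials)
```

With 32 reduces, each task would handle roughly a quarter of the data it handles with 8, which is the point of the suggestion. Merging four partial indexes into each final index afterwards keeps the number of indexes to search at 8.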

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


