hadoop-common-user mailing list archives

From Venkat Seeth <sv...@yahoo.com>
Subject Re: Strange behavior - One reduce out of N reduces always fail.
Date Thu, 22 Feb 2007 21:45:25 GMT
Thank you Sami Siren, Andrzej Bialecki, Devaraj Das
and Mahadev Konar for your inputs. I finally was able
to get past 1 million with 2 changes.

1. Reduced the document size significantly.
2. Increased the file-handle limit from 1024 to 4096.
These 2 did the magic.
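
For anyone who wants to check the file-handle limit from inside a process, here is a minimal sketch. It assumes a Unix-like system and uses Python's standard `resource` module; the helper name `raise_nofile_limit` is made up for illustration and is not part of Hadoop. It only raises the soft limit for the current process, capped at the hard limit. A system-wide change such as the 1024 to 4096 one above is normally made through the shell or OS configuration instead.

```python
import resource


def raise_nofile_limit(target):
    """Raise this process's soft open-file limit toward `target`.

    Hypothetical helper for illustration only. The soft limit is never
    raised past the hard limit, and it is never lowered.
    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # An unprivileged process cannot exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Only act when the current soft limit is finite and below the target.
    if soft != resource.RLIM_INFINITY and soft < target:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))

    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    before = resource.getrlimit(resource.RLIMIT_NOFILE)
    after = raise_nofile_limit(4096)
    print("soft/hard before:", before)
    print("soft limit after:", after)
```

Child processes started afterwards inherit the raised limit, so a wrapper like this would have to run before the task JVMs are launched to have any effect on them.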

I was able to successfully process 5 million docs.
Planning a test for processing 25 million. I'll keep
things posted.

Thanks,
Venkat

--- Andrzej Bialecki <ab@getopt.org> wrote:

> Venkat Seeth wrote:
> > Hi Andrzej,
> >
> > A quick question on your suggestion.
> >
> >>> Configuration:
> >>> I have about 128 maps and 8 reduces so I get to
> >>> create 8 partitions of my index.
> >
> >> I think that with this configuration you could
> >> increase the number of reduces, to decrease the
> >> amount of data each reduce task has to handle.
> >> In your current config you run at most 2 reduces
> >> per machine.
> >
> > You suggested to increase the number of reduces. I
> > did come up with 8 partitions for my index each
> > containing about 10 million documents.
> >
> > Are you saying I could probably create 32
> > partitions and then later merge into smaller number
> > of partitions?
> >
> > If I have a huge number of partitions, I do not
> > know how it'll affect federating search across
> > these large number of indexes and merging the
> > results from those searches.
> >
> > Any thoughts are greatly appreciated.
> 
> 
> The only reason I suggested to increase the number
> of reduces is to get 
> you past the memory problems. From the search
> performance point of view 
> you should definitely merge partial indexes.
> 
> -- 
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
> 
> 
> 



 