hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: scalability limits getDetails, mapFile Readers?
Date Wed, 01 Mar 2006 23:25:48 GMT
Stefan,

I think you meant to send this to nutch-dev, not hadoop-dev.

Doug

Stefan Groschupf wrote:
> Hi,
> 
> We run into a problem in Nutch using MapFileOutputFormat#getReaders
> and getEntry.
> It shows up during summary generation, where for each segment we open
> as many readers as there are parts (part-0000 to part-n).
> With 80 tasktrackers and 80 segments that means
> 80 x 80 x 4 readers (parseData, parseText, content, crawl). A search server
> also needs to open as many files as the index searcher requires.
> So the problem is a FileNotFoundException ("Too many open files").
> 
> Opening and closing Readers for each Detail makes no sense. We could
> limit the number of open readers somehow and close the readers that
> haven't been used for the longest time (an LRU bound; a sketch follows
> the quoted message).
> But I'm not happy with that solution either, so any thoughts on how we
> can solve this problem in general?
> 
> Thanks.
> Stefan
> 
> 
> ---------------------------------------------
> blog: http://www.find23.org
> company: http://www.media-style.com
> 
> 
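A minimal sketch of the LRU-bounded reader cache Stefan describes, assuming Hadoop's MapFile.Reader(FileSystem, String, Configuration) constructor; the class name, the getReader() helper, and the limit of 256 open readers are illustrative assumptions, not Nutch's actual code:

import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;

public class ReaderCache {

  // 80 segments x 80 parts x 4 data types (parseData, parseText,
  // content, crawl) would mean 25,600 open readers without a bound --
  // far beyond a typical per-process file-descriptor limit.
  private static final int MAX_OPEN_READERS = 256;  // illustrative limit

  private final FileSystem fs;
  private final Configuration conf;

  // accessOrder=true keeps iteration order least-recently-used first,
  // so removeEldestEntry evicts (and closes) the stalest reader.
  private final Map<String, MapFile.Reader> readers =
      new LinkedHashMap<String, MapFile.Reader>(16, 0.75f, true) {
        protected boolean removeEldestEntry(Map.Entry<String, MapFile.Reader> eldest) {
          if (size() > MAX_OPEN_READERS) {
            try {
              eldest.getValue().close();  // close before evicting
            } catch (IOException e) {
              // the reader is being discarded anyway; log and move on
            }
            return true;
          }
          return false;
        }
      };

  public ReaderCache(FileSystem fs, Configuration conf) {
    this.fs = fs;
    this.conf = conf;
  }

  // Returns a cached reader for the given part directory, opening it on
  // demand and evicting the least-recently-used reader if over the limit.
  public synchronized MapFile.Reader getReader(String partDir) throws IOException {
    MapFile.Reader reader = readers.get(partDir);
    if (reader == null) {
      reader = new MapFile.Reader(fs, partDir, conf);  // assumed constructor
      readers.put(partDir, reader);
    }
    return reader;
  }
}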
