hadoop-common-dev mailing list archives

From Stefan Groschupf <...@media-style.com>
Subject Re: scalability limits getDetails, mapFile Readers?
Date Wed, 01 Mar 2006 23:28:35 GMT
Oops, sorry, my mistake. I will post it on nutch-dev again.

Am 02.03.2006 um 00:25 schrieb Doug Cutting:

> Stefan,
>
> I think you meant to send this to nutch-dev, not hadoop-dev.
>
> Doug
>
> Stefan Groschupf wrote:
>> Hi,
>> We run into a problem with nutch using   
>> MapFileOutputFormat#getReaders  and getEntry.
>> In detail this happens until summary generation where we open for   
>> each segment as much readers as much parts (part-0000 to part-n)  
>> we  have.
>> Having 80 tasktracker and 80 segments means:
>> 80 x 80 x 4 (parseData, parseText, content, crawl). A search  
>> server   also needs to open as much files as required for the  
>> index searcher.
>> So the problem is a FileNotFoundException, (Too many open files).
>> Opening and closing Readers for each Detail makes no sense. We  
>> may  can limit the number of readers somehow and close the readers  
>> that  wasn't used since the longest time.
>> But I'm not that happy with this solution, so any thoughts how we  
>> can  solve this problem in general?
>> Thanks.
>> Stefan
>> ---------------------------------------------
>> blog: http://www.find23.org
>> company: http://www.media-style.com
>
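
For illustration, here is a minimal sketch of the capped, LRU-evicting
reader pool suggested above. It is generic over any Closeable reader, so
the ReaderPool class name, the Opener interface, and the capacity
parameter are all hypothetical, not part of Nutch or Hadoop; in practice
the opener would wrap whatever call constructs a MapFile reader for a
segment part.

import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * A size-capped, LRU-evicting pool of readers. When the pool is full,
 * the reader that has gone unused the longest is closed and dropped,
 * keeping the number of open file handles bounded.
 */
public class ReaderPool<R extends Closeable> {

  /** Opens a reader for a given key, e.g. a segment part path. */
  public interface Opener<T> {
    T open(String key) throws IOException;
  }

  private final Opener<R> opener;
  private final LinkedHashMap<String, R> cache;

  public ReaderPool(final int capacity, Opener<R> opener) {
    this.opener = opener;
    // accessOrder=true: iteration order = least recently used first.
    this.cache = new LinkedHashMap<String, R>(16, 0.75f, true) {
      protected boolean removeEldestEntry(Map.Entry<String, R> eldest) {
        if (size() > capacity) {
          try {
            eldest.getValue().close();   // release the file handle
          } catch (IOException e) {
            // log and continue; eviction must not fail the lookup
          }
          return true;                   // drop the evicted entry
        }
        return false;
      }
    };
  }

  /** Returns a cached reader, opening (and possibly evicting) as needed. */
  public synchronized R get(String key) throws IOException {
    R reader = cache.get(key);
    if (reader == null) {
      reader = opener.open(key);
      cache.put(key, reader);            // may trigger LRU eviction
    }
    return reader;
  }
}

Note that this only bounds the handle count: a production version would
also need reference counting, since evicting and closing a reader that
another thread is still using would break that thread mid-read.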

---------------------------------------------
blog: http://www.find23.org
company: http://www.media-style.com


