Oops, sorry, my mistake. I will post it to nutch-dev again.
On 02.03.2006 at 00:25, Doug Cutting wrote:
> Stefan,
>
> I think you meant to send this to nutch-dev, not hadoop-dev.
>
> Doug
>
> Stefan Groschupf wrote:
>> Hi,
>> We are running into a problem with Nutch when using
>> MapFileOutputFormat#getReaders and getEntry.
>> Specifically, it happens during summary generation, where for
>> each segment we open as many readers as there are parts
>> (part-0000 to part-n).
>> With 80 tasktrackers and 80 segments that means
>> 80 x 80 x 4 (parseData, parseText, content, crawl) = 25,600
>> readers. A search server also needs to open as many files as the
>> index searcher requires.
>> The result is a FileNotFoundException (Too many open files).
>> Opening and closing readers for each detail makes no sense. We
>> could limit the number of readers somehow and close the ones
>> that have gone unused the longest.
>> But I'm not that happy with this solution, so any thoughts on how
>> we can solve this problem in general?
>> Thanks.
>> Stefan
>> ---------------------------------------------
>> blog: http://www.find23.org
>> company: http://www.media-style.com
>
---------------------------------------------
blog: http://www.find23.org
company: http://www.media-style.com
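
The idea from the quoted message, capping the number of open readers and closing the one that has gone unused the longest, could be sketched roughly as below. This is only an illustration in plain Java, not code from Nutch or Hadoop: the class name ReaderCache, the opener function and the maxOpen limit are all hypothetical, and any Closeable stands in for a real MapFile reader.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache that keeps at most maxOpen readers open. */
class ReaderCache<K, R extends Closeable> {
    // Assumption: opens a reader for a key, e.g. the path of one part file.
    private final Function<K, R> opener;
    private final Map<K, R> open;

    ReaderCache(final int maxOpen, Function<K, R> opener) {
        this.opener = opener;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.open = new LinkedHashMap<K, R>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, R> eldest) {
                if (size() <= maxOpen) {
                    return false;
                }
                try {
                    // Close the least recently used reader before dropping it.
                    eldest.getValue().close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return true;
            }
        };
    }

    /** Returns the cached reader for key, opening it on a miss. */
    synchronized R get(K key) {
        R reader = open.get(key); // a hit also marks the entry as recently used
        if (reader == null) {
            reader = opener.apply(key);
            open.put(key, reader); // may evict and close the eldest entry
        }
        return reader;
    }

    /** Number of readers currently held open. */
    synchronized int openCount() {
        return open.size();
    }
}
```

One caveat with this sketch: a reader can be evicted and closed while another thread is still reading from it, so a real version would need reference counting or a similar guard.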