From: Doug Cutting
Date: Thu, 06 Dec 2007 14:14:22 -0800
To: hadoop-user@lucene.apache.org
Subject: Re: Mapper Out of Memory
Message-ID: <4758743E.5040903@apache.org>
References: <701379.39182.qm@web57202.mail.re3.yahoo.com>
In-Reply-To: <701379.39182.qm@web57202.mail.re3.yahoo.com>

Rui Shi wrote:
> It is hard to believe that you need to enlarge heap size given the input size is only 10MB. In particular, you don't load all input at the same time. As for the program logic, no much fancy stuff, mostly cut and sorting. So GC should be able to handle...

Out-of-memory exceptions can also be caused by having too many files open at once. What does 'ulimit -n' show?

You presented an excerpt from a jobtracker log, right? What do the tasktracker logs show?

Can you monitor a node while it is running to see whether the JVM's heap is growing, or whether the number of open files (lsof -p) is large?

Also, can you please provide more details about your application? That is, what are your InputFormat, map function, etc.?

Doug
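
For the open-files check above, here is a rough sketch of one way to poll a task's descriptor count on a Linux node. It is not part of Hadoop: the helper names open_fd_count and soft_nofile_limit are made up for illustration, and it assumes /proc is mounted and that you have permission to read the target process's entries.

```python
import os


def open_fd_count(pid):
    """Count open file descriptors of a process by listing /proc/<pid>/fd.

    Linux only. When inspecting the current process, the count includes
    the transient directory handle used to do the listing itself.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def soft_nofile_limit(pid):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits.

    This is the per-process value that 'ulimit -n' reports in the shell
    that launched the process. Returns None if the limit is unlimited
    or the line is not found.
    """
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                soft = line.split()[3]  # fields: Max open files <soft> <hard> files
                return None if soft == "unlimited" else int(soft)
    return None
```

Calling open_fd_count(pid) every few seconds with the pid of the task's child JVM, and comparing it against soft_nofile_limit(pid), should show whether the count climbs toward the limit before the failure.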