From: Amogh Vasekar
To: common-user@hadoop.apache.org
Date: Mon, 19 Oct 2009 18:31:01 +0530
Subject: Re: Hadoop dfs can't allocate memory with enough hard disk space when data gets huge

Hi,

It would be more helpful if you provided the exact error here.

Also, Hadoop uses the local FS to store intermediate data, along with HDFS for the final output. If your job is memory intensive, try limiting the number of tasks you run in parallel on a machine.

Amogh

On 10/19/09 8:27 AM, "Kunsheng Chen" wrote:

I am running a Hadoop program to perform MapReduce work on the files inside a folder.

My program basically does Map and Reduce work: each line of any file is a pair of strings, and the result is a string associated with its number of occurrences across all files.

The program works fine until the number of files grows to about 80,000; then a 'cannot allocate memory' error occurs for some reason.

Each file contains around 50 lines, but the total size of all files is no more than 1.5 GB. There are 3 datanodes performing the computation, each of which has more than 10 GB of hard disk space left.

I am wondering whether this is normal for Hadoop because the data is too large, or whether it might be a problem with my program? It really shouldn't be, since Hadoop was developed for processing large data sets.

Any idea is well appreciated.
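[For readers landing on this thread from the archive: the per-node task limit Amogh suggests is configured in mapred-site.xml on each TaskTracker. A minimal sketch using the 0.20-era property names current at the time of this thread; the values shown are only illustrative and should be tuned to your nodes' RAM:]

```xml
<!-- mapred-site.xml (per TaskTracker): cap concurrent tasks per node.
     Both maximums default to 2 in 0.20-era releases. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>
</property>
<property>
  <!-- heap handed to each child task JVM; lowering this (or the task
       maximums above) reduces per-node memory pressure -->
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```

[Note the trade-off: fewer concurrent task slots per node means less memory pressure but longer job wall-clock time. A restart of the TaskTracker daemons is needed for the new limits to take effect.]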