Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-user@hadoop.apache.org
Message-ID: <7b2728090804301339q46421e6eobd9f1d040d1a0fcc@mail.gmail.com>
Date: Wed, 30 Apr 2008 13:39:27 -0700
From: "Lili Wu" <liliwu@gmail.com>
To: core-user@hadoop.apache.org
Subject: OOM error with large # of map tasks
Cc: samr@ning.com
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_Part_4659_21130744.1209587967719"

------=_Part_4659_21130744.1209587967719
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

We are using Hadoop 0.16 and are seeing a consistent problem: out-of-memory
errors when we have a large number of map tasks.
The specifics of what is submitted when we reproduce this:

Three large jobs:
1. 20,000 map tasks and 10 reduce tasks
2. 17,000 map tasks and 10 reduce tasks
3. 10,000 map tasks and 10 reduce tasks

These run at normal priority, and periodically we swap the priorities around
so that each job gets some tasks started and completed.
Other, smaller jobs come and go every hour or so (no more than 200 map
tasks, 4-10 reducers).

Our cluster consists of 23 nodes, with capacity for 69 concurrent map tasks
and 69 concurrent reduce tasks.
Eventually, we see consistent OOM errors in the task logs, and the
TaskTracker itself goes down on as many as 14 of our nodes.
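
For a sense of scale, here is a rough back-of-the-envelope sketch using only
the figures above. It assumes the 69 map tasks are a cluster-wide concurrency
limit, that every map slot stays busy, and that only the three large jobs are
running; the variable names are just for illustration.

```python
# Figures taken from the job list and cluster description above.
map_tasks_per_job = [20_000, 17_000, 10_000]
nodes = 23
concurrent_map_tasks = 69  # assumed to be a cluster-wide concurrency limit

total_map_tasks = sum(map_tasks_per_job)
maps_per_node = concurrent_map_tasks / nodes

# Assumption: all map slots stay busy and no other jobs compete for them.
# Ceiling division gives the minimum number of "waves" of map tasks.
min_waves = -(-total_map_tasks // concurrent_map_tasks)

print(f"total map tasks queued: {total_map_tasks}")
print(f"concurrent map tasks per node: {maps_per_node:.0f}")
print(f"minimum waves of map tasks: {min_waves}")
```

So each TaskTracker works through a very long backlog of map tasks, only a
few at a time.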

We examined a heap dump taken after one of these TaskTracker crashes and
found something interesting: there were 572 instances of JobConf, which
accounted for 940 MB of String objects. It seems odd that there are so many
JobConf instances. The count seems to correlate with tasks in the
COMMIT_PENDING state, as shown on the status for a TaskTracker node. Has
anyone observed something like this? Can anyone explain what would cause
tasks to remain in this state (which apparently is also held in memory rather
than serialized to disk)? In general, what does COMMIT_PENDING mean? (Job
done, but output not committed to DFS?)
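
To put the heap dump numbers in perspective, a quick sketch of the
arithmetic. It assumes the 940 MB is spread roughly evenly across the JobConf
instances, and that each node runs at most 3 map and 3 reduce tasks at once
(69 of each across 23 nodes); neither assumption was verified against the
dump itself.

```python
# Figures from the heap dump described above.
job_conf_instances = 572
string_mb_held = 940

# Assumption: memory is spread roughly evenly across JobConf instances.
mb_per_job_conf = string_mb_held / job_conf_instances

# Assumption: at most 3 map + 3 reduce tasks run concurrently per node.
running_tasks_per_node = 3 + 3
job_confs_per_running_task = job_conf_instances / running_tasks_per_node

print(f"approx. MB of Strings per JobConf: {mb_per_job_conf:.2f}")
print(f"JobConf instances per running task: {job_confs_per_running_task:.0f}")
```

That is far more JobConf instances than tasks that could be running at any
one time, which is why the count stood out to us.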

Thanks!

------=_Part_4659_21130744.1209587967719--