hadoop-common-user mailing list archives

From Gert Pfeifer <pfei...@se.inf.tu-dresden.de>
Subject Name node heap space problem
Date Wed, 16 Jul 2008 12:32:50 GMT
I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
and one secondary name node.

I have 1788874 files and directories, 1465394 blocks = 3254268 total.
Heap Size max is 3.47 GB.
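
As a rough sanity check on these numbers, the sketch below adds up the object counts and estimates the heap they might need. The figure of about 150 bytes of name node heap per file, directory, or block is a commonly quoted rule of thumb and an assumption here, not something measured on this cluster; the variable names are purely illustrative.

```python
# Figures quoted from the name node status above.
files_and_dirs = 1_788_874
blocks = 1_465_394
max_heap_gb = 3.47

# ASSUMPTION: ~150 bytes of name node heap per object (rule of thumb, not measured).
BYTES_PER_OBJECT = 150

total_objects = files_and_dirs + blocks
estimated_bytes = total_objects * BYTES_PER_OBJECT
estimated_gb = estimated_bytes / 1024**3
fraction_of_heap = estimated_gb / max_heap_gb

print(f"total objects:        {total_objects}")
print(f"estimated heap:       {estimated_gb:.2f} GB")
print(f"share of max heap:    {fraction_of_heap:.0%}")
```

If the rule of thumb is anywhere near right, the steady-state namespace should fit well inside the configured heap, which would suggest the spike comes from something transient rather than from the object count alone.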

My problem is that I produce many small files. I therefore run a daily
cron job that goes over the new files, copies them into bigger files,
and deletes the small ones.
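
To make the shape of such a job concrete, here is a minimal sketch of the merge-then-delete pattern. It works on a local directory with plain Python file calls, not on HDFS, and it is not the actual program from this cluster; the function name `compact_small_files` and its parameters are hypothetical.

```python
import os


def compact_small_files(src_dir, merged_path, small_limit_bytes):
    """Append every file in src_dir smaller than small_limit_bytes to
    merged_path, then delete the originals. Returns the number merged.

    merged_path is assumed to live outside src_dir, so the merged file
    is never picked up as one of its own inputs.
    """
    merged = []
    with open(merged_path, "ab") as out:
        for name in sorted(os.listdir(src_dir)):
            path = os.path.join(src_dir, name)
            # Skip subdirectories and files that are already big enough.
            if not os.path.isfile(path):
                continue
            if os.path.getsize(path) >= small_limit_bytes:
                continue
            with open(path, "rb") as src:
                out.write(src.read())
            merged.append(path)
    # Delete only after the merged file has been written and closed.
    for path in merged:
        os.remove(path)
    return len(merged)
```

Deleting the small files only after the merged file is closed means a crash in the middle leaves duplicates rather than lost data.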

The problem is that as soon as I start this program, the heap usage of
the name node reaches 100%.

It is not only this program: even an fsck kills the cluster.

What could be the problem? There are not many small files right now and
still it doesn't work. I guess we have this problem since the upgrade to

Here is some additional data about the DFS:
Capacity       : 2 TB
DFS Remaining  : 1.19 TB
DFS Used       : 719.35 GB
DFS Used%      : 35.16 %

Thanks for hints,
