hadoop-common-user mailing list archives

From "Dhruba Borthakur" <dhr...@yahoo-inc.com>
Subject RE: Max number of files in HDFS?
Date Tue, 28 Aug 2007 07:50:42 GMT
Running fsck and invoking getTotalFiles() seems to be the right way to figure
out the total number of files in the dfs.


-----Original Message-----
From: Taeho Kang [mailto:tkang1@gmail.com] 
Sent: Monday, August 27, 2007 11:59 PM
To: hadoop-user@lucene.apache.org
Cc: tkang1@gmail.com
Subject: Max number of files in HDFS?

Dear All,

Hi, my name is Taeho and I am trying to figure out the maximum number of
files a namenode can hold.
The main reason for doing this is that I want to have some estimates on how
many files I can put into the HDFS without overflowing the Namenode
machine's memory.

I know the number depends on the size of memory and how much is allocated
for the running JVM.
For the memory usage by the namenode, I can simply use Runtime object of
For the total number of files residing in the DFS, I am thinking of using the
getTotalFiles() function of the NamenodeFsck class in the
org.apache.hadoop.dfs package. Am I correct here in using NamenodeFsck?
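
A minimal sketch of the estimate described above, assuming the file count has
already been obtained separately (for example from fsck) and is passed in as a
plain number. The class and method names here are hypothetical and are not part
of Hadoop; only the standard java.lang.Runtime calls are used. It also assumes
memory use grows roughly linearly with the number of files, which ignores
differences in blocks per file.

```java
// Hypothetical helper, not part of Hadoop: estimates how many files fit in a heap.
final class NamenodeCapacityEstimate {

    /** Heap currently in use by the JVM this code runs in, in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Linear extrapolation: if totalFiles files currently cost usedBytes of heap,
     * estimate how many files would fit in maxHeapBytes.
     */
    static long estimateMaxFiles(long usedBytes, long totalFiles, long maxHeapBytes) {
        if (usedBytes <= 0 || totalFiles <= 0 || maxHeapBytes <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        double bytesPerFile = (double) usedBytes / totalFiles;
        return (long) (maxHeapBytes / bytesPerFile);
    }

    public static void main(String[] args) {
        // Assumption: the file count is supplied by the caller, e.g. read from fsck output.
        long totalFiles = args.length > 0 ? Long.parseLong(args[0]) : 1L;
        long used = usedHeapBytes();
        long max = Runtime.getRuntime().maxMemory(); // upper bound on heap, as set by -Xmx
        System.out.println("used heap bytes: " + used);
        System.out.println("estimated max files: " + estimateMaxFiles(used, totalFiles, max));
    }
}
```

Note that Runtime reports on the JVM it is called from, so the heap figure is
only meaningful for the namenode if it is measured inside the namenode process.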

Or, has anybody done similar experiments?

Any comments/suggestions will be appreciated.
Thanks in advance.
Best Regards,

Taeho Kang
Software Engineer, NHN Corporation, Seoul, South Korea
Homepage : tkang.blogspot.com
