hadoop-hdfs-user mailing list archives

From Wasif Riaz Malik <wma...@gmail.com>
Subject Re: Storing millions of small files
Date Tue, 22 May 2012 10:01:11 GMT

Hi Brendan,

The number of files that can be stored in HDFS is limited by the size of
the NameNode's RAM, since the entire namespace is held in memory. The
downside of storing small files is that you would saturate the NameNode's
RAM with a small data set (the sum of the sizes of all your small files).
That said, you can store around 100 million files (at least) using 60GB of
RAM on the NameNode. The downside of a large namespace is that the NameNode
might take up to an hour to recover from failures, but you can overcome
this issue by using the HA NameNode.
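To make the 100-million-files figure concrete, here is a rough sketch of the estimate. It assumes the commonly cited rule of thumb of roughly 150 bytes of NameNode heap per namespace object (file, directory, or block); the exact per-object cost varies by Hadoop version, so treat the constant as an assumption, not a measured number.

```python
# Back-of-envelope estimate of NameNode heap needed for N small files.
# ASSUMPTION: ~150 bytes of heap per namespace object (inode or block);
# real numbers depend on the Hadoop version and object details.
BYTES_PER_OBJECT = 150

def namenode_heap_estimate(num_files, blocks_per_file=1):
    """Estimated NameNode heap in bytes for num_files files."""
    # Each file costs one inode object plus one object per block.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT

# 100 million small files, one block each (typical for files < 10MB):
estimate_gb = namenode_heap_estimate(100_000_000) / 1e9
print(f"~{estimate_gb:.0f} GB of NameNode heap")
```

Under these assumptions the raw metadata comes to around 30GB, which is consistent with provisioning 60GB of RAM once JVM and operational overhead are accounted for.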

Are you planning to store more than 100 million files?

Wasif Riaz Malik

On Tue, May 22, 2012 at 11:39 AM, Brendan cheng <ccp999@hotmail.com> wrote:

> Hi,
> I read the HDFS architecture doc and it said HDFS is tuned for storing
> large files, typically gigabytes to terabytes. What is the downside of
> storing millions of small files (<10MB)? Or what HDFS settings are
> suitable for storing small files?
> Actually, I plan to find a distributed file system for storing many
> millions of files.
> Brendan
