hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: Mounting HDFS as local file system
Date Thu, 02 Dec 2010 17:24:54 GMT

On Dec 2, 2010, at 9:22 AM, Mark Kerzner wrote:

> Brian,
> that almost answers my question. Still, are you saying that the problem of
> "Hadoop hates small files" does not exist?

Well, I'd say "hates" is too strong a word.  Several of the "costs" in HDFS (NN memory, latency,
efficiency) are a function of the number of files, and one needs to plan accordingly.
Some users can accept this fact and work with it; other, less sophisticated users simply
need to be told "don't save anything less than 10MB".
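
To put a rough number on the NN memory cost, here is a back-of-the-envelope sketch. It
assumes the commonly quoted rule of thumb of about 150 bytes of NameNode heap per file
or block object and a 64MB block size; neither figure comes from this thread, and the
function name is made up for illustration. Treat the result as an order-of-magnitude
estimate only.

```python
# Rough NameNode heap estimate for a data set stored as equally sized files.
# ASSUMPTION: ~150 bytes of heap per namespace object (file or block). This is
# a widely quoted rule of thumb, not a measured or guaranteed value.
BYTES_PER_OBJECT = 150

# ASSUMPTION: 64MB HDFS block size; adjust to match your cluster.
DEFAULT_BLOCK_SIZE = 64 * 1024 * 1024


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def estimate_namenode_bytes(total_bytes, file_size_bytes,
                            block_size_bytes=DEFAULT_BLOCK_SIZE):
    """Hypothetical helper: heap needed to track total_bytes of data
    split into files of file_size_bytes each."""
    num_files = ceil_div(total_bytes, file_size_bytes)
    # Every file occupies at least one block, however small it is.
    blocks_per_file = max(1, ceil_div(file_size_bytes, block_size_bytes))
    # One object for the file itself plus one per block.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    one_tib = 2 ** 40
    for label, size in (("1KB", 1024), ("10MB", 10 * 2 ** 20)):
        heap = estimate_namenode_bytes(one_tib, size)
        print(f"1TiB as {label} files: ~{heap / 2 ** 20:.0f} MiB of NN heap")
```

Under these assumptions, the same terabyte costs roughly four orders of magnitude more
NN heap as 1KB files than as 10MB files, which is the kind of difference the planning
has to account for.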

Over the Thanksgiving holiday, we had a process go awry and write 900,000 files, each smaller
than a kilobyte, into one XFS directory.  This was definitely inefficient and costly in system
resources, but I wouldn't say XFS hates small files.

So you need to include the costs of small files in your planning.  If you ignore this variable,
you will likely be bitten by these issues.

