hadoop-hdfs-user mailing list archives

From Stuart Smith <stu24m...@yahoo.com>
Subject Maximum number of files in directory? (in hdfs)
Date Wed, 18 Aug 2010 00:44:53 GMT
I'm looking at storing a large number of files under one directory.

I started to break the files into subdirectories out of habit (from working on NTFS, etc.),
but it occurred to me that, from a performance perspective, it may not really matter
on HDFS.
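
To make the subdirectory idea concrete, here is a minimal sketch of one common way to
split files: bucketing by a hash prefix of the file name. This is only an illustration.
The function name `sharded_path`, the `/data` root, and the two-level, two-character
layout are hypothetical choices, not anything HDFS requires or recommends.

```python
import hashlib


def sharded_path(root: str, filename: str, levels: int = 2, width: int = 2) -> str:
    """Map a filename to root/<xx>/<yy>/filename using a hex digest prefix.

    Hypothetical helper: the layout (levels, width) is an assumption chosen
    for illustration, not an HDFS convention.
    """
    # Hash the name so files spread evenly across buckets regardless of naming.
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Take consecutive slices of the digest as nested subdirectory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join([root.rstrip("/")] + parts + [filename])


if __name__ == "__main__":
    # The same name always maps to the same bucket, so lookups need no index.
    print(sharded_path("/data", "report-0001.txt"))
```

With two levels of two hex characters there are 65,536 buckets, so a few million files
would average a few dozen per directory. Whether that is worth the extra path depth on
HDFS is exactly what I'm unsure about.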

Does it? Is there a recommended limit on the number of files to store in one directory
on HDFS? I'm thinking thousands to millions, so we're not talking about INT_MAX or anything,
but a lot.

Or is it only limited by my sanity :) ?

I suppose it would come down to the data structure(s) the NameNode uses to track file
metadata. But I don't know what those are - I did skim the HDFS architecture document, but
didn't see anything conclusive.

Take care,

