hadoop-hdfs-user mailing list archives

From Ian Holsman <had...@holsman.net>
Subject Re: How to create a lot files in HDFS quickly?
Date Mon, 30 May 2011 15:50:41 GMT
I don't know what your use case is, but if you need lots of small files you may want to
investigate systems like HBase, Cassandra, or Voldemort.

Ian Holsman - 703 879-3128

I saw the angel in the marble and carved until I set him free -- Michelangelo

On 30/05/2011, at 12:54 AM, Konstantin Boudnik <cos@apache.org> wrote:

> Your best bet would be to take a look at the synthetic load generator.
> 10^8 files would be a problem in most cases because you'd need a
> really beefy NN for that (~48 GB of JVM heap and all that). The biggest I've
> heard about holds something on the order of 1.15*10^8 objects (files & dirs)
> and serves the largest Hadoop cluster in the world, Yahoo!'s production
> setup. You might want to check YDN for more details on this case, I guess.
> Hope it helps,
>  Cos
> On Mon, May 30, 2011 at 10:44AM, ccxixicc wrote:
>>   Hi all
>>   I'm doing a test and need to create lots of files (100 million) in
>>   HDFS. I use a shell script to do this, but it's very, very slow. How can I
>>   create a lot of files in HDFS quickly?
>>   Thanks
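
[A likely reason the shell-script approach above is slow: every `hadoop fs` invocation starts a fresh JVM, so per-file calls pay that startup cost 100 million times. A minimal sketch of one workaround, assuming the `hadoop` CLI is on the PATH (the `/bench` prefix and the 100,000-file count are placeholder choices): generate the target paths locally, then feed them to `hadoop fs -touchz` in large batches so one JVM launch covers many file creations.]

```shell
# Generate the target path list locally (cheap; /bench is a hypothetical prefix).
seq 1 100000 | awk '{print "/bench/file-" $1}' > /tmp/hdfs_paths.txt

# Pass 1000 paths per `hadoop fs -touchz` call so each JVM launch creates
# many files. Guarded so this sketch is a no-op on machines without Hadoop.
if command -v hadoop >/dev/null 2>&1; then
  xargs -n 1000 hadoop fs -touchz < /tmp/hdfs_paths.txt
fi
```

[Even batched, each file creation is still a separate NameNode RPC, which is why the advice above points at the synthetic load generator, and at NN heap sizing, for tests at the 10^8 scale.]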
