incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Storage question
Date Wed, 06 Mar 2013 06:57:56 GMT
Check out the aforementioned astyanax and this http://www.datastax.com/dev/blog/cassandra-file-system-design

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/03/2013, at 1:38 PM, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:

> Thanks for the great explanation.
> 
> Dean
> 
> On 3/4/13 1:44 PM, "Kanwar Sangha" <kanwar@mavenir.com> wrote:
> 
>> Problems with small files and HDFS
>> 
>> A small file is one which is significantly smaller than the HDFS block
>> size (default 64MB). If you're storing small files, then you probably
>> have lots of them (otherwise you wouldn't turn to Hadoop), and the
>> problem is that HDFS can't handle lots of files.
>> 
>> Every file, directory and block in HDFS is represented as an object in
>> the namenode's memory, each of which occupies 150 bytes, as a rule of
>> thumb. So 10 million files, each using a block, would use about 3
>> gigabytes of memory. Scaling up much beyond this level is a problem with
>> current hardware. Certainly a billion files is not feasible.
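The rule-of-thumb arithmetic above can be sketched as a back-of-the-envelope calculation. This assumes the simplest case, where each file occupies exactly one block, so every file contributes two namenode objects (one file entry plus one block entry):

```java
public class NamenodeMemoryEstimate {
    // Rule of thumb from the passage above: each namenode object
    // (file, directory, or block) occupies roughly 150 bytes of heap.
    static final long BYTES_PER_OBJECT = 150;

    /** Estimated namenode heap, in bytes, for n files, each using one block. */
    static long estimateBytes(long numFiles) {
        long objects = numFiles * 2; // one file object + one block object per file
        return objects * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        long files = 10_000_000L;
        double gib = estimateBytes(files) / (1024.0 * 1024.0 * 1024.0);
        // 10 million files -> 3,000,000,000 bytes, i.e. roughly 3 GB of heap
        System.out.printf("%d files -> ~%.1f GiB of namenode heap%n", files, gib);
    }
}
```

Directories and multi-block files add further objects on top of this, so the real figure is a lower bound.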
>> 
>> Furthermore, HDFS is not geared up for efficient access to small files:
>> it is primarily designed for streaming access to large files. Reading
>> through small files normally causes lots of seeks and lots of hopping
>> from datanode to datanode to retrieve each small file, all of which adds
>> up to an inefficient data access pattern.
>> 
>> Problems with small files and MapReduce
>> 
>> Map tasks usually process a block of input at a time (using the default
>> FileInputFormat). If the files are very small and there are a lot of
>> them,
>> then each map task processes very little input, and there are a lot more
>> map tasks, each of which imposes extra bookkeeping overhead. Compare a
>> 1GB file broken into 16 64MB blocks, and 10,000 or so 100KB files. The
>> 10,000 files use one map each, and the job time can be tens or hundreds
>> of times slower than the equivalent one with a single input file.
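The comparison in that paragraph can be checked with a small calculation. This sketch assumes the default split behaviour described above: a large file is split on 64 MB block boundaries, while each small file gets its own map task:

```java
public class MapTaskCount {
    static final long MB = 1024 * 1024;
    static final long BLOCK_SIZE = 64 * MB; // default HDFS block size cited above

    /** Number of map tasks for one large file, split on block boundaries (ceiling division). */
    static long splitsForLargeFile(long fileSizeBytes) {
        return (fileSizeBytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    public static void main(String[] args) {
        long oneGb = 1024 * MB;
        long largeFileMaps = splitsForLargeFile(oneGb); // 1 GB / 64 MB = 16 maps
        long smallFileMaps = 10_000;                    // one map per 100 KB file
        System.out.println("1 GB file: " + largeFileMaps + " map tasks");
        System.out.println("10,000 x 100 KB files: " + smallFileMaps + " map tasks");
    }
}
```

So the same gigabyte of data costs 16 map tasks in one case and 10,000 in the other, which is where the per-task bookkeeping overhead comes from.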
>> 
>> There are a couple of features to help alleviate the bookkeeping
>> overhead: task JVM reuse for running multiple map tasks in one JVM,
>> thereby avoiding some JVM startup overhead (see the
>> mapred.job.reuse.jvm.num.tasks property), and MultiFileInputSplit,
>> which can run more than one split per map.
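The JVM-reuse property mentioned above is set per job in mapred-site.xml (old-style MapReduce property name; a value of -1 means the JVM is reused for an unlimited number of tasks within a job):

```xml
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>-1</value>
</property>
```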
>> 
>> -----Original Message-----
>> From: Hiller, Dean [mailto:Dean.Hiller@nrel.gov]
>> Sent: 04 March 2013 13:38
>> To: user@cassandra.apache.org
>> Subject: Re: Storage question
>> 
>> Well, astyanax I know can simulate streaming into cassandra and disperses
>> the file to multiple rows in the cluster so you could check that out.
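The chunking idea Dean describes can be illustrated with a self-contained sketch (plain Java, no Astyanax dependency; the Map here stands in for a Cassandra column family, and the `name + ":" + index` row-key scheme is a hypothetical one chosen for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class ChunkedStore {
    static final int CHUNK_SIZE = 4; // tiny for illustration; real chunks would be far larger

    /** Split a blob into fixed-size chunks, one "row" per chunk. */
    static Map<String, byte[]> store(String name, byte[] data) {
        Map<String, byte[]> rows = new HashMap<>();
        int chunks = (data.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
        for (int i = 0; i < chunks; i++) {
            int from = i * CHUNK_SIZE;
            int to = Math.min(from + CHUNK_SIZE, data.length);
            byte[] chunk = new byte[to - from];
            System.arraycopy(data, from, chunk, 0, to - from);
            rows.put(name + ":" + i, chunk); // each chunk lands on its own row key
        }
        return rows;
    }

    /** Reassemble the blob by reading chunks back in order. */
    static byte[] load(String name, Map<String, byte[]> rows, int totalLength) {
        byte[] out = new byte[totalLength];
        int pos = 0;
        for (int i = 0; rows.containsKey(name + ":" + i); i++) {
            byte[] chunk = rows.get(name + ":" + i);
            System.arraycopy(chunk, 0, out, pos, chunk.length);
            pos += chunk.length;
        }
        return out;
    }
}
```

Because each chunk has its own row key, the chunks hash to different nodes in the cluster, which is what lets reads and writes of one large object be spread across machines.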
>> 
>> Out of curiosity, why is HDFS not good for a small file size?  For
>> reading, it should be the bomb with RF=3 since you can read from multiple
>> nodes and such.  Writes might be a little slower but still shouldn't be
>> too bad.
>> 
>> Later,
>> Dean
>> 
>> From: Kanwar Sangha <kanwar@mavenir.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Monday, March 4, 2013 12:34 PM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Storage question
>> 
>> Hi - Can someone suggest the optimal way to store files/images? We are
>> planning to use Cassandra for the metadata for these files. HDFS is not
>> good for small file sizes... can we look at something else?
> 

