incubator-cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: Storage question
Date Mon, 04 Mar 2013 21:38:00 GMT
Thanks for the great explanation.


On 3/4/13 1:44 PM, "Kanwar Sangha" <> wrote:

>Problems with small files and HDFS
>A small file is one which is significantly smaller than the HDFS block
>size (default 64MB). If you're storing small files, then you probably
>have lots of them (otherwise you wouldn't turn to Hadoop), and the
>problem is that HDFS can't handle lots of files.
>Every file, directory and block in HDFS is represented as an object in
>the namenode's memory, each of which occupies 150 bytes, as a rule of
>thumb. So 10 million files, each using a block, would use about 3
>gigabytes of memory. Scaling up much beyond this level is a problem with
>current hardware. Certainly a billion files is not feasible.
>Furthermore, HDFS is not geared to accessing small files efficiently:
>it is primarily designed for streaming access of large files. Reading
>through small files normally causes lots of seeks and lots of hopping
>from datanode to datanode to retrieve each small file, all of which is an
>inefficient data access pattern.
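
As a rough check of the namenode arithmetic above, here is a minimal sketch. It assumes the 150-bytes-per-object rule of thumb from the text, counts one object per file plus one per block, and ignores directories; the function name namenode_bytes is made up for illustration.

```python
BYTES_PER_OBJECT = 150  # rule-of-thumb figure quoted in the text, not a measured value


def namenode_bytes(num_files, blocks_per_file=1, bytes_per_object=BYTES_PER_OBJECT):
    """Rough namenode heap estimate: one object per file plus one per block.

    Directories are ignored, so this slightly underestimates real usage.
    """
    objects = num_files * (1 + blocks_per_file)
    return objects * bytes_per_object


ten_million = namenode_bytes(10_000_000)
one_billion = namenode_bytes(1_000_000_000)

print(f"10 million files: {ten_million / 1e9:.1f} GB")
print(f"1 billion files:  {one_billion / 1e9:.1f} GB")
```

Under these assumptions, 10 million single-block files come to about 3 GB of namenode memory and a billion files to about 300 GB, which is why the text calls the latter infeasible.
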
>Problems with small files and MapReduce
>Map tasks usually process a block of input at a time (using the default
>FileInputFormat). If the files are very small and there are a lot of them,
>then each map task processes very little input, and there are a lot more
>map tasks, each of which imposes extra bookkeeping overhead. Compare a
>1GB file broken into sixteen 64MB blocks with 10,000 or so 100KB files. The
>10,000 files use one map each, and the job time can be tens or hundreds
>of times slower than the equivalent one with a single input file.
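
The comparison above can be sketched as a simple count of map tasks. This is only an approximation under stated assumptions: one split per 64MB block, at least one split per non-empty file, and no split-size tuning. The helper name map_tasks is hypothetical and is not part of Hadoop.

```python
import math

BLOCK_SIZE = 64 * 1024 * 1024  # 64MB default block size mentioned in the text


def map_tasks(file_sizes, block_size=BLOCK_SIZE):
    """Approximate map task count: one task per block, per non-empty file."""
    return sum(math.ceil(size / block_size) for size in file_sizes if size > 0)


one_big_file = [1024 * 1024 * 1024]        # a single 1GB file
many_small_files = [100 * 1024] * 10_000   # 10,000 files of 100KB each

print(map_tasks(one_big_file))
print(map_tasks(many_small_files))
```

With these inputs the single 1GB file needs 16 map tasks and the small files need 10,000, even though both inputs hold roughly the same amount of data.
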
>There are a couple of features to help alleviate the bookkeeping
>overhead: task JVM reuse for running multiple map tasks in one JVM,
>thereby avoiding some JVM startup overhead (see the
>mapred.job.reuse.jvm.num.tasks property), and MultiFileInputSplit, which
>can run more than one split per map.
>-----Original Message-----
>From: Hiller, Dean []
>Sent: 04 March 2013 13:38
>Subject: Re: Storage question
>Well, I know Astyanax can simulate streaming into Cassandra and disperses
>the file across multiple rows in the cluster, so you could check that out.
>Out of curiosity, why is HDFS not good for a small file size?  For
>reading, it should be the bomb with RF=3 since you can read from multiple
>nodes and such.  Writes might be a little slower but still shouldn't be
>too bad.
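
The idea of dispersing one file across multiple rows, mentioned above, might look roughly like the sketch below. It is not the Astyanax API: the "file_id:index" row-key format, the chunk size, and the function names chunk_rows and reassemble are all assumptions made for illustration, and a plain list stands in for the cluster.

```python
def chunk_rows(file_id, data, chunk_size=1024 * 1024):
    """Yield (row_key, chunk) pairs, one pair per fixed-size slice of the file."""
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Hypothetical row-key scheme: "<file id>:<chunk index>"
        yield f"{file_id}:{index}", data[offset:offset + chunk_size]


def reassemble(rows):
    """Rebuild the original bytes by ordering chunks on their numeric index."""
    ordered = sorted(rows, key=lambda kv: int(kv[0].rsplit(":", 1)[1]))
    return b"".join(chunk for _, chunk in ordered)


payload = bytes(range(256)) * 10  # 2,560 bytes of sample data
rows = list(chunk_rows("image-42", payload, chunk_size=1000))

print([key for key, _ in rows])
print(reassemble(rows) == payload)
```

Because each chunk gets its own row key, the chunks can land on different nodes, and a reader can fetch them in parallel before joining them back together.
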
>From: Kanwar Sangha <<>>
>Reply-To: "<>"
>Date: Monday, March 4, 2013 12:34 PM
>To: "<>"
>Subject: Storage question
>Hi - Can someone suggest the optimal way to store files/images? We are
>planning to use Cassandra for the metadata for these files. HDFS is not
>good for small files. Can we look at something else?
