incubator-cassandra-user mailing list archives

From Kanwar Sangha <kan...@mavenir.com>
Subject RE: Storage question
Date Mon, 04 Mar 2013 20:44:50 GMT
Problems with small files and HDFS

A small file is one which is significantly smaller than the HDFS block size (default 64MB).
If you're storing small files, then you probably have lots of them (otherwise you wouldn't
turn to Hadoop), and the problem is that HDFS can't efficiently handle large numbers of files.

Every file, directory and block in HDFS is represented as an object in the namenode's memory,
each of which occupies about 150 bytes, as a rule of thumb. So 10 million files, each using
a block, amount to roughly 20 million namenode objects (one file object plus one block object
each) and would use about 3 gigabytes of memory. Scaling up much beyond this level is a
problem with current hardware. Certainly a billion files is not feasible.
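
A quick back-of-the-envelope version of that estimate (plain Java; the 150-bytes-per-object
figure is just the rule of thumb above, not exact namenode accounting):

public class NamenodeMemoryEstimate {
    public static void main(String[] args) {
        long files = 10000000L;      // small files, each fitting in one block
        long bytesPerObject = 150L;  // rule-of-thumb size of a namenode object

        // Each file contributes a file object and a block object to namenode memory.
        long objects = files * 2;
        double gigabytes = objects * bytesPerObject / 1e9;
        System.out.printf("%d objects -> ~%.1f GB of namenode heap%n", objects, gigabytes);
    }
}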

Furthermore, HDFS is not geared up for efficiently accessing small files: it is primarily
designed for streaming access to large files. Reading through small files normally causes
lots of seeks and lots of hopping from datanode to datanode to retrieve each small file,
all of which makes for an inefficient data access pattern.
Problems with small files and MapReduce

Map tasks usually process a block of input at a time (using the default FileInputFormat).
If the file is very small and there are a lot of them, then each map task processes very little
input, and there are a lot more map tasks, each of which imposes extra bookkeeping overhead.
Compare a 1GB file broken into sixteen 64MB blocks with 10,000 or so 100KB files. The 10,000
files use one map each, and the job can be tens or hundreds of times slower than the
equivalent one with a single input file.

There are a couple of features that help alleviate the bookkeeping overhead: task JVM reuse,
which runs multiple map tasks in one JVM and thereby avoids some JVM startup overhead (see
the mapred.job.reuse.jvm.num.tasks property), and MultiFileInputSplit, which can run more
than one split per map.
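
A minimal sketch of wiring those two up with the old mapred API (the property name is the one
mentioned above; the class and the input format named in the comments are illustrative, not a
drop-in job):

import org.apache.hadoop.mapred.JobConf;

public class SmallFileMitigations {
    public static JobConf configure(JobConf conf) {
        // -1 means reuse one task JVM for an unlimited number of map tasks,
        // so thousands of tiny maps don't each pay JVM startup cost.
        conf.setInt("mapred.job.reuse.jvm.num.tasks", -1);

        // To pack many small files into one split, you would also plug in a
        // MultiFileInputFormat (or CombineFileInputFormat) subclass, e.g.
        // conf.setInputFormat(MyCombinedInputFormat.class);  // hypothetical subclass
        return conf;
    }
}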

-----Original Message-----
From: Hiller, Dean [mailto:Dean.Hiller@nrel.gov] 
Sent: 04 March 2013 13:38
To: user@cassandra.apache.org
Subject: Re: Storage question

Well, Astyanax, I know, can simulate streaming into Cassandra and disperses the file across
multiple rows in the cluster, so you could check that out.
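
A rough sketch of that Astyanax chunked-object recipe, from memory -- the column family name,
chunk size, and exact builder calls below are assumptions, so treat it as a starting point
rather than working code:

import java.io.FileInputStream;
import java.io.InputStream;

import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.recipes.storage.CassandraChunkedStorageProvider;
import com.netflix.astyanax.recipes.storage.ChunkedStorage;
import com.netflix.astyanax.recipes.storage.ChunkedStorageProvider;
import com.netflix.astyanax.recipes.storage.ObjectMetadata;
import com.netflix.astyanax.serializers.StringSerializer;

public class FileStore {
    // "file_chunks" is an assumed column family for the chunk data.
    private static final ColumnFamily<String, String> CF_CHUNK =
            new ColumnFamily<String, String>("file_chunks",
                    StringSerializer.get(), StringSerializer.get());

    public static ObjectMetadata store(Keyspace keyspace, String objectName, String path)
            throws Exception {
        ChunkedStorageProvider provider =
                new CassandraChunkedStorageProvider(keyspace, CF_CHUNK);
        InputStream in = new FileInputStream(path);
        try {
            // Stream the file into Cassandra in 64KB chunks; each chunk lands in its
            // own column, so no single row or mutation has to carry the whole blob.
            return ChunkedStorage.newWriter(provider, objectName, in)
                    .withChunkSize(64 * 1024)
                    .call();
        } finally {
            in.close();
        }
    }
}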

Out of curiosity, why is HDFS not good for small file sizes?  For reading, it should be the
bomb with RF=3 since you can read from multiple nodes and such.  Writes might be a little
slower but still shouldn't be too bad.

Later,
Dean

From: Kanwar Sangha <kanwar@mavenir.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, March 4, 2013 12:34 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Storage question

Hi - Can someone suggest the optimal way to store files / images? We are planning to use
Cassandra for the meta-data for these files. HDFS is not good for small file sizes .. can
we look at something else?
