hadoop-common-user mailing list archives

From Alex Kozlov <ale...@cloudera.com>
Subject Re: HDFS without Consideration for Map and Reduce
Date Tue, 06 Jul 2010 21:03:20 GMT
Hi Ananth,

A general approach to this in HDFS is to use Sequence Files or Hadoop Archives.
In layman's terms, you pack many small files into one larger file and
develop your own logic on top of it.  Having said that, you will probably
have to pay a penalty for random access: HDFS was not designed for it.
However, there are other solutions built on top of Hadoop, such as HBase,
that do support this (among many others).
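
To make the "pack a few files into a larger file" idea concrete, here is a
minimal sketch in plain Python. It is only an illustration under my own
assumptions: the record layout (two 4-byte lengths, then name, then data) and
the helper names pack, read_one and scan are made up for this example. This is
not the actual SequenceFile or Hadoop Archive format, and it does not touch
HDFS at all.

```python
import io
import struct


def pack(files, out):
    """Write each (name, data) pair to `out` as a length-prefixed record.

    Returns a hypothetical in-memory index: name -> (offset, length) of the
    data bytes, so that a single file can be fetched later with one seek.
    """
    index = {}
    for name, data in files.items():
        key = name.encode("utf-8")
        # Header: big-endian lengths of the name and of the payload.
        out.write(struct.pack(">II", len(key), len(data)))
        out.write(key)
        index[name] = (out.tell(), len(data))
        out.write(data)
    return index


def read_one(container, index, name):
    """Random access: seek straight to one packed file using the index."""
    offset, length = index[name]
    container.seek(offset)
    return container.read(length)


def scan(container):
    """Sequential access: walk every record from the start of the container."""
    container.seek(0)
    while True:
        header = container.read(8)
        if len(header) < 8:
            return
        key_len, data_len = struct.unpack(">II", header)
        name = container.read(key_len).decode("utf-8")
        yield name, container.read(data_len)


# Example with an in-memory buffer standing in for the large file.
demo_container = io.BytesIO()
demo_index = pack({"a.mp3": b"AAAA", "b.mp3": b"BB"}, demo_container)
demo_payload = read_one(demo_container, demo_index, "b.mp3")
```

The sequential scan is the access pattern that suits HDFS; the indexed lookup
is the "own logic on top" part, and each such lookup costs a seek, which is
where the random-access penalty comes from.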

I know this is very concise, but let me know your business case and I can go
into more detail.


Alex K

On Tue, Jul 6, 2010 at 1:51 PM, Ananth Sarathy

> Yeah, I know I can use a NAS or SAN. I am not really asking what the best
> way to handle this use case is, but rather what the best way to use HDFS
> would be if it had already been decided that HDFS WAS the filesystem you
> were going to use to serve lots of small files.
> sent from my nexus one
> On Jul 6, 2010 3:43 PM, "Patrick Angeles" <patrick@cloudera.com> wrote:
> If all you want is dumb storage for small-ish files, you can always just
> use NAS or SAN.
> For the MP3 example, you might want to consider HBase... you can store
> associated meta-data in column families.
> On Tue, Jul 6, 2010 at 3:33 PM, Ananth Sarathy
> <ananth.t.sarathy@gmail.com> wrote:
> > So I am aware of the problem with small files
> > and I have read this article
> >
> > http://www.cloud...
