hadoop-mapreduce-user mailing list archives

From Sujit Pal <sujit....@comcast.net>
Subject Re: HDFS or Cassandra?
Date Fri, 16 Oct 2009 22:20:17 GMT
Sorry, where I wrote HDFS below, I meant HBase.


On Fri, 2009-10-16 at 14:36 -0700, Sujit Pal wrote:
> Hi,
> I have a situation where I need to "collect" data into some sort of
> common medium from a set of mapreduce jobs, then have another mapreduce
> job "consolidate" these to provide the final result. I was considering
> using some sort of database to store the output of the first stage and
> then read it back (I need to be able to do random access on the keys) in
> the second stage.
> I thought of using HDFS and a colleague suggested Apache Cassandra. Both
> seem to be implementations of BigTable. I read that HDFS is a file
> handle hog, but found no such report on the Cassandra site. Would it be
> preferable, in your opinion, to use one over the other? I suppose I
> should just try them both, but if someone has done this already, I would
> appreciate their input before I do.
> Thanks
> Sujit
