hadoop-mapreduce-user mailing list archives

From Zachary Kozick <z...@omniar.com>
Subject Hadoop / HDFS equivalent but for realtime request handling / small files?
Date Tue, 01 Feb 2011 21:21:04 GMT
Hi all,

I'm interested in creating a solution that leverages multiple computing
nodes in an EC2 or Rackspace cloud environment in order to
do massively parallelized processing in the context of serving HTTP
requests, meaning I want results to be aggregated within 1-4 seconds.

From what I gather, Hadoop is designed for job-oriented tasks, and the
minimum job completion time is 30 seconds.  Also, HDFS is meant for storing
a few large files, as opposed to many small files.

My question: is there a framework similar to Hadoop that is designed more for
on-demand parallel computing?  And is there a technology similar to HDFS that
is better at moving small files around and making them available to slave
nodes on demand?
