hadoop-hdfs-user mailing list archives

From Russell Jurney <russell.jur...@gmail.com>
Subject Re: Can Hadoop replace the use of MQ b/w processes?
Date Sun, 19 Aug 2012 17:27:30 GMT
The model with Hadoop would be to aggregate and write your events to
the Hadoop Distributed File System (HDFS), and then process them with
scheduled batch jobs via Hadoop MapReduce. If your requirements can
tolerate some latency, then Hadoop can work for you. Depending on your
processing, you can schedule jobs down to, say, every hour, half hour,
or fifteen minutes. I'm not aware of anyone scheduling jobs more
frequently than that, but they may be out there. Chime in if you are.
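To make the batch model concrete, here is an illustrative sketch (not any particular tool's implementation) of how events are typically bucketed into time-based windows as they land in HDFS, so a scheduled job can process each closed window; the window size and function names are my own assumptions:

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 15 * 60  # assume a 15-minute batch window

def window_key(epoch_seconds):
    """Map an event timestamp to the start of its 15-minute window
    (this string would typically name an HDFS directory)."""
    start = epoch_seconds - (epoch_seconds % WINDOW_SECONDS)
    return datetime.fromtimestamp(start, tz=timezone.utc).strftime("%Y-%m-%d_%H%M")

def bucket_events(events):
    """Group (timestamp, payload) events by batch window, roughly the way
    an ingest pipeline partitions incoming data for later batch jobs."""
    buckets = defaultdict(list)
    for ts, payload in events:
        buckets[window_key(ts)].append(payload)
    return dict(buckets)
```

A scheduled job would then pick up each completed window directory and process it, which is where the end-to-end latency floor comes from.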

For getting events to HDFS, look at Flume, Kafka, and Scribe. For
processing events, look at Pig, Hive, and Cascading. For scheduling
jobs, look at Oozie and Azkaban.
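On the ordering requirement in the original question: systems like Kafka address it by routing every message with the same key to the same partition, so related messages keep their relative order. A minimal sketch of that idea (illustrative only, not Kafka's actual partitioner; the partition count and keys are assumptions):

```python
import zlib

NUM_PARTITIONS = 4  # assumed partition count for illustration

def partition_for(key):
    """Deterministically map a message key to one partition, so all
    related messages land in the same place."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS

def dispatch(messages):
    """Append each (key, payload) to its partition's log; within a
    partition, and therefore within a key, order is preserved."""
    partitions = {p: [] for p in range(NUM_PARTITIONS)}
    for key, payload in messages:
        partitions[partition_for(key)].append((key, payload))
    return partitions
```

This is the same guarantee described below with one queue per group of related messages, just expressed as key-based partitioning.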

Russell Jurney http://datasyndrome.com

On Aug 19, 2012, at 9:47 AM, Robert Nicholson
<robert.nicholson@gmail.com> wrote:

> We have an application, or a series of applications, that listen to incoming feeds and
> then distribute this data in XML form to a number of queues. Another set of processes listens
> to these queues and processes the messages. Order of processing is important insofar as related
> messages need to be processed in sequence; hence, today all related messages go to the same
> queue and are processed by the same queue consumer.
>
> The idea would be to replace the use of MQ with some kind of reliable distributed dispatch.
> Does Hadoop provide that?
