hadoop-common-user mailing list archives

From Ariel Rabkin <asrab...@gmail.com>
Subject Re: HDFS as a logfile ??
Date Mon, 13 Apr 2009 14:37:39 GMT
Chukwa is a Hadoop subproject aiming to do something similar, though
particularly for the case of Hadoop logs.  You may find it useful.

Hadoop unfortunately does not support concurrent appends.  As a
result, the Chukwa project found itself creating a whole new daemon,
the Chukwa collector, precisely to merge the event streams and write
them out, just once. We're set to do a release within the next week or
two, but in the meantime you can check it out from SVN at
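
To make the single-writer idea concrete, here is a rough sketch of the pattern in Python. It is not Chukwa code: the names `collector`, `run_demo`, and `sink` are made up for illustration, and `sink` is just any object with a `write` method standing in for one open HDFS output stream. Many producers hand events to a shared queue, and exactly one thread ever writes to the file, so no concurrent append is needed.

```python
import queue
import threading

# Marker object telling the collector that no more events will arrive.
_SENTINEL = object()


def collector(events, sink):
    """Single writer: drain the shared queue and write each event to sink."""
    while True:
        item = events.get()
        if item is _SENTINEL:
            break
        sink.write(item + "\n")


def run_demo(sink, n_producers=3, events_per_producer=4):
    """Start several producer threads and one collector thread (hypothetical demo)."""
    events = queue.Queue()  # thread-safe, so producers need no extra locking
    writer = threading.Thread(target=collector, args=(events, sink))
    writer.start()

    def produce(source_id):
        # Each producer stands in for one application instance emitting log events.
        for i in range(events_per_producer):
            events.put(f"source={source_id} event={i}")

    producers = [
        threading.Thread(target=produce, args=(s,)) for s in range(n_producers)
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()

    # All producers are done; tell the collector to stop, then wait for it.
    events.put(_SENTINEL)
    writer.join()
```

Calling `run_demo(io.StringIO())` leaves the merged stream in the buffer. The interleaving across sources varies from run to run, but each source's events stay in the order that source emitted them, since the queue is FIFO and only one thread writes.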


On Fri, Apr 10, 2009 at 12:06 AM, Ricky Ho <rho@adobe.com> wrote:
> I want to analyze the traffic pattern and statistics of a distributed application.  I
> am thinking of having the application write the events as log entries into HDFS and then later
> I can use a Map/Reduce task to do the analysis in parallel.  Is this a good approach ?
> In this case, does HDFS support concurrent write (append) to a file ?  Another question
> is whether the write API thread-safe ?
> Rgds,
> Ricky

Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department
