hadoop-common-user mailing list archives

From Jerome Boulon <jbou...@yahoo-inc.com>
Subject Re: To Compute or Not to Compute on Prod
Date Fri, 31 Oct 2008 20:55:31 GMT
We have deployed a new monitoring system, Chukwa
(http://wiki.apache.org/hadoop/Chukwa), that does exactly that.
This system also provides an easy way to post-process your log files and
extract useful information using M/R.
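
To make "post-process your log files using M/R" a bit more concrete, here is a rough sketch of a map step and a reduce step that count log lines per severity level. It is not Chukwa's API. The log format, the level names, and the function names (map_line, reduce_counts, run_local) are all assumptions made up for illustration, and the job is run in-process rather than on a cluster.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

# Assumed log format: a severity keyword appears somewhere in each line.
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARN|ERROR|FATAL)\b")


def map_line(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (level, 1) for a line that contains a known level."""
    match = LEVEL_RE.search(line)
    if match:
        yield match.group(1), 1


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce step: sum the emitted counts for each level."""
    totals: Counter = Counter()
    for level, count in pairs:
        totals[level] += count
    return dict(totals)


def run_local(lines: Iterable[str]) -> Dict[str, int]:
    """Run map then reduce in-process, as a stand-in for a real M/R job."""
    return reduce_counts(pair for line in lines for pair in map_line(line))


if __name__ == "__main__":
    sample = [
        "2008-10-31 20:55:31 INFO request served",
        "2008-10-31 20:55:32 ERROR database timeout",
        "2008-10-31 20:55:33 INFO request served",
    ]
    print(run_local(sample))
```

On a real cluster the same two steps would read from and write to HDFS; the local runner only shows the shape of the computation.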


On 10/31/08 1:46 PM, "Norbert Burger" <norbert.burger@gmail.com> wrote:

> What are you using to "stream logs into the HDFS"?
> If the command-line tools (i.e., "hadoop dfs -put") work for you, then all you
> need is a Hadoop install.  Your production node doesn't need to be a
> datanode.
> On Fri, Oct 31, 2008 at 2:35 PM, shahab mehmandoust <shahab53@gmail.com> wrote:
>> I want to stream data from logs into the HDFS in production but I do NOT
>> want my production machine to be a part of the computation cluster.  The
>> reason I want to do it in this way is to take advantage of HDFS without
>> putting computation load on my production machine.  Is this possible?
>> Furthermore, is this unnecessary because the computation would not put a
>> significant load on my production box (obviously depends on the map/reduce
>> implementation but I'm asking in general)?
>> I should note that our prod machine hosts our core web application and
>> database (saving up for another box :-).
>> Thanks,
>> Shahab
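
For the command-line route suggested in the thread, a minimal sketch of pushing a log file into HDFS from a box that only has a Hadoop install (and is not a datanode) might look like the following. It assumes the "hadoop" launcher is on the PATH and that the client configuration already points at the cluster. The helper names (build_put_command, put_log) and the example paths are hypothetical.

```python
import shlex
import subprocess
from typing import List


def build_put_command(local_path: str, hdfs_path: str,
                      hadoop_bin: str = "hadoop") -> List[str]:
    """Return the argv that copies one local file into HDFS."""
    return [hadoop_bin, "dfs", "-put", local_path, hdfs_path]


def put_log(local_path: str, hdfs_path: str,
            hadoop_bin: str = "hadoop", dry_run: bool = True) -> str:
    """Build the put command and, unless dry_run is set, execute it.

    Returns the shell-quoted command line so it can be logged or inspected.
    """
    cmd = build_put_command(local_path, hdfs_path, hadoop_bin)
    if not dry_run:
        # check=True raises CalledProcessError if the upload fails.
        subprocess.run(cmd, check=True)
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    # Hypothetical paths; dry_run=True only prints the command.
    print(put_log("/var/log/webapp/access.log", "/logs/webapp/access.log"))
```

Run from cron on the production box, something like this only costs the I/O of the upload; no map or reduce tasks are scheduled there because the machine is a client, not a cluster node.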
