chukwa-dev mailing list archives

From "Jiaqi Tan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-306) Standalone (non-daemon) Chukwa operation
Date Thu, 18 Jun 2009 20:52:07 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721480#action_12721480 ]

Jiaqi Tan commented on CHUKWA-306:
----------------------------------

Sounds good, but will there be port conflicts if multiple users try to run Chukwa on the
same host, e.g. on a shared gateway machine of a shared Hadoop cluster? My main target
audience for a first "public trial" is M45 users (CMU or otherwise, since M45 is being
opened up), and there are only a handful of gateway machines. Can the ant process also
randomize ports and automatically fill in the config files with the chosen port numbers?
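
To make the port question above concrete, here is a minimal sketch of one way a build or launch step could choose unused ports and write them into a config file. It is not Chukwa's or ant's actual mechanism: the function names and the config keys (`collector.port`, `agent.control.port`) are hypothetical and exist only for illustration. The technique is to bind to port 0 so the operating system picks a free port.

```python
import socket


def pick_free_ports(count, host="127.0.0.1"):
    """Ask the OS for `count` distinct free TCP ports by binding to port 0."""
    sockets = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind((host, 0))  # port 0 lets the kernel choose an unused port
            sockets.append(s)
        # All sockets stay open until every port is chosen, so none repeats.
        return [s.getsockname()[1] for s in sockets]
    finally:
        for s in sockets:
            s.close()


def render_config(template, ports):
    """Substitute named port placeholders into a config template string."""
    return template.format(**ports)


# Hypothetical keys and placeholders; not actual Chukwa configuration names.
TEMPLATE = (
    "collector.port={collector_port}\n"
    "agent.control.port={agent_control_port}\n"
)

if __name__ == "__main__":
    collector, agent = pick_free_ports(2)
    print(render_config(
        TEMPLATE,
        {"collector_port": collector, "agent_control_port": agent},
    ))
```

One caveat with this approach: the ports are released before the real process binds them, so another user on the same gateway could in principle grab one in between. A launcher would need to retry on a bind failure.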

> Standalone (non-daemon) Chukwa operation
> ----------------------------------------
>
>                 Key: CHUKWA-306
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-306
>             Project: Hadoop Chukwa
>          Issue Type: Wish
>            Reporter: Jiaqi Tan
>            Priority: Critical
>
> This is an articulation of a possible alternative use of Chukwa as a standalone log analysis
pipeline. This would enable users to read in existing logs from files, process (Demux) and
analyze them (e.g. with the current SALSA/Mochi toolchain), and visualize them, without
requiring the user to set up or run any daemons or database servers.
> This can be presented as an alternative interface to Chukwa for the user, where the main
architectural parts (Chunks, post-Demux SequenceFiles of ChukwaRecords, post-Demux-processing
SequenceFiles of ChukwaRecords, and finally time-aggregated database entries for fast visualization)
remain unchanged, and Chukwa manifests as a set of files in HDFS. The main value that Chukwa
then provides to users is (1) a centralized one-stop shop for log processing, analysis, and
anomaly detection, and (2) the ability to use MapReduce to process logs, regardless of whether
the users had used Chukwa to collect those logs.
> That way, the ability to process logs and perform analysis or diagnosis is not tied to
running the entire Chukwa daemon infrastructure, since many users of Hadoop clusters may
not have superuser access to those machines, e.g. users at universities using shared clusters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

