chukwa-dev mailing list archives

From "Jiaqi Tan (JIRA)" <>
Subject [jira] Created: (CHUKWA-306) Standalone (non-daemon) Chukwa operation
Date Wed, 17 Jun 2009 22:14:07 GMT
Standalone (non-daemon) Chukwa operation

                 Key: CHUKWA-306
             Project: Hadoop Chukwa
          Issue Type: Wish
            Reporter: Jiaqi Tan
            Priority: Critical

This describes a possible alternative use of Chukwa as a standalone log analysis pipeline.
It would let users read in existing logs from files, process them (Demux), analyze them
(e.g. with the current SALSA/Mochi toolchain), and visualize the results, without having to
set up or run any daemons or database servers.

This can be presented to users as an alternative interface to Chukwa in which the main architectural
parts (Chunks, post-Demux SequenceFiles of ChukwaRecords, post-Demux-processing SequenceFiles
of ChukwaRecords, and finally time-aggregated database entries for fast visualization) remain
unchanged, and Chukwa manifests as a set of files in HDFS. The main value that Chukwa then
provides to users is 1. a centralized one-stop shop for log processing, analysis, and anomaly
detection, and 2. the ability to use MapReduce to process logs, regardless of whether Chukwa
was used to collect them.
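
To make the stages above concrete, here is a minimal, daemon-free sketch of the flow in plain
Python. It does not use any Chukwa API: the function names (parse_line, demux,
aggregate_per_minute), the record layout, and the assumed log line format
("YYYY-MM-DD HH:MM:SS,mmm LEVEL message") are all hypothetical stand-ins for illustration.
A real implementation would presumably read and write SequenceFiles of ChukwaRecords instead
of in-memory lists.

```python
from collections import Counter
from datetime import datetime


def parse_line(line):
    """Hypothetical Demux stand-in: turn one raw log line into a record, or None."""
    parts = line.strip().split(" ", 3)
    if len(parts) < 4:
        return None  # not enough fields to be a log entry
    date, time, level, message = parts
    try:
        timestamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        return None  # timestamp did not match the assumed format
    return {"timestamp": timestamp, "level": level, "message": message}


def demux(lines):
    """Parse raw lines into records, dropping lines that cannot be parsed."""
    return [rec for rec in map(parse_line, lines) if rec is not None]


def aggregate_per_minute(records):
    """Count records per (minute, level), a stand-in for time-aggregated entries."""
    counts = Counter()
    for rec in records:
        minute = rec["timestamp"].replace(second=0, microsecond=0)
        counts[(minute, rec["level"])] += 1
    return counts


if __name__ == "__main__":
    # Sample lines are made up for this sketch; in practice they would come from files.
    sample = [
        "2009-06-17 22:14:07,123 INFO org.example.Task: started",
        "2009-06-17 22:14:45,001 ERROR org.example.Task: failed",
        "2009-06-17 22:15:02,500 INFO org.example.Task: retry",
        "an unparseable line",
    ]
    for (minute, level), n in sorted(aggregate_per_minute(demux(sample)).items()):
        print(minute.isoformat(), level, n)
```

Each stage consumes only the output of the previous one, so in principle any stage could be
run on files that were collected without Chukwa.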

That way, the ability to process logs and diagnose problems does not depend on running
the entire Chukwa daemon infrastructure. This matters because many users of Hadoop clusters
do not have superuser access to those machines, e.g. users at universities using shared clusters.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
