chukwa-dev mailing list archives

From "Eric Yang (JIRA)" <>
Subject [jira] [Commented] (CHUKWA-678) Make use of ChukwaWriter in agent
Date Sat, 05 Jan 2013 18:42:12 GMT


Eric Yang commented on CHUKWA-678:

The list of events to send should be a ring buffer.  If sending still fails after multiple
retries, the data should be discarded.  The failure can be recorded either by injecting an
error entry back into the ring buffer or by logging it locally.  For handling pipeline failure,
our general rule of thumb is to throw an exception back to the agent when any one of the
writers fails to commit.  The chunk will then be retried, which means multiple data sinks can
receive duplicated data.  We have a unique sequence number in our metadata, so de-duplication
can happen synchronously (in the writer) or asynchronously (in an out-of-band MapReduce
process).  We return a single commit status from the pipeline writer instead of sending a list
of results back to the agent.  This ensures that retry and de-dupe logic can be implemented
correctly.
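
A minimal sketch of the ring buffer with bounded retries described above, under stated
assumptions: ChunkSink, SendRingBuffer and every method name here are hypothetical and are not
part of Chukwa's API. Chunks are plain strings, and a failure is recorded in a local list
rather than injected back into the ring.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical destination; stands in for whatever actually ships a chunk. */
interface ChunkSink {
    void send(String chunk) throws Exception;
}

/** Fixed-capacity buffer of pending chunks with a per-chunk retry limit. */
final class SendRingBuffer {
    private static final class Entry {
        final String chunk;
        int attempts;
        Entry(String chunk) { this.chunk = chunk; }
    }

    private final ArrayDeque<Entry> ring = new ArrayDeque<>();
    private final int capacity;
    private final int maxAttempts;
    // Assumption: dropped chunks are "logged locally" by keeping them in this list.
    private final List<String> discarded = new ArrayList<>();

    SendRingBuffer(int capacity, int maxAttempts) {
        if (capacity < 1 || maxAttempts < 1) {
            throw new IllegalArgumentException("capacity and maxAttempts must be >= 1");
        }
        this.capacity = capacity;
        this.maxAttempts = maxAttempts;
    }

    /** Adds a chunk; when the buffer is full, the oldest pending chunk is dropped. */
    void offer(String chunk) {
        if (ring.size() == capacity) {
            discarded.add(ring.pollFirst().chunk);
        }
        ring.addLast(new Entry(chunk));
    }

    /**
     * Tries each chunk that was pending at call time exactly once.
     * A failed chunk is requeued at the tail until it has failed maxAttempts
     * times, after which it is discarded. Returns the number of chunks sent.
     */
    int drain(ChunkSink sink) {
        int sent = 0;
        for (int i = ring.size(); i > 0; i--) {
            Entry e = ring.pollFirst();
            try {
                sink.send(e.chunk);
                sent++;
            } catch (Exception ex) {
                e.attempts++;
                if (e.attempts >= maxAttempts) {
                    discarded.add(e.chunk);
                } else {
                    ring.addLast(e);
                }
            }
        }
        return sent;
    }

    int pending() { return ring.size(); }

    List<String> discarded() { return discarded; }
}
```

Requeueing at the tail keeps one slow chunk from blocking the ones behind it, at the cost of
ordering; a sender that must preserve order would stop the pass at the first failure instead.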
> Make use of ChukwaWriter in agent
> ---------------------------------
>                 Key: CHUKWA-678
>                 URL:
>             Project: Chukwa
>          Issue Type: Sub-task
>          Components: Data Collection
>         Environment: MacOSX, Java 6
>            Reporter: shreyas subramanya
> The Chukwa agent sends out data chunks to various destinations through the combination
> of Connector and ChukwaSender interfaces. For sending chunks to the collector, we have HTTP
> implementations of these interfaces. The collector writes out the received chunks to various
> destinations through classes implementing the ChukwaWriter interface. Optionally, multiple
> destinations can be chosen by specifying PipelineStageWriter.
> The proposal is to:
> 1. Use ChukwaWriter to send out data chunks to multiple destinations from the agent.
> Further, PipelineStageWriter can be made the default and the pipeline configuration specified
> in the agent config file.
> 2. Implement (or modify) Pipelineable writers for HBase, Http, Hdfs and WebHdfs
> 3. Do away with the Connector interface and have a single non-configurable connector
> object as part of the agent. This class initiates the configured writer, waits for data chunks
> and passes the chunks to Writer.add()/send(). The connection protocol for each destination is
> handled by the init() of the individual writers.
> Considerations:
> 1. In case of Pipelineable writers, we need a way to merge the results of each pipeline
> stage before the agent commits the chunk.
> 2. Handling pipeline failure
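
The two considerations above could be handled roughly as follows. This is a sketch under
stated assumptions, not Chukwa's implementation: SeqChunk, StageWriter, DedupingStage and
SingleStatusPipeline are hypothetical names, and the real ChukwaWriter interface differs.
The pipeline merges all stage results into one outcome (it returns normally only if every
stage committed), and each stage drops sequence numbers it has already committed, so a retry
of the whole chunk does not duplicate data in the stages that succeeded the first time.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical chunk carrying the unique sequence number from its metadata. */
final class SeqChunk {
    final long seqId;
    final String data;
    SeqChunk(long seqId, String data) {
        this.seqId = seqId;
        this.data = data;
    }
}

/** Hypothetical pipeline stage; throws if the chunk could not be committed. */
interface StageWriter {
    void add(SeqChunk chunk) throws Exception;
}

/** Synchronous de-dupe in the writer: skips sequence numbers already committed. */
final class DedupingStage implements StageWriter {
    private final StageWriter delegate;
    // Assumption: kept in memory and unbounded; a real writer would need to bound or persist it.
    private final Set<Long> committed = new HashSet<>();

    DedupingStage(StageWriter delegate) { this.delegate = delegate; }

    @Override
    public void add(SeqChunk chunk) throws Exception {
        if (committed.contains(chunk.seqId)) {
            return; // duplicate delivered by a retry
        }
        delegate.add(chunk); // if this throws, the seqId stays uncommitted
        committed.add(chunk.seqId);
    }
}

/** Runs every stage and reports a single commit status to the caller. */
final class SingleStatusPipeline {
    private final List<StageWriter> stages;

    SingleStatusPipeline(List<StageWriter> stages) { this.stages = stages; }

    /** Returns normally only if all stages committed; otherwise throws one exception. */
    void add(SeqChunk chunk) throws Exception {
        Exception first = null;
        for (StageWriter stage : stages) {
            try {
                stage.add(chunk);
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e); // keep later failures for diagnosis
                }
            }
        }
        if (first != null) {
            throw new Exception("pipeline commit failed for seqId " + chunk.seqId, first);
        }
    }
}
```

Continuing past a failed stage is a design choice in this sketch; stopping at the first
failure would also satisfy the single-status rule, since the agent retries the chunk either way.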
