chukwa-dev mailing list archives

From "shreyas subramanya (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CHUKWA-707) Replace Chukwa collector with Apache Kafka
Date Tue, 22 Jul 2014 23:58:38 GMT

     [ https://issues.apache.org/jira/browse/CHUKWA-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shreyas subramanya updated CHUKWA-707:
--------------------------------------

    Attachment: CHUKWA-707.patch1

I have uploaded a first patch for the Kafka integration for review. It currently
uses Kafka in place of the in-memory chunk queue in Chukwa. The flow is as
follows:
Adaptor -> KafkaQueue -> KafkaBroker
KafkaConnector -> multiple KafkaConsumer threads -> PipelineWriters
(each KafkaConsumer sets up a pipeline)
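
As a rough illustration of this flow, the sketch below replaces the Kafka broker with an in-memory queue and each pipeline writer with a shared list. The class name FlowSketch, the POISON end marker, and the run method are hypothetical and are not taken from the patch; the real KafkaQueue and KafkaConnector talk to a Kafka cluster instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the flow; not the code in CHUKWA-707.patch1.
class FlowSketch {
    // Marker telling a consumer thread to stop (assumption for this sketch).
    static final String POISON = "__END__";

    // Producer side pushes chunks to the "broker"; several consumer
    // threads drain it and hand each chunk to their "pipeline".
    static List<String> run(List<String> chunks, int consumerThreads) {
        BlockingQueue<String> broker = new LinkedBlockingQueue<>();
        List<String> written = new CopyOnWriteArrayList<>();
        List<Thread> consumers = new ArrayList<>();

        for (int i = 0; i < consumerThreads; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String chunk = broker.take();
                        if (chunk.equals(POISON)) {
                            return;
                        }
                        written.add(chunk); // stands in for a pipeline writer
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            consumers.add(t);
        }

        try {
            for (String chunk : chunks) {
                broker.put(chunk); // adaptor -> queue -> broker
            }
            for (int i = 0; i < consumerThreads; i++) {
                broker.put(POISON); // one end marker per consumer
            }
            for (Thread t : consumers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running sketch", e);
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> out = run(Arrays.asList("chunk-1", "chunk-2", "chunk-3"), 2);
        System.out.println(out.size() + " chunks written");
    }
}
```

The order in which chunks reach the writers is not fixed once more than one consumer thread is running.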

The following configuration is needed:
 conf/chukwa-agent-conf.xml
  -> chukwaAgent.chunk.queue = org.apache.hadoop.chukwa.datacollection.agent.KafkaQueue
     (this sets up the Kafka producer)
  -> chukwa.agent.connector = org.apache.hadoop.chukwa.datacollection.connector.kafka.KafkaConnector
     (this sets up the Kafka consumer)
 conf/consumer.properties
 conf/producer.properties
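
For reference, the two agent settings might look like this in conf/chukwa-agent-conf.xml, assuming the file uses the usual Hadoop-style property layout. Only the property names and values come from this message; the surrounding elements are an assumption, and the contents of the two .properties files are not shown because the message does not list them.

```xml
<configuration>
  <!-- Sets up the Kafka producer -->
  <property>
    <name>chukwaAgent.chunk.queue</name>
    <value>org.apache.hadoop.chukwa.datacollection.agent.KafkaQueue</value>
  </property>
  <!-- Sets up the Kafka consumer -->
  <property>
    <name>chukwa.agent.connector</name>
    <value>org.apache.hadoop.chukwa.datacollection.connector.kafka.KafkaConnector</value>
  </property>
</configuration>
```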

Each data type will be a new topic in Kafka.

I am working on improving the following areas:
1. Partitioning the topics so that we can have parallelism in a consumer group
2. Making the key format configurable
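
To illustrate why partitioning matters for parallelism, here is a minimal sketch of key-based partition selection. It assumes the topic name equals the data type and uses a plain hash of the key; both choices, and the names TopicPartitionSketch, topicFor and partitionFor, are hypothetical. This is not Kafka's own partitioner and not the code in the patch.

```java
// Hypothetical sketch: one topic per data type, partition chosen from a key.
class TopicPartitionSketch {

    // Assumption: the Chukwa data type is used directly as the topic name.
    static String topicFor(String dataType) {
        return dataType;
    }

    // Map a message key to one of numPartitions partitions.
    // floorMod keeps the result non-negative even for negative hash codes.
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        // Example key format (assumed): the source host name.
        String key = "host-a.example.org";
        System.out.println(topicFor("HadoopLog") + " partition "
                + partitionFor(key, 4));
    }
}
```

Messages with the same key always land in the same partition, so each consumer in a group can read its own partitions in parallel with the others. A configurable key format would change which messages are grouped together.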

> Replace Chukwa collector with Apache Kafka
> ------------------------------------------
>
>                 Key: CHUKWA-707
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-707
>             Project: Chukwa
>          Issue Type: New Feature
>            Reporter: Eric Yang
>            Assignee: shreyas subramanya
>         Attachments: CHUKWA-707.patch1
>
>
> The Chukwa collector has not evolved since 2010.  Newer frameworks offer better
> message queue features, and Apache Kafka looks like a good replacement for the Chukwa collector.
> The Chukwa agent can implement a connector to Apache Kafka to replace the Chukwa collector, and
> an HBase consumer to write data to HBase.  The HICC REST API would change to use the new HBase storage format.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
