flume-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason J. W. Williams" <jasonjwwilli...@gmail.com>
Subject Kafka Sink random partition assignment
Date Mon, 06 Jun 2016 23:32:29 GMT
Hi,

New to flume and I'm trying to relay log messages received over netcat
source to Kafka sink.

Everything seems to be fine, except that Flume is acting like it IS
assigning a partition key to the produced messages though none is assigned.
I'd like the messages to be assigned to a random partition, so that
consumers are load balanced.

* Flume 1.6.0
* Kafka 0.9.0.1

Flume config:
https://gist.github.com/williamsjj/8ae025906955fbc4b5f990e162b75d7c

Kafka topic config: kafka-topics --zookeeper localhost/kafka --create
--topic activity.history --partitions 20 --replication-factor 1

Python consumer program:
https://gist.github.com/williamsjj/9e67287f0154816c3a733a39ad008437

Test program (publishes to Flume):
https://gist.github.com/williamsjj/1eb097a187a3edb17ec1a3913e47e58b

Flume agent listens on 3132tcp for connections, and publishes messages
received to the Kafka activity.history topic.  I'm running two instances of
the Python consumer.

What happens however, is all logs messages get sent to a single Kafka
consumer...if I restart Flume (leave consumers running) and re-run the
test, all messages get published to the other consumer. So it feels like
Flume is assigning a permanent partition key even though one is not defined
(and should therefore be random).

Any advice is greatly appreciated.

-J

Mime
View raw message