kafka-users mailing list archives

From Steve Miller <st...@idrathernotsay.com>
Subject Using the kafka dissector in wireshark/tshark 1.12
Date Tue, 12 Aug 2014 18:03:58 GMT
   I'd seen references to there being a Kafka protocol dissector built into wireshark/tshark
1.12, but what I could find on that was a bit light on the specifics as to how to get it to
do anything -- at least for someone (like me) who might use tcpdump a lot but who doesn't
use tshark a lot.

   I got this working, so I figured I'd post a few pointers here on the off-chance that they
save someone else a bit of time.

   Note that I'm using tshark, not wireshark; this might be easier and/or different in wireshark,
but I don't feel like moving many gigabytes of data to a place where I can use wireshark.

   If you're reading traffic live, you'll want to do something like this:

	tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -f 'dst port 9092' -Y (kafka display filter)

   For example, if you want to see output only for ProduceRequests and ProduceResponses, and
only for the topic "mytopic", you can do:

	tshark -V -i eth1 -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -f 'dst port 9092' -Y 'kafka.topic_name==mytopic && kafka.request_key==0'
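
   If you find yourself typing these filters a lot, a tiny shell helper can cut down on quoting mistakes.  This is only a sketch: kafka_filter is a name I made up, and it assumes nothing beyond the two fields used in the example above (kafka.topic_name and kafka.request_key).

```shell
# Hypothetical helper: print a display filter for one topic and one request key.
# Usage: kafka_filter TOPIC REQUEST_KEY   (request key 0 is Produce, as above)
kafka_filter() {
    printf 'kafka.topic_name==%s && kafka.request_key==%s\n' "$1" "$2"
}
```

   You would then pass the result to -Y, e.g. -Y "$(kafka_filter mytopic 0)".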

   You can get a complete list of Kafka-related fields by doing:

	tshark -G fields | grep -i kafka
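
   The grep above also prints the descriptive columns.  If you only want the names you can use in a filter, something like the following might help.  It is a sketch that assumes the usual tab-separated "-G fields" layout, where field lines start with "F" and the third column is the filter name; kafka_field_names is a made-up name.

```shell
# Hypothetical helper: read "tshark -G fields" output on stdin and print only
# the filter names (third tab-separated column) of fields starting with "kafka.".
kafka_field_names() {
    awk -F '\t' '$1 == "F" && $3 ~ /^kafka\./ { print $3 }'
}
```

   Usage would be: tshark -G fields | kafka_field_names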

   There is a very significant downside to processing packets live: tshark uses dumpcap to
generate the actual packets, and unless I'm missing some obscure tshark option (which is possible!)
it won't toss old data.  So if you run this for a few hours, you'll end up with a ginormous
file.

   By default (under Linux, at least) tshark is going to put that file in /tmp, so if your
/tmp is small and/or a tmpfs that can make things a little exciting.  You can get around that
by doing:

	(export TMPDIR=/big/damn/filesystem ; tshark bla bla bla)

which I figure given typical Kafka data volumes is probably pretty important to know, and
which doesn't seem to be documented in the tshark man pages.  It is at least not all that
hard to search for.
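
   Before pointing TMPDIR somewhere, it may be worth checking how much room is actually there.  A minimal sketch, assuming a POSIX df; free_kb is a made-up name, and any threshold you compare against is your call.

```shell
# Hypothetical helper: print the free space, in kilobytes, on the filesystem
# that holds directory $1.  "df -P" keeps each filesystem on a single line, so
# line 2, column 4 is the "Available" figure.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example (commented out): only start tshark if at least ~10 GB is free.
# [ "$(free_kb /big/damn/filesystem)" -gt 10485760 ] &&
#     (export TMPDIR=/big/damn/filesystem ; tshark bla bla bla)
```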

   In theory, you can use the tshark "-b" option to specify a ring buffer of files, even for
real-time processing, though:

	* adding -b anything (e.g., "-b files:1 -b filesize:1024") seems to want to force you to
use -w (filename)

	* just adding -b and -w to the invocation above gets a warning about display filters not
being supported when capturing and saving packets

	* changing -Y to -2 -R and/or adding -P doesn't seem to help

(though again someone with more tshark experience might know the magic combination of arguments
to get this to do what it's told).

   So instead, you can capture packets somewhere, e.g.:

	tcpdump -n -s 0 -w /var/tmp/kafka.tcpd -i eth1 'port 9092'

and then decode them later:

	tshark -V -r /var/tmp/kafka.tcpd -o 'kafka.tcp.port:9092' -d tcp.port=9092,kafka -R 'kafka.topic_name==mytopic && kafka.request_key==0' -2
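
   If the capture ends up spread over several files (for instance because you had tcpdump rotate its output with -C), a small loop can run the same decode over each one.  This is a sketch, not something from the tshark docs: decode_kafka and the TSHARK override are made-up names, and the tshark arguments are just the ones from the command above.

```shell
# Hypothetical helper: decode each capture file with the same display filter.
# Usage: decode_kafka FILTER FILE...
# Set TSHARK=echo to preview the commands instead of running tshark.
decode_kafka() {
    filter=$1
    shift
    for f in "$@"; do
        "${TSHARK:-tshark}" -V -r "$f" -o 'kafka.tcp.port:9092' \
            -d tcp.port=9092,kafka -2 -R "$filter"
    done
}
```

   For example: decode_kafka 'kafka.request_key==0' /var/tmp/kafka.tcpd*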

   Anyway, if you're seeing protocol-related weirdness, hopefully this will be at least of
some help to you.

	(Yes, the email address is a joke.  Just not on you!  It does work.)
