storm-user mailing list archives

From Ruhollah Farchtchi <>
Subject Re: questions about multilang bolt's STDIN&STDOUT
Date Wed, 08 Jan 2014 19:57:28 GMT
I have had to do this for image data and per Antonio’s suggestion I am encoding and decoding
my byte-array into base64. I’m using the clojure DSL and I’ve found it to be fairly performant
(we have more optimizing on our image processing side to do). 

Ruhollah Farchtchi

On Jan 8, 2014, at 1:55 PM, Antonio Verardi <> wrote:

> Hi,
> I am extensively using the multilang interface for Python. JSON is the way you serialize
things for communication. It adds a fair amount of overhead, but it is a reasonable design
choice in terms of a multilang interface.
> If your question is: can I read byte array messages from a bolt (made up by command,
id, stream, task and tuple), the answer is "that's not that easy, you should implement something
in order to do that".
> If your question is: can I serialize byte arrays in JSON with Python and use them as
"values" for the field "tuple", the answer is: "yes, even though JSON always produces string
objects". []. You may want to modify, in order to do that, or simply encode and decode your data within your own bolt;
it depends on your needs.
> This is something I found just googling about encoding binary data in JSON:
> I hope it was what you were looking for,
> Antonio Uccio Verardi
> On Tue, Jan 7, 2014 at 11:24 PM, churly lin <> wrote:
> Hi all,
> I am trying to write a topology with a KafkaSpout and a ShellBolt (implemented in Python).
> According to the multilang protocol, multilang uses JSON messages over stdin/stdout to
communicate with the subprocess. Specifically, both ends of this protocol use a line-reading
mechanism. Does it mean that, in multilang, we cannot emit a message as a byte array? If not,
how do I read a byte array tuple in a Python bolt?
> The JSON read by the Python bolt looks like this:
> {
>     "command": "emit",
>     // The id for the tuple. Leave this out for an unreliable emit. The id can
>     // be a string or a number.
>     "id": "1231231",
>     // The id of the stream this tuple was emitted to. Leave this empty to emit to
>     // default stream.
>     "stream": "1",
>     // If doing an emit direct, indicate the task to send the tuple to
>     "task": 9,
>     // All the values in this tuple
>     "tuple": ["field1", 2, 3]
> }
> This example shows that the "tuple" can contain a string ("field1") and numbers (2, 3). Could
it also contain a byte array?
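
The base64 approach mentioned in this thread can be sketched in Python roughly as follows. This is a minimal illustration, not Storm's own multilang code: the helper names (encode_bytes_for_tuple, decode_bytes_from_tuple, build_emit_message) are hypothetical, and the message carries only the "command" and "tuple" fields from the example above.

```python
import base64
import json


def encode_bytes_for_tuple(payload: bytes) -> str:
    """Turn raw bytes into an ASCII base64 string that JSON can carry."""
    return base64.b64encode(payload).decode("ascii")


def decode_bytes_from_tuple(value: str) -> bytes:
    """Recover the original bytes from a base64 string tuple value."""
    return base64.b64decode(value)


def build_emit_message(values) -> str:
    """Hypothetical helper: serialize an emit message as a single JSON line."""
    return json.dumps({"command": "emit", "tuple": values})


# Sending side: bytes that plain JSON strings could not carry safely.
raw = bytes([0, 255, 10, 13, 34])
line = build_emit_message(["field1", encode_bytes_for_tuple(raw)])

# The serialized message stays on one line, which suits a line-reading protocol.
assert "\n" not in line

# Receiving side: parse the JSON and decode the base64 field back to bytes.
received = json.loads(line)["tuple"][1]
assert decode_bytes_from_tuple(received) == raw
```

Base64 grows the payload by about one third, so this trades some bandwidth for the ability to pass arbitrary bytes through a JSON, line-oriented channel.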
