storm-user mailing list archives

From Nathan Leung <ncle...@gmail.com>
Subject Re: Large binary payloads with storm
Date Sun, 12 Jan 2014 16:15:45 GMT
The multilang interface uses JSON, which is a text format. Given an earlier
email (
http://mail-archives.apache.org/mod_mbox/storm-user/201401.mbox/%3CCAEN10JreBSFO-=xhNjbn9r+5+F+G=AZ8rW58qDo8x32Gd-xUkg@mail.gmail.com%3E)
the object appears to be serialized to JSON using toString, which for a byte
array yields [B@<reference>, where the [B is type information specifying a
byte array. Therefore you will have to encode to something like base64 that
can represent your binary data in a text format.
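
A minimal sketch of the base64 round trip on the Python side, assuming the
binary payload travels as one string field inside a JSON message. The helper
names encode_payload and decode_payload and the simplified message shape
{"tuple": [...]} are hypothetical and are not taken from storm.py; only the
standard-library base64 and json modules are used.

```python
import base64
import json


def encode_payload(data: bytes) -> str:
    """Turn raw bytes into ASCII text that is safe to embed in JSON."""
    return base64.b64encode(data).decode("ascii")


def decode_payload(text: str) -> bytes:
    """Recover the original bytes from the base64 text."""
    return base64.b64decode(text)


# Stand-in for a small binary object such as an image (hypothetical data).
raw = bytes(range(256)) * 4

# Sender side: the payload becomes an ordinary JSON string value.
message = json.dumps({"tuple": [encode_payload(raw)]})

# Receiver side: parse the JSON, then decode the string back to bytes.
received = json.loads(message)
restored = decode_payload(received["tuple"][0])

# Base64 emits 4 output characters for every 3 input bytes.
overhead = len(encode_payload(raw)) / len(raw)
```

The size cost is roughly one third on top of the raw payload, which is the
extra overhead mentioned in the quoted message below.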
On Jan 12, 2014 10:49 AM, "Ruhollah Farchtchi" <ruhollah.farchtchi@gmail.com>
wrote:

> I am using 0.9. I think the issue is that storm.py has problems
> deserializing a byte array. When I encode the data as a base64 string
> I have no problems and it deserializes fine. Of course I would like
> to avoid this extra overhead if possible. All my binary objects are
> relatively small, 200-300k max.
>
> On Sunday, January 12, 2014, 李家宏 wrote:
>
>> Hi Farchtchi,
>>
>> Which Storm version are you using?
>> If the tuple is not serialized, then there is no need to use a JSON
>> parser to parse the received tuple. I guess so.
>>
>> Regards
>>
>>
>> 2014/1/11 Ruhollah Farchtchi <ruhollah.farchtchi@gmail.com>
>>
>> Yes I read that in the docs. However when receiving the byte array in
>> storm.py it throws a json error when trying to parse the tuples. I didn't
>> have time to look into it further as I am new to storm and python.
>>
>>
>> On Saturday, January 11, 2014, 李家宏 wrote:
>>
>> There is no need to serialize binary data, just send it as is.
>> Since storm-0.9.0 uses the Kryo serializer by default to serialize tuple
>> values, I guess we can skip this serialization step.
>>
>> Regards
>>
>>
>>
>> 2014/1/10 Jon Logan <jmlogan@buffalo.edu>
>>
>> You're going to run into issues if you have large tuples, because they
>> are buffered in memory. I would suggest moving it to an exterior channel,
>> like Redis, etc, and only passing meta-data through Storm.
>>
>> Your other solution is to use quirky things like reflection to prevent
>> your application from running out of memory when tuples are buffered.
>>
>>
>> On Fri, Jan 10, 2014 at 8:49 AM, Ruhollah Farchtchi <
>> ruhollah.farchtchi@gmail.com> wrote:
>>
>> I am using storm to process small (< 100k) image files. I don't have a
>> real-time requirement as yet, but my bottleneck is more in the image
>> processing than message passing between bolts. I am using the Clojure DSL
>> and the python bolt. Everything I've put together right now is very much a
>> prototype so my next steps are some further processing and integration.
>> Passing byte arrays didn't seem to work so well so I have had to
>> encode/decode into base64 binary as it seems the JSON parsers on the python
>> side didn't like byte arrays. I plan to go back and perhaps re-do the
>> integration with a native C++ bolt, however I believe that there are other
>> ways to do this integration as well. As with Wilson, I'm interested if
>> anyone else is using Storm to process binary payloads and what they have
>> found works.
>>
>> Thanks,
>>
>> Ruhollah
>>
>> Ruhollah Farchtchi
>> ruhollah.farchtchi@gmail.com
>>
>>
>> On Thu, Jan 9, 2014 at 10:24 PM, Lochlainn Wilson <
>> lochlainn.wilson@gmail.com> wrote:
>>
>> Hi all,
>>
>> I am new to Storm and have been tasked with determining whether it is
>> feasible for us to use Apache Storm in my company. I have of course
>> configured the sample projects and have been poking around. A red flag is
>> raised with the "stream processing" style JSON parsing.
>>
>> I am considering using Storm with real-time image processing bolts in
>> C++. Packaging binary data into JSON (by escaping it) looks like it will
>> be slow and expensive. Is there a better way? Does anyone have experience
>> processing large streams of binary data through Storm?
>>
>> How did it go?
>>
>> Regards,
>>
>> Lochlainn
>>
>>
>>
>>
>>
>>
>> --
>>
>> ======================================================
>>
>> Gvain
>>
>> Email: jh.li.em@gmail.com
>>
>>
>>
>> --
>> Ruhollah Farchtchi
>> ruhollah.farchtchi@gmail.com
>>
>>
>>
>>
>> --
>>
>> ======================================================
>>
>> Gvain
>>
>> Email: jh.li.em@gmail.com
>>
>
>
> --
> Ruhollah Farchtchi
> ruhollah.farchtchi@gmail.com
>
