I am using 0.9. I think the issue is that storm.py has problems deserializing a byte array. When I encode the payload as a base64 string, I have no problems and it deserializes fine. Of course, I would like to avoid this extra overhead if possible. All my binary objects are relatively small, 200-300k max.
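A minimal sketch of the base64 workaround described above, written in Python 3 with only the standard library. The `{"tuple": [...]}` message shape is a simplified stand-in for what storm.py exchanges, and `encode_tuple` and `decode_tuple` are hypothetical helper names, not part of storm.py. It shows that raw bytes cannot be placed in JSON directly, that base64 round-trips them, and what the size cost is.

```python
import base64
import json


def encode_tuple(payload: bytes) -> str:
    """Serialize a one-field tuple as a JSON line, base64-encoding the bytes."""
    return json.dumps({"tuple": [base64.b64encode(payload).decode("ascii")]})


def decode_tuple(line: str) -> bytes:
    """Recover the original bytes from a line produced by encode_tuple."""
    return base64.b64decode(json.loads(line)["tuple"][0])


# 1024 bytes covering every possible byte value.
payload = bytes(range(256)) * 4

# Python 3's json module refuses raw bytes outright.
try:
    json.dumps({"tuple": [payload]})
    raw_ok = True
except TypeError:
    raw_ok = False

line = encode_tuple(payload)
restored = decode_tuple(line)

# base64 emits 4 output characters for every 3 input bytes (plus padding).
encoded_size = len(base64.b64encode(payload))
overhead = encoded_size / len(payload)
```

The overhead is roughly one third on top of the original size, before any cost of the JSON parsing itself.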
On Sunday, January 12, 2014, §õ®a§» wrote:
Hi Farchtchi,

Which Storm version are you using? If the tuple is not serialized, then there is no need to use a JSON parser to parse the received tuple. I guess so.

Regards

2014/1/11 Ruhollah Farchtchi <email@example.com>
Yes, I read that in the docs. However, when storm.py receives the byte array, it throws a JSON error when trying to parse the tuples. I didn't have time to look into it further, as I am new to Storm and Python.
On Saturday, January 11, 2014, §õ®a§» wrote:

There is no need to serialize binary data; just send it as is. Since storm-0.9.0 uses the Kryo serializer by default to serialize tuple values, I guess we can skip this serialization step.

Regards

2014/1/10 Jon Logan <firstname.lastname@example.org>
You're going to run into issues if you have large tuples, because they are buffered in memory. I would suggest moving the payload to an external channel, such as Redis, and passing only metadata through Storm.

Your other solution is to use quirky things like reflection to prevent your application from running out of memory when tuples are buffered.

On Fri, Jan 10, 2014 at 8:49 AM, Ruhollah Farchtchi <email@example.com> wrote:
I am using Storm to process small (< 100k) image files. I don't have a real-time requirement as yet, but my bottleneck is more in the image processing than in message passing between bolts. I am using the Clojure DSL and the Python bolt. Everything I've put together right now is very much a prototype, so my next steps are some further processing and integration. Passing byte arrays didn't work well, so I have had to encode and decode the data as base64, as it seems the JSON parsers on the Python side didn't like byte arrays. I plan to go back and perhaps redo the integration with a native C++ bolt, though I believe there are other ways to do this integration as well. As with Wilson, I'm interested in whether anyone else is using Storm to process binary payloads and what they have found works.

Thanks,
Ruhollah

Ruhollah Farchtchi
firstname.lastname@example.org

On Thu, Jan 9, 2014 at 10:24 PM, Lochlainn Wilson <email@example.com> wrote:
Regards,

Hi all,

How did it go?
I am new to Storm and have been tasked with determining whether it is feasible for us to use Apache Storm in my company. I have, of course, configured the sample projects and have been poking around. One red flag is the "stream processing" style JSON parsing.
I am considering using Storm with real-time image processing bolts in C++. Packaging binary data into JSON (by escaping it) looks like it will be slow and expensive. Is there a better way? Does anyone have experience processing large streams of binary data through Storm?
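One answer suggested in this thread is to keep the binary payload in an external store such as Redis and pass only metadata through Storm. The sketch below illustrates that idea under stated assumptions: `PayloadStore` is a hypothetical in-memory stand-in for the external store (a real setup would talk to Redis or similar), and `make_metadata_tuple` and `load_payload` are illustrative names, not Storm APIs. The `{"tuple": [...]}` shape is a simplified stand-in for the JSON message a bolt would see.

```python
import hashlib
import json


class PayloadStore:
    """In-memory stand-in for an external store such as Redis (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put(self, payload: bytes) -> str:
        # Use a content hash as the key so identical payloads share one entry.
        key = hashlib.sha256(payload).hexdigest()
        self._blobs[key] = payload
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def make_metadata_tuple(store: PayloadStore, payload: bytes, name: str) -> str:
    """Store the payload externally and return a small JSON tuple describing it."""
    key = store.put(payload)
    return json.dumps({"tuple": [key, name, len(payload)]})


def load_payload(store: PayloadStore, line: str) -> bytes:
    """Fetch the payload referenced by a metadata tuple and check its size."""
    key, _name, size = json.loads(line)["tuple"]
    payload = store.get(key)
    if len(payload) != size:
        raise ValueError("stored payload does not match the recorded size")
    return payload


store = PayloadStore()
image = b"\x89PNG" + bytes(1000)  # fake image bytes for illustration
line = make_metadata_tuple(store, image, "a.png")
fetched = load_payload(store, line)
```

With this arrangement the JSON that passes between components stays small and text-only regardless of image size, at the cost of an extra round trip to the store in each bolt that needs the pixels.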