Subject: Re: Large binary payloads with storm
From: 李家宏
To: user@storm.incubator.apache.org
Date: Sat, 11 Jan 2014 21:46:06 +0800

There is no need to serialize binary data; just send it as is. Since
storm-0.9.0 uses the Kryo serializer by default to serialize tuple values,
I guess we can skip this serialization step.

Regards

2014/1/10 Jon Logan <jmlogan@buffalo.edu>

> You're going to run into issues if you have large tuples, because they are
> buffered in memory. I would suggest moving it to an exterior channel, like
> Redis, etc, and only passing meta-data through Storm.
>
> Your other solution is to use quirky things like reflection to prevent
> your application from running out of memory when tuples are buffered.
>
>
> On Fri, Jan 10, 2014 at 8:49 AM, Ruhollah Farchtchi <
> ruhollah.farchtchi@gmail.com> wrote:
>
>> I am using storm to process small (< 100k) image files. I don't have a
>> real-time requirement as yet, but my bottleneck is more in the image
>> processing than message passing between bolts. I am using the Clojure DSL
>> and the python bolt. Everything I've put together right now is very much a
>> prototype so my next steps are some further processing and integration.
>> Passing byte arrays didn't seem to work so well so I have had to
>> encode/decode into base64 binary as it seems the JSON parsers on the python
>> side didn't like byte arrays. I plan to go back and perhaps re-do the
>> integration with a native C++ bolt, however I believe that there are other
>> ways to do this integration as well. As with Wilson, I'm interested if
>> anyone else is using Storm to process binary payloads and what they have
>> found works.
>>
>> Thanks,
>>
>> Ruhollah
>>
>> Ruhollah Farchtchi
>> ruhollah.farchtchi@gmail.com
>>
>>
>> On Thu, Jan 9, 2014 at 10:24 PM, Lochlainn Wilson <
>> lochlainn.wilson@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I am new to Storm and have been tasked with determining whether it is
>>> feasible for us to use Apache storm in my company. I have of course
>>> configured the sample projects and have been poking around. A red flag is
>>> raised with the "stream processing" style JSON parsing.
>>>
>>> I am considering using storm with real time image processing bolts in
>>> C++. Packaging binary data into a JSON (by escaping it) looks like it will
>>> be slow and expensive. Is there a better way? Does anyone have experience
>>> processing large streams of binary data through storm?
>>>
>>> How did it go?
>>>
>>> Regards,
>>>
>>> Lochlainn
>>>
>>
>>
>

--
======================================================
Gvain

Email: jh.li.em@gmail.com
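
For the multilang case mentioned in the quoted messages (a Python bolt that receives JSON), the base64 workaround Ruhollah describes could look roughly like the sketch below. It uses only the Python standard library and is not Storm's multilang API; the field name `image_b64` and the helper names are made up for illustration.

```python
import base64
import json


def encode_payload(image_bytes: bytes) -> str:
    """Wrap raw bytes in a JSON message by base64-encoding them first.

    "image_b64" is a hypothetical field name chosen for this example.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image_b64": encoded})


def decode_payload(message: str) -> bytes:
    """Recover the original bytes from a message built by encode_payload."""
    return base64.b64decode(json.loads(message)["image_b64"])


# 1024 arbitrary bytes standing in for a small image file.
raw = bytes(range(256)) * 4

# Raw bytes cannot be passed to the JSON encoder directly.
try:
    json.dumps({"image": raw})
    bytes_are_json_serializable = True
except TypeError:
    bytes_are_json_serializable = False

message = encode_payload(raw)
restored = decode_payload(message)
```

Base64 emits 4 output characters for every 3 input bytes, so the encoded payload is about one third larger than the original. That is part of the cost Lochlainn is worried about, on top of the JSON parsing itself.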