From: Jon Logan
To: user@storm.incubator.apache.org
Date: Fri, 10 Jan 2014 10:19:33 -0500
Subject: Re: Large binary payloads with storm

You're going to run into issues if you have large tuples, because they are buffered in memory. I would suggest moving the payload to an external channel, such as Redis, and passing only metadata through Storm.

Your other option is to use quirky things like reflection to prevent your application from running out of memory when tuples are buffered.

On Fri, Jan 10, 2014 at 8:49 AM, Ruhollah Farchtchi <ruhollah.farchtchi@gmail.com> wrote:

> I am using Storm to process small (< 100k) image files. I don't have a
> real-time requirement as yet, and my bottleneck is more in the image
> processing than in message passing between bolts. I am using the Clojure
> DSL and the Python bolt. Everything I've put together so far is very much
> a prototype, so my next steps are further processing and integration.
> Passing byte arrays didn't work well, so I have had to encode/decode to
> base64, as the JSON parsers on the Python side didn't like byte arrays.
> I plan to go back and perhaps redo the integration with a native C++
> bolt, although I believe there are other ways to do this integration
> as well. As with Wilson, I'm interested in whether anyone else is using
> Storm to process binary payloads and what they have found works.
>
> Thanks,
>
> Ruhollah
>
> Ruhollah Farchtchi
> ruhollah.farchtchi@gmail.com
>
> On Thu, Jan 9, 2014 at 10:24 PM, Lochlainn Wilson <lochlainn.wilson@gmail.com> wrote:
>
>> Hi all,
>>
>> I am new to Storm and have been tasked with determining whether it is
>> feasible for us to use Apache Storm at my company. I have of course
>> configured the sample projects and have been poking around. The
>> "stream processing" style JSON parsing raises a red flag.
>>
>> I am considering using Storm with real-time image processing bolts in
>> C++. Packaging binary data into JSON (by escaping it) looks like it will
>> be slow and expensive. Is there a better way? Does anyone have experience
>> processing large streams of binary data through Storm?
>>
>> How did it go?
>>
>> Regards,
>>
>> Lochlainn
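To make the external-channel suggestion a bit more concrete, here is a minimal Python sketch of the idea: the image bytes go into a key-value store, and only a small JSON message carrying the key travels between components. Everything named here is an assumption for illustration. `InMemoryStore` is a hypothetical stand-in for Redis, and the function names and the `img:` key prefix are made up; none of them are Storm or Redis APIs. The sketch also builds the base64-in-JSON form mentioned in the thread so the two message sizes can be compared.

```python
import base64
import hashlib
import json


class InMemoryStore:
    """Hypothetical stand-in for an external store such as Redis.

    Only the two methods used below exist: set(key, value) and get(key).
    """

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is absent.
        return self._data.get(key)


def make_reference_tuple(store, payload):
    """Store the payload externally and return a small JSON message that refers to it."""
    # Content-derived key (an illustrative choice, not a requirement).
    key = "img:" + hashlib.sha256(payload).hexdigest()
    store.set(key, payload)
    return json.dumps({"key": key, "size": len(payload)})


def resolve_reference_tuple(store, message):
    """Fetch the payload that a reference message points to."""
    meta = json.loads(message)
    payload = store.get(meta["key"])
    if payload is None or len(payload) != meta["size"]:
        raise KeyError("payload missing or truncated for " + meta["key"])
    return payload


def make_inline_tuple(payload):
    """Embed the payload directly in JSON via base64, for comparison."""
    return json.dumps({"data": base64.b64encode(payload).decode("ascii")})


if __name__ == "__main__":
    store = InMemoryStore()
    image = bytes(range(256)) * 400  # 102,400 bytes of dummy "image" data

    ref = make_reference_tuple(store, image)
    inline = make_inline_tuple(image)

    # The downstream side recovers the original bytes from the reference.
    assert resolve_reference_tuple(store, ref) == image

    print("reference message bytes:", len(ref))
    print("inline base64 message bytes:", len(inline))
```

The reference message stays the same size however large the image is, while the inline form grows to roughly four thirds of the payload size because of base64. A real Redis client such as redis-py offers `set` and `get` methods of the same basic shape, so swapping it in for `InMemoryStore` should be straightforward, though that is not tested here, and expiry of stored payloads would still need to be handled.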