From: Eric Ruel
To: user@storm.apache.org
Subject: RE: Trident topology - can we deactivate batch ordering
Date: Tue, 10 Feb 2015 22:37:03 +0000


The spout emits batches of 100 ids to process.

Some steps are faster when executed in batches, such as fetching data from the database, which is done with an aggregator that emits the same rows with additional values.

We need Trident because we need joins, merges, aggregators, etc.

But the batches are independent of each other. As my colleague said, with maxSpoutPending > 1 it is acceptable in our context for the second batch to finish before the first one. Currently, however, it waits until the first batch is completed, which slows down our processing.
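
The waiting described above can be modelled in a few lines of plain Java. This is only a sketch of the observed behaviour, not Storm code: the class `OrderedCommitSketch` and the method `commitOrder` are hypothetical names, and the model assumes txids start at 1 and are contiguous. It takes the order in which batches finish processing and returns the order in which they commit when commits must follow txid order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class OrderedCommitSketch {
    /** Returns the commit order when a batch may only commit after all lower txids. */
    static List<Integer> commitOrder(List<Integer> finishOrder) {
        List<Integer> committed = new ArrayList<>();
        TreeSet<Integer> finishedButWaiting = new TreeSet<>();
        int nextToCommit = 1; // assumption: txids start at 1 and are contiguous
        for (int txid : finishOrder) {
            finishedButWaiting.add(txid);
            // Commit every finished batch whose predecessors have all committed.
            while (finishedButWaiting.remove(nextToCommit)) {
                committed.add(nextToCommit);
                nextToCommit++;
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        // Batch 2 finishes first, but it is held until batch 1 has committed.
        System.out.println(commitOrder(Arrays.asList(2, 1, 3))); // prints [1, 2, 3]
    }
}
```

In this model, batch 2 sits in the waiting set until batch 1 finishes, which is the delay we would like to avoid.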

Is it possible to keep Trident and its features while allowing unordered batch processing?

Is it a problem with the kind of spout, or is it because we use a StateUpdater at the end?

We tried removing the StateUpdater and using an aggregator instead, but it does not help.

Is it clearer?




From: Pascal Arnal <pascal.arnal@wantedanalytics.com>
Sent: November 21, 2014 12:15
To: user@storm.apache.org
Subject: RE: Trident topology

This post from a colleague is about the same thing.

https://mail-archives.apache.org/mod_mbox/storm-user/201401.mbox/%3C2730f9f8f8a44d16858c346886978886@BY2PR08MB144.namprd08.prod.outlook.com%3E




From: Brunner, Bill <bill.brunner@baml.com>
Sent: November 21, 2014 12:04
To: user@storm.apache.org
Subject: RE: Trident topology
 

Still not very clear

 

From: Pascal Arnal [mailto:pascal.arnal@wantedanalytics.com]
Sent: Friday, November 21, 2014 9:33 AM
To: user@storm.apache.org
Subject: RE: Trident topology

 

any help?

 


From: Pascal Arnal <pascal.arnal@wantedanalytics.com>
Sent: November 20, 2014 14:01
To: user@storm.apache.org
Subject: RE: Trident topology

If I run a topology with max spout pending of 3, the StateUpdater currently executes batch 1, then batch 2, then batch 3. A new batch 4 is generated after the commit of batch 1, batch 5 after the commit of batch 2, and so on.

If batch 2 finishes its execution before batch 1, it has to wait until batch 1 is committed.
I don't want it to wait. I want the StateUpdater sequence to be batch 2, then batch 1, then batch 3, ...
with a new batch 4 generated after batch 2, batch 5 after batch 1, and so on.
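
To make the two sequences concrete, here is a small plain-Java model of the pending window. It is a sketch under assumptions, not Storm code: `PendingWindowSketch` and `emissions` are hypothetical names, and the model simply assumes that each commit frees one slot so the spout can emit the next txid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PendingWindowSketch {
    /**
     * For each commit, in the order given, records which new batch is emitted:
     * every commit frees one slot in the pending window.
     */
    static List<String> emissions(List<Integer> commitOrder, int maxPending) {
        List<String> log = new ArrayList<>();
        int nextTxid = maxPending + 1; // txids 1..maxPending are already in flight
        for (int committed : commitOrder) {
            log.add("batch " + nextTxid + " after batch " + committed);
            nextTxid++;
        }
        return log;
    }

    public static void main(String[] args) {
        // Commits forced into txid order (the current behaviour).
        System.out.println(emissions(Arrays.asList(1, 2, 3), 3));
        // Commits in completion order (the behaviour being asked for).
        System.out.println(emissions(Arrays.asList(2, 1, 3), 3));
    }
}
```

With commits in the order 2, 1, 3, the model yields batch 4 after batch 2 and batch 5 after batch 1, which is the sequence I am asking for.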

Is it clearer, and is it possible?

Thanks

 

 


From: P. Taylor Goetz <ptgoetz@gmail.com>
Sent: November 20, 2014 12:53
To: user@storm.apache.org
Subject: Re: Trident topology

 

Hi Pascal,

 

I’m not sure I understand what you are asking. Could you elaborate?

 

-Taylor

 

On Nov 20, 2014, at 10:52 AM, Pascal Arnal <pascal.arnal@wantedanalytics.com> wrote:



Does nobody have an answer?
Should I create an issue / feature request in Jira?


From: Pascal Arnal <pascal.arnal@wantedanalytics.com>
Sent: November 19, 2014 10:58
To: user@storm.apache.org
Subject: Trident topology

 

Hi,

I am trying to build a topology with Trident that uses some functions, filters and aggregators.
I don't care about transactions, and I would like my batches to be unordered.
I use IBatchSpout for the spout and BaseStateUpdater for the updater with TridentState.

Is it possible to build a topology that meets these requirements?
Maybe with another state updater, or simply by using an aggregator?

Thanks

 

