From: Claus Ibsen
Date: Sun, 14 Aug 2011 08:22:42 +0200
Subject: Re: Split large file into small files
To: users@camel.apache.org

On Sun, Aug 14, 2011 at 12:57 AM, jeevan.koteshwara wrote:
> Hi Christian,
> to give you a better picture, my requirement goes like this.
>
> I need to transfer a fixed-length record file to a destination. Meanwhile, my
> route is responsible for transforming it into the required format (say to CSV
> or to XML).
>
> Now, the input file may be too big. It may contain many records (say about
> 500k). So, if I use split(body().tokenize("\n"), new
> CustomAggregationStrategy()).streaming(), it may cause delay and it may also
> lead to an out of memory exception while aggregating the messages.
>
> So, I thought of using split().method(CustomBean.class).streaming(), where
> my CustomBean returns an Iterator (a custom iterator, which iterates
> through the input message stream and splits the incoming message based on
> line numbers). In this case, everything looks fine, but the end file
> is overwritten with the latest split message, instead of every message
> being appended.
>
> Claus suggested using the "fileExist=Append" option. But, as per my
> requirement, after this split and transform process, I need to do some more
> actions on the route. E.g.
>
> RouteDefinition routeDef =
> from(src).split().method(CustomBean.class).streaming();
> routeDef = routeDef.bean(ActionBean1()); //could be zipping action etc
> routeDef = routeDef.bean(ActionBean());
> routeDef.to(dest);
>
> In this case, if I split the messages and don't aggregate them, then I am
> afraid my action beans might not perform correctly (I am not
> certain on this).
>

The aggregator on the splitter is only invoked when the sub message is
complete. So if you invoke your action beans as part of the sub message
routing, then your work has been done.

So would this not work for you?

from X
  split XXXX
    action bean 1
    action bean 2
    to file (append)
  end // end splitter
  // after split, but no more work to do

The tricky part is that the action beans are invoked with a stream type, and
if they need to alter the message, they need to return a stream type
as well.

You could consider dividing this into 2 steps.

from X
  split XXXX
    to file2 (write using unique file name)

from file2
  split XXXX
    action bean 1
    action bean 2
    to file (append)

In this 2nd route, if the files are smaller and you can hold the
data in memory, you can avoid the streaming mode and work on the
entire message body in memory from within your action beans, if that's
easier.

> So, I am checking whether there is any way to aggregate the split messages
> (without using split(body().tokenize("\n"), new MyAggregationStrategy()),
> because this will cause an out of memory error).
>
>
> --
> View this message in context: http://camel.465427.n5.nabble.com/Split-large-file-into-small-files-tp4678470p4697202.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
>

-- 
Claus Ibsen
-----------------
FuseSource
Email: cibsen@fusesource.com
Web: http://fusesource.com
Twitter: davsclaus, fusenews
Blog: http://davsclaus.blogspot.com/
Author of Camel in Action: http://www.manning.com/ibsen/
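
For readers of the archive, here is a minimal sketch of the kind of custom iterator described in the quoted question: one that walks the input and hands out chunks of a fixed number of lines, so that only one chunk is in memory at a time. It uses only the JDK and is not taken from the poster's code; the class name LineChunkIterator, its constructor parameters, and the choice to return each chunk as a String are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Hypothetical helper (not part of Camel): lazily groups lines read from a
 * reader into chunks of at most linesPerChunk lines each.
 */
class LineChunkIterator implements Iterator<String> {
    private final BufferedReader reader;
    private final int linesPerChunk;
    private String nextChunk;   // look-ahead chunk, null if none is loaded
    private boolean exhausted;  // true once the reader has no more lines

    LineChunkIterator(BufferedReader reader, int linesPerChunk) {
        if (linesPerChunk <= 0) {
            throw new IllegalArgumentException("linesPerChunk must be positive");
        }
        this.reader = reader;
        this.linesPerChunk = linesPerChunk;
    }

    @Override
    public boolean hasNext() {
        // Load the next chunk on demand; remember when the input is used up.
        if (nextChunk == null && !exhausted) {
            nextChunk = readChunk();
            if (nextChunk == null) {
                exhausted = true;
            }
        }
        return nextChunk != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String chunk = nextChunk;
        nextChunk = null;
        return chunk;
    }

    /** Reads up to linesPerChunk lines; returns null if no line was left. */
    private String readChunk() {
        try {
            StringBuilder sb = new StringBuilder();
            int count = 0;
            String line;
            while (count < linesPerChunk && (line = reader.readLine()) != null) {
                sb.append(line).append('\n');
                count++;
            }
            return count == 0 ? null : sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A bean method used with split().method(...).streaming() could return such an iterator built over the incoming stream. How the stream is obtained from the message body is left out here, since it depends on the route. Each chunk would then pass through the action beans and be written with the append option, as outlined in the reply above.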