camel-users mailing list archives

From Claus Ibsen <claus.ib...@gmail.com>
Subject Re: Split large file into small files
Date Sun, 14 Aug 2011 06:22:42 GMT
On Sun, Aug 14, 2011 at 12:57 AM, jeevan.koteshwara
<jeevan.koteshwara@gmail.com> wrote:
> Hi Christian,
> To give you a better picture, my requirement is as follows.
>
> I need to transfer a fixed-length record file to a destination. Meanwhile,
> my route is responsible for transforming it into the required format (say
> CSV or XML).
>
> Now, the input file may be very big; it may contain many records (say about
> 500k). So if I use split(body().tokenize("\n"), new
> CustomAggregationStrategy()).streaming(), it may cause delays and may also
> lead to an out-of-memory exception while aggregating the messages.
>
> So I thought of using split().method(CustomBean.class).streaming(), where
> my CustomBean returns an Iterator (a custom iterator that walks the input
> message stream and splits the incoming message based on line numbers). In
> this case everything looks fine, but the end file is overwritten with the
> latest split message instead of having every message appended.
>
> Claus suggested using the "fileExist=Append" option. But, as per my
> requirement, after this split and transform process I need to perform some
> more actions on the route. E.g.
>
> RouteDefinition routeDef =
> from(src).split().method(CustomBean.class).streaming();
> routeDef = routeDef.bean(new ActionBean1()); // could be a zipping action etc.
> routeDef = routeDef.bean(new ActionBean());
> routeDef.to(dest);
>
> In this case, if I split the messages and don't aggregate them, I am
> afraid my action beans may not work correctly (I am not certain about
> this).
>

The aggregator on the splitter is only invoked when the sub message is complete.
So if you invoke your action beans as part of the sub message
routing, then your work has already been done by that point.

So would this not work for you?

from X
  split XXXX
     action bean 1
     action bean 2
     to file (append)
  end // end splitter
// after split, but no more work to do
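
To illustrate why no aggregation is needed in that layout, here is a rough
plain-Java sketch (JDK only, no Camel) that mimics the route: a lazy iterator
hands out one block of lines at a time, the two actions run on each block, and
the result is appended to the target file. The names LineChunkIterator,
SplitAndAppend and the two UnaryOperator "action beans" are made up for this
example; in a real route the splitter in streaming mode and the file endpoint
with fileExist=Append would play these roles.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.UnaryOperator;

/** Hypothetical splitter: lazily yields blocks of up to linesPerChunk lines. */
class LineChunkIterator implements Iterator<String> {
    private final BufferedReader reader;
    private final int linesPerChunk;
    private String next; // the only block held in memory at any time

    LineChunkIterator(BufferedReader reader, int linesPerChunk) {
        if (linesPerChunk < 1) {
            throw new IllegalArgumentException("linesPerChunk must be >= 1");
        }
        this.reader = reader;
        this.linesPerChunk = linesPerChunk;
        this.next = readChunk();
    }

    private String readChunk() {
        try {
            StringBuilder sb = new StringBuilder();
            int count = 0;
            String line;
            while (count < linesPerChunk && (line = reader.readLine()) != null) {
                sb.append(line).append('\n');
                count++;
            }
            return count == 0 ? null : sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public String next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String current = next;
        next = readChunk();
        return current;
    }
}

/** Hypothetical stand-in for the route: split, run both actions, append. */
class SplitAndAppend {
    static void run(BufferedReader in, Path out, int linesPerChunk,
                    UnaryOperator<String> action1, UnaryOperator<String> action2) {
        Iterator<String> chunks = new LineChunkIterator(in, linesPerChunk);
        while (chunks.hasNext()) {
            // All work for this sub message happens here, before the next one is read.
            String result = action2.apply(action1.apply(chunks.next()));
            try {
                Files.write(out, result.getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Since every block is fully processed and written before the next one is read,
memory use is bounded by the block size rather than by the input file size.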

The tricky part is that the action beans are invoked with a stream type,
and if they need to alter the message, they need to return a stream
type as well.
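
As a sketch of what "stream in, stream out" could look like for one action
bean, assuming the stream type is a java.io.InputStream and that each sub
message is small enough to buffer: the class name FixedLengthToCsvBean, the
method name transform and the field widths are all hypothetical, and the CSV
output does no quoting or escaping.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical action bean: takes a stream body and returns a stream body. */
class FixedLengthToCsvBean {
    private final int[] widths; // assumed fixed-length record layout

    FixedLengthToCsvBean(int... widths) {
        this.widths = widths.clone();
    }

    /** Reads fixed-length records from the body and returns CSV as a new stream. */
    InputStream transform(InputStream body) {
        StringBuilder csv = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(body, StandardCharsets.UTF_8))) {
            String record;
            while ((record = reader.readLine()) != null) {
                csv.append(toCsvLine(record)).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Hand back a stream so the next bean sees the same body type.
        return new ByteArrayInputStream(csv.toString().getBytes(StandardCharsets.UTF_8));
    }

    /** Cuts one record at the configured widths; short records yield empty fields. */
    String toCsvLine(String record) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        for (int i = 0; i < widths.length; i++) {
            int end = Math.min(pos + widths[i], record.length());
            String field = pos < record.length() ? record.substring(pos, end).trim() : "";
            if (i > 0) {
                sb.append(',');
            }
            sb.append(field);
            pos += widths[i];
        }
        return sb.toString();
    }
}
```

Because the return type matches the parameter type, two such beans can be
chained without either one caring what came before it.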

You could consider dividing this into 2 steps.

from X
  split XXXX
     to file2 (write using unique file name)

from file2
  split XXXX
     action bean 1
     action bean 2
     to file (append)

In this 2nd route, if the files are small enough that the data fits in
memory, you can avoid streaming mode and work on the entire message
body in memory from within your action beans, if that's easier.
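
The two-step idea can also be sketched in plain Java (JDK only, no Camel).
Step 1 streams the big input and writes each block of lines to its own
uniquely named file; step 2 loads each small file whole, applies an action,
and appends the result to the final file. TwoStepSplit, stage, processStaged
and the chunk-NNNNN.txt naming are invented for this illustration; in Camel
the two steps would be two separate routes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

class TwoStepSplit {

    /** Step 1: stream the input and write every linesPerChunk lines to its own file. */
    static List<Path> stage(BufferedReader in, Path stagingDir, int linesPerChunk) {
        List<Path> staged = new ArrayList<>();
        try {
            StringBuilder chunk = new StringBuilder();
            int lines = 0;
            String line;
            while ((line = in.readLine()) != null) {
                chunk.append(line).append('\n');
                if (++lines == linesPerChunk) {
                    staged.add(writeChunk(stagingDir, staged.size(), chunk));
                    chunk.setLength(0);
                    lines = 0;
                }
            }
            if (lines > 0) { // trailing partial block
                staged.add(writeChunk(stagingDir, staged.size(), chunk));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return staged;
    }

    private static Path writeChunk(Path dir, int index, CharSequence content)
            throws IOException {
        // Zero-padded index keeps names unique and sortable.
        Path file = dir.resolve(String.format("chunk-%05d.txt", index));
        return Files.write(file, content.toString().getBytes(StandardCharsets.UTF_8));
    }

    /** Step 2: each staged file is small, so load it whole, transform, append. */
    static void processStaged(List<Path> staged, Path out, UnaryOperator<String> action) {
        try {
            for (Path file : staged) {
                String body = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                Files.write(out, action.apply(body).getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is extra disk I/O for the intermediate files in exchange for
action code that can treat each body as an ordinary in-memory String.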



> So, I am checking whether there is any way to aggregate the split messages
> (without using split(body().tokenize("\n"), new MyAggregationStrategy()),
> because this will cause an out-of-memory error).
>
>
> --
> View this message in context: http://camel.465427.n5.nabble.com/Split-large-file-into-small-files-tp4678470p4697202.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
>



-- 
Claus Ibsen
-----------------
FuseSource
Email: cibsen@fusesource.com
Web: http://fusesource.com
Twitter: davsclaus, fusenews
Blog: http://davsclaus.blogspot.com/
Author of Camel in Action: http://www.manning.com/ibsen/
