camel-users mailing list archives

From Franz Paul Forsthofer <emc2...@googlemail.com>
Subject Re: Stream Cache and ZIP-Archives
Date Fri, 09 Oct 2015 12:46:00 GMT
Hi Hubertus,

the Stream Caches and their corresponding files are only closed/deleted
at the end of the route, because some processor in the route might still
need the stream. In your case it may help to increase the threshold so
that only the first big file (400 MB) is written to the file system and
the split data is kept in memory instead. See the spoolThreshold option
at http://camel.apache.org/stream-caching.html
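
A minimal sketch of raising the threshold from Java, assuming Camel 2.12
or later (where the CamelContext exposes a StreamCachingStrategy). The
class name and the 64 MB value are only illustrative assumptions, not
taken from your setup:

```java
import org.apache.camel.CamelContext;
import org.apache.camel.impl.DefaultCamelContext;
import org.apache.camel.spi.StreamCachingStrategy;

// Hypothetical class name, used only for this illustration.
public class SpoolThresholdSketch {
    public static void main(String[] args) throws Exception {
        CamelContext context = new DefaultCamelContext();

        // Stream caching has to be enabled for the strategy to take effect.
        context.setStreamCaching(true);

        StreamCachingStrategy strategy = context.getStreamCachingStrategy();

        // Assumed value: streams smaller than 64 MB stay in memory,
        // larger ones (such as the 400 MB XML file) are spooled to disk.
        strategy.setSpoolThreshold(64L * 1024 * 1024);

        // Routes would be added here before starting the context.
        context.start();
        context.stop();
    }
}
```

Keep in mind that everything below the threshold is held on the heap, so
with 30 concurrent consumers the value has to fit your memory budget.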

Best Regards Franz

On Thu, Oct 1, 2015 at 1:13 PM, Hubertus.Willuhn
<hubertus.willuhn@dinsoftware.de> wrote:
> Hi,
>
> I am new to this forum.
>
> I have a problem with the stream cache of Camel in conjunction with ZIP
> files:
>
> The starting point is a large XML file (> 400 MB). This file is split and
> processed in a Camel route into smaller units. The result of this first
> part-route is a list of file objects that represent the references to the
> actual ZIP archives. The ZIPs are on a web server and get downloaded in
> another Camel route (SEDA) over HTTP.
>
> So far so good. Now each ZIP must be unpacked so that it can be processed
> later. Each ZIP file contains a varying number of smaller PNG and XML
> files (up to 200 files).
>
> For performance reasons, I use Stream caching and multithreading. There are
> more than 55,000 files in total.
>
> The problem is that the stream cache uses more and more files (> 3000)
> until Java throws an exception:
>
>
>
> It seems as if the Java process keeps too many file pointers open.
>
> My question: is there a way to clear the cache or close the streams after
> processing all the split parts of one ZIP?
>
> The Route which downloads the files looks like:
>
>
>
> My route for unzipping looks like:
>
> from("seda:unzip?size=1&blockWhenFull=true")
>     // unzip
>     .unmarshal(zipFile)
>     .split(body(Iterator.class)).stopOnException().streaming()
>         // enrich
>         .process(uzprocessor)
>         // save
>         .inOnly("seda:attachment")
>     .end();
>
> And the last route saves the files to the database (a NoSQL storage system):
>
> from("seda:attachment?concurrentConsumers=30&size=30&blockWhenFull=true")
>     .process(processor)
>     .end().stop();
>
> Thanks for helping.
>
> Greetings from Germany!
>
>
>
>
> --
> View this message in context: http://camel.465427.n5.nabble.com/Stream-Cache-and-ZIP-Archives-tp5772148.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
