camel-issues mailing list archives

From "Jyrki Ruuskanen (JIRA)" <>
Subject [jira] [Commented] (CAMEL-8421) Add minimum age option to readLock=changed
Date Tue, 03 Mar 2015 11:26:05 GMT


Jyrki Ruuskanen commented on CAMEL-8421:

The unit tests for the File component passed for me, but they depend on timing, so the results
may be unstable. I might have to increase the tolerances.

The FTP unit tests, on the other hand, all failed (not just mine) on both machines I tried
(Win7 and Arch Linux), so I couldn't verify my tests. I ran them with camel/components/camel-ftp>
mvn test -Dtest=*ReadLock*. FTP test support doesn't seem to work out of the box at the moment.

Readability might improve if this were separated into its own readLockStrategy, such as
readLock=minage or readLock=olderthan. What do you think?
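
To make the idea concrete, here is a minimal sketch of the age check that such a strategy would come down to. The class and method names (MinAgeCheck, isOldEnough) are hypothetical and not part of Camel; the sketch only assumes that the file's last-modified timestamp is available in milliseconds.

```java
// Hypothetical helper, not part of Camel: illustrates the minimum-age idea only.
final class MinAgeCheck {

    private MinAgeCheck() {
        // static helper, no instances
    }

    /**
     * Returns true when the file was last modified at least minAgeMillis ago.
     *
     * @param lastModifiedMillis last-modified timestamp of the file (epoch millis)
     * @param nowMillis          current time (epoch millis)
     * @param minAgeMillis       required minimum age, e.g. 600000 for 10 minutes
     */
    static boolean isOldEnough(long lastModifiedMillis, long nowMillis, long minAgeMillis) {
        return nowMillis - lastModifiedMillis >= minAgeMillis;
    }
}
```

For a local file this could be called as MinAgeCheck.isOldEnough(file.lastModified(), System.currentTimeMillis(), 600000L). A file that passes needs no further sleep, which is what would let the consumer move on to the next file on the same poll.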

> Add minimum age option to readLock=changed
> ------------------------------------------
>                 Key: CAMEL-8421
>                 URL:
>             Project: Camel
>          Issue Type: Improvement
>          Components: camel-core, camel-ftp
>            Reporter: Jyrki Ruuskanen
>            Assignee: Claus Ibsen
>            Priority: Minor
>             Fix For: Future
> I'm a fan of noop=true in file consumers since it means I don't have to worry about how
many readers I have and where. But eventually I came across a scenario where current features
are not sufficient.
> Let's say we have a source system which writes files with name <timestamp>_something.xml,
and it won't use temp files or .done marker files or anything like that. We want to get the
latest file as soon as it's created. Consider the following route:
> {code}
> from("file:////somewhere/data?noop=true&include=.*_something[.]xml&readLock=changed&sortBy=file:name")
> 	.aggregate(constant(true), new UseLatestAggregationStrategy()).completionFromBatchConsumer()
> 		.to("amq:topic:something");
> {code}
> When this route is started it will go through the files in order and get the last one.
Then it will wait for new files. This works fine as long as the writer is not "slow".
> Now, we had cases of incomplete files being read, and I was asked not to read a
file until it is 10 minutes old, just in case. If I increase readLockCheckInterval to 10
minutes, getting to the latest file at route startup will take close to forever, because the current
readLock=changed implementation always sleeps for at least one readLockCheckInterval per file.
> If we had a readLockMinAge option to define the minimum age of the target file, the consumer
could acquire the read lock on the first poll and breeze through the files until it reaches
one that is too young.
> The route below would poll a file every 500ms (default poll delay), while the current
readLock=changed would take 1500ms (default poll delay + default readLockCheckInterval) per
file. The consumer goes through the files until it hits the end and gets the last one as soon
as it becomes old enough.
> {code}
> from("file:////somewhere/data?noop=true&include=.*_something[.]xml&readLock=changed&readLockMinAge=600000&sortBy=file:name")
> 	.aggregate(constant(true), new UseLatestAggregationStrategy()).completionFromBatchConsumer()
> 		.to("amq:topic:something");
> {code}

This message was sent by Atlassian JIRA
