camel-dev mailing list archives

From Gert Vanthienen <gert.vanthie...@skynet.be>
Subject Re: Deprecation of file consumer timestamp
Date Wed, 19 Nov 2008 10:40:26 GMT
L.S.,

It almost sounds as if we need two separate strategies that 
can be configured on the file endpoint:
- one to determine which files need to be processed (the basic one just 
takes all the files in a directory, but we can build additional ones that 
use a storage mechanism)
- another one (like we already have now) that determines what to do with 
the file after a successful or failed exchange
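
A minimal sketch of how the two strategies could be kept apart, in plain
Java. All type and method names below are hypothetical and are not part of
Camel's API; files are represented by their names only, and the "store" is
just an in-memory set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical types for illustration only; none of these exist in Camel.

/** Strategy 1: decides which files found in the directory get processed. */
interface FileSelectionStrategy {
    List<String> select(List<String> candidates);
}

/** Strategy 2: decides where a file ends up after its exchange finished. */
interface PostProcessStrategy {
    String onComplete(String fileName, boolean success);
}

/** The basic selection: take every file in the directory on every poll. */
class SelectAll implements FileSelectionStrategy {
    public List<String> select(List<String> candidates) {
        return new ArrayList<>(candidates);
    }
}

/** Selection backed by a store of names already handed out (in memory here). */
class SelectUnprocessed implements FileSelectionStrategy {
    private final Set<String> seen = new HashSet<>();

    public List<String> select(List<String> candidates) {
        List<String> fresh = new ArrayList<>();
        for (String name : candidates) {
            // Set.add returns false when the name is already in the store.
            // Simplification: a name is recorded when selected, not when
            // its exchange succeeds.
            if (seen.add(name)) {
                fresh.add(name);
            }
        }
        return fresh;
    }
}

/** Leaves the file in place, whatever the outcome (noop-like behaviour). */
class LeaveInPlace implements PostProcessStrategy {
    public String onComplete(String fileName, boolean success) {
        return fileName;
    }
}

/** Picks a target folder depending on whether the exchange succeeded. */
class MoveByOutcome implements PostProcessStrategy {
    public String onComplete(String fileName, boolean success) {
        return (success ? "done/" : "error/") + fileName;
    }
}

class FileStrategySketch {
    public static void main(String[] args) {
        List<String> dir = Arrays.asList("a.txt", "b.txt");
        FileSelectionStrategy selection = new SelectUnprocessed();
        PostProcessStrategy post = new MoveByOutcome();
        // Poll the same directory listing twice; the second poll selects nothing.
        for (int poll = 1; poll <= 2; poll++) {
            for (String name : selection.select(dir)) {
                System.out.println("poll " + poll + ": " + name
                        + " -> " + post.onComplete(name, true));
            }
        }
    }
}
```

Because the two interfaces are independent, SelectAll can be combined with
MoveByOutcome (the classic inbox) and SelectUnprocessed with LeaveInPlace
(noop without reprocessing the same file).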

FWIW, I actually like the simple noop one for creating unit tests 
because it allows you to just refer to the /src/test/resources folder in 
your project instead of having to copy the files to a work folder first.

Regards,

Gert

Claus Ibsen wrote:
> Hi
>
> It occurred to me that some end users want the FileConsumer to keep
> retrying the same file over and over again if it could not be
> processed, so the postAction could have a third option, or we could have
> an option to enable this feature (kind of like noop, but only for when the
> file could not be processed).
>
>
>
> /Claus Ibsen
> Apache Camel Committer
> Blog: http://davsclaus.blogspot.com/
>
>
>
> On Wed, Nov 19, 2008 at 10:35 AM, Claus Ibsen <claus.ibsen@gmail.com> wrote:
>   
>> Hi
>>
>> The store idea is good, as it can be used for the idempotent consumer
>> as well, and we can use it to persist state so that it survives
>> restarts. It needs to be pluggable so users can use a
>> shared DB if they run on a grid, or maybe that fancy Terracotta
>> thing that distributes memory caches.
>>
>> But turning back to the file consumer. I really think the noop=true
>> option should be deprecated as well. The file endpoint is like an inbox:
>> if a file is dropped in, it is consumed once. After processing, the file is
>> deleted or moved to another destination. With this "remember list"
>> we have a serious issue if the inbox receives a file with the same name
>> but different content. What if someone uploads a
>> file to an FTP server and the filename is always fixed (= the same)?
>> Then we have a complex situation, as we need to hash the file content to
>> be able to determine whether the file is different, or not support it at
>> all.
>>
>> I am mostly keen to keep it simpler and as Hadrian said "keep it lean".
>>
>> So I am voting for:
>> a) to remove noop as well
>> b) to always delete or move file after processing (we should support
>> moving files to a different folder if exchange failed)
>>
>> Ad b)
>> We should support moving files using a different pattern depending on
>> - exchange OK
>> - exchange Failed
>> I have thought about introducing some better URI options to express this.
>>
>> Something along the lines of (think of better URI option names)
>> postAction=delete
>>
>> postAction=move
>> moveCompleteExpression=./done/${file:name}.bak
>> moveErrorExpression=./error/${date:now:yyyyMMdd}/${file:name}.error
>>
>> And we should have defaults as well, so if moveErrorExpression is
>> omitted it defaults to the move used for completed exchanges.
>>
>>
>> And then we could consider deprecating all the other prefix and postfix
>> URI options we have in favor of the power of the expression instead.
>>
>>
>>
>> But the list store is not wasted, as we can use it for the idempotent
>> consumer as well and for other areas.
>>
>>
>>
>>
>>
>> /Claus Ibsen
>> Apache Camel Committer
>> Blog: http://davsclaus.blogspot.com/
>>
>>
>>
>> On Wed, Nov 19, 2008 at 4:04 AM, Jon Anstey <janstey@gmail.com> wrote:
>>     
>>> Hmmm... yeah, I like this suggestion. It may be just what we need here!
>>> Thanks!
>>>
>>> On Tue, Nov 18, 2008 at 4:11 PM, Gert Vanthienen
>>> <gert.vanthienen@skynet.be> wrote:
>>>
>>>       
>>>> Jon,
>>>>
>>>> How about if we enhance the file consumer to keep track of files that have
>>>> already been processed instead of using a timestamp?  The timestamp approach
>>>> is a bit error-prone (just touching the file by accident can set it off
>>>> again).
>>>> If we provide multiple implementations for the storage mechanism to keep
>>>> this information, we can cover a lot of use cases (similar to the message
>>>> id store for an idempotent consumer):
>>>> - an in-memory store for testing purposes
>>>> - a file-based implementation for basic production environments
>>>> - a database- or ldap-backed implementation for clustered environments,
>>>> where a file can arrive through multiple directories
>>>>
>>>> Regards,
>>>>
>>>> Gert
>>>>
>>>> Jon Anstey wrote:
>>>>
>>>>> The algorithm that checks whether a file should be consumed based on
>>>>> timestamp has been deprecated for a while now (see
>>>>> http://activemq.apache.org/camel/file.html). I've removed this on my
>>>>> local branch only to realize that it introduces a bit of an ugly problem...
>>>>> essentially since files will be processed always (modified or not) in the
>>>>> case of noop=true or if a fault has been set, the same file will be
>>>>> processed over and over again... not good!
>>>>>
>>>>> The original intent of removing the timestamp checking was to simplify the
>>>>> consumer. I think that in trying to get around this new issue we may make
>>>>> it even more complicated!
>>>>>
>>>>> I'm wondering if there is a simple solution to this that I'm just not
>>>>> seeing yet or if maybe this issue was discussed before...
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>         
>>> --
>>> Cheers,
>>> Jon
>>>
>>> http://janstey.blogspot.com/
>>>
>>>       
>
>   

