lucene-dev mailing list archives

From Martijn v Groningen <>
Subject Re: Patch submission for DataImportHandler's FileListEntityProcessor to sort files
Date Tue, 25 Oct 2011 06:31:04 GMT
Hi Gabriel,

I'm not an expert FileListEntityProcessor user, but I'd expect a
consistent processing order. Your code seems "kosher" to me. You sort by
the last-modified date, which seems OK to me. So create a Jira
issue and attach your patch!
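
For readers without the attachment, here is a minimal sketch of the general idea, sorting a directory listing by last-modified time so that older files are processed first. It is not the attached patch and not the actual FileListEntityProcessor code; the class name SortedFileListing, the method listOldestFirst, and the tie-break on file name are assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not part of Solr or of the attached patch.
public class SortedFileListing {

    /** Returns the regular files in dir, oldest first; ties are broken by file name. */
    static File[] listOldestFirst(File dir) {
        // listFiles() makes no promise about ordering, so sort explicitly.
        File[] files = dir.listFiles(File::isFile);
        if (files == null) { // dir is not a directory, or an I/O error occurred
            return new File[0];
        }
        Arrays.sort(files,
                Comparator.comparingLong(File::lastModified)
                          .thenComparing(File::getName));
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Recreate the scenario from the original mail in a temp directory.
        File dir = Files.createTempDirectory("dih-sort-demo").toFile();
        File older = new File(dir, "xyz.xml");
        File newer = new File(dir, "abc.xml");
        older.createNewFile();
        newer.createNewFile();
        long now = System.currentTimeMillis();
        older.setLastModified(now - 3_600_000L); // one hour ago
        newer.setLastModified(now - 60_000L);    // one minute ago

        // With the explicit sort, xyz.xml should be listed before abc.xml.
        for (File f : listOldestFirst(dir)) {
            System.out.println(f.getName());
        }
    }
}
```

With an ordering like this, the older xyz.xml would be imported before abc.xml, so the newer "Bar" update would win regardless of what order the file system returns the names in.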


On 24 October 2011 21:49, Gabriel Cooper <> wrote:
> Hello,
> I noticed what appears to be a bug in DataImportHandler's
> FileListEntityProcessor. Specifically, it relies on Java's File.list()
> method to retrieve a list of files from the configured dataimport directory,
> but list() does not guarantee a sort order. This means that if you have two
> files that update the same record, the results are non-deterministic.
> Typically, list() does in fact return them lexicographically sorted, but this
> is not guaranteed.
> An example of how you can get into trouble is to imagine the following:
> xyz.xml -- Created one hour ago. Contains updates to records "Foo" and
> "Bar".
> abc.xml -- Created one minute ago. Contains updates to records "Bar" and
> "Baz".
> In this case, the newest file, abc.xml, would (likely, but not
> guaranteed) be run first, updating the "Bar" and "Baz" records. Next, the
> older file, xyz.xml, would update "Foo" and overwrite "Bar" with outdated
> changes.
> The "HowToContribute" wiki page suggested I send my request here before
> opening an actual bug ticket, so please let me know if there's anything else
> I can or should do to get this patch submitted and approved. I've attached a
> patch of FileListEntityProcessor, along with an updated test, please let me
> know if it's kosher.
> Thank you,
> Gabriel.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

Met vriendelijke groet,

Martijn van Groningen
