lucene-java-user mailing list archives

From Justin Swanhart <>
Subject Re: version documents
Date Wed, 17 Nov 2004 20:38:39 GMT
Split the filename into "basefilename" and "version" and make each a keyword.

Sort your query results by "version", descending, and only use the first
hit you encounter for each "basefilename".
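
A rough sketch of the idea in plain Java, leaving the Lucene calls out. The class name, the method names, and the regular expression are made up for illustration, and it assumes the default.xml / default_N.xml naming described in the quoted message, with an unnumbered file counting as version 0. Because every note is called default.xml, the "basefilename" here keeps the folder path so that notes in different folders stay separate. Versions are compared as integers, since a plain string comparison would put "10" before "9".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NoteVersions {
    // Assumed naming scheme: <path>/default.xml, <path>/default_1.xml, ...
    private static final Pattern NAME = Pattern.compile("^(.*?)(?:_(\\d+))?\\.xml$");

    // Holds the two values that would be indexed as keyword fields.
    static final class Versioned {
        final String base;   // path plus name, without "_N" and ".xml"
        final int version;   // 0 when the file name has no "_N" suffix

        Versioned(String base, int version) {
            this.base = base;
            this.version = version;
        }
    }

    // Split "notes/a/default_2.xml" into base "notes/a/default" and version 2.
    static Versioned split(String path) {
        Matcher m = NAME.matcher(path);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a note file: " + path);
        }
        int version = (m.group(2) == null) ? 0 : Integer.parseInt(m.group(2));
        return new Versioned(m.group(1), version);
    }

    // Sort by version descending, then keep only the first path seen per base.
    static List<String> latestOnly(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.comparingInt((String p) -> split(p).version).reversed());
        Set<String> seenBases = new HashSet<>();
        List<String> latest = new ArrayList<>();
        for (String path : sorted) {
            if (seenBases.add(split(path).base)) {
                latest.add(path);
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList(
                "a/default.xml", "a/default_1.xml", "a/default_2.xml", "b/default.xml");
        for (String path : latestOnly(hits)) {
            System.out.println(path);
        }
    }
}
```

In a real index the same filtering would be applied to the sorted hits rather than to raw paths, with the base and version read back from the stored keyword fields.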

On Wed, 17 Nov 2004 15:05:19 -0500, Luke Shannon
<> wrote:
> Hey all;
> I have run into an interesting case.
> Our system has notes. These need to be indexed. They are XML files called default.xml
> and are easily parsed and indexed. No problem, I have been doing it all week.
> The problem is that if someone edits the note, the system doesn't update the default.xml.
> It creates a new file, default_1.xml (every edit creates a new file with an incremented number;
> the system only displays the content from the highest number).
> My problem is that I index all the documents and end up with terms that were taken out of
> a note several versions ago still showing up in the query. From my point of view this makes sense
> because the files are still in the content. But to a user it is confusing because they have
> no idea that every change they make to a note spawns a new file, and now they are seeing a term they
> removed from their note 2 weeks ago showing up in a query.
> I have started modifying my incremental update to look for multiple versions of the
> default.xml, but it is more work than I thought and is going to make things complex.
> Maybe there is an easier way? If I just let it run and create the index, can somebody
> suggest a way I could easily scan the index folder, ensuring only the default.xml with the
> highest number in its filename remains (only for folders where there is more than one default.xml
> file)? Or is this wishful thinking?
> Thanks,
> Luke
