lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Luke Shannon" <lshan...@hypermedia.com>
Subject Re: version documents
Date Wed, 17 Nov 2004 22:16:00 GMT
That is a good idea. Thanks!

----- Original Message ----- 
From: "Justin Swanhart" <greenlion@gmail.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Wednesday, November 17, 2004 3:38 PM
Subject: Re: version documents


> Split the filename into "basefilename" and "version" and make each a
keyword.
>
> Sort your query by version descending, and only use the first
> "basefile" you encounter.
>
> On Wed, 17 Nov 2004 15:05:19 -0500, Luke Shannon
> <lshannon@hypermedia.com> wrote:
> > Hey all;
> >
> > I have ran into an interesting case.
> >
> > Our system has notes. These need to be indexed. They are xml files
called default.xml and are easily parsed and indexed. No problem, have been
doing it all week.
> >
> > The problem is if someone edits the note, the system doesn't update the
default.xml. It creates a new file, default_1.xml (every edit creates a new
file with an incremented number, the sytem only displays the content from
the highest number).
> >
> > My problem is I index all the documents and end up with terms that were
taken out of note several version ago still showing up in the query. From my
point of view this makes sense because the files are still in the content.
But to a user it is confusing because they have no idea every change they
make to a note spans a new file and now the are seeing a term they removed
from their note 2 weeks ago showing up in a query.
> >
> > I have started modifying my incremental update to be look for multiple
version of the default.xml but it is more work than I thought and is going
make things complex.
> >
> > Maybe there is an easier way? If I just let it run and create the index,
can somebody suggest a way I could easily scan the index folder ensuring
only the default.xml with the highest number in its filename remains (only
for folders were there is more than one default.xml file)? Or is this
wishful thinking?
> >
> > Thanks,
> >
> > Luke
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message