lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Polites" <>
Subject Re: updating document
Date Sat, 12 Aug 2006 16:33:58 GMT
This strategy can also be nicely abstracted from your main app.  Whilst I
haven't yet implemented it, my plan is to create a template style structure
which tells me which fields are in lucene, and which are externalized.  This
way I don't bother storing data in lucene that it stored elsewhere, but I
also don't bother storing data elsewhere that is available in lucene.

Of course I always have the source data from which I can do a total re-index
if necessary.  The end result is an abstraction of a "document" which (by
using a template) will automatically retrieve data from the index or from
external storage based on the meta information in the template.  This means
the app that uses the index no longer cares where the data is, and it can
re-index whenever it likes with confidence that everything will be handled

Unfortunately you may have transactional problems with this approach.  That
is, if you write to lucene and in the same logical transaction you write to
external storage you may have atomicity problems if one of these actions
fails.  But you can't have everything!

As long as you can "fix" the index if it gets out of sync you should be ok.

On 8/11/06, Karel Tejnora <> wrote:
> Jason is right. I think, even Im not expert on lucene too, your newly
> added document cann't recreate terms for field with analyzer, because
> field text in empty.
> There is very hairy solution - hack a IndexReader, FieldInfosWriter and
> use addIndexes.
> Lucene is "only" a fulltext search library, not a datastore. I end up
> with the same design as he suggested - At beginning I choose to store
> fields in lucene index,
> because lucene has no limit on fields length like DB2.
> The Jason's strategy is very useful in all cases - *before adding
> document* to lucene index obtain an unique id from sequence (e. g. db)
> add document with
> this synthetic id (e.g. field id and store it within index) and after
> add, save document *with id* (if it has not been saved yet) to xml or
> another storage.
> Well, I haven't stored syn. ids with indexed documents and now I'm about
> to reassign them.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message