lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From <oh...@cox.net>
Subject Re: Seeking guidance for updating indexes
Date Fri, 31 Jul 2009 20:32:32 GMT
Hi,

Phil and Ian,

Thanks for the responses and confirmations about this.  

Assuming that our requirements (as I described earlier) don't change, it looks like this updating/inserting
thing should be pretty easy :)!

Later, and have a great weekend!

Jim



---- Phil Whelan <phil123@gmail.com> wrote: 
> Hi Jim,
> 
> There should not be much difference from the lucene end between a new
> index and index you want to update (add more documents to). As stated
> in the Lucene docs IndexWriter will create the index "if it does not
> already exist".
> 
>    http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/index/IndexWriter.html
>    IndexWriter(Directory d, Analyzer a, IndexWriter.MaxFieldLength mfl)
>           Constructs an IndexWriter for the index in d, first creating
> it if it does not already exist.
> 
> Yes, you can search an index while adding to the index. But you will
> see a snapshot of the index at the time when you opened the searcher.
> You will need to re-open it to see changes that have been added since
> you last opened the searcher.
> 
> Lucene is very tolerant to most things. Just be careful not to have 2
> index writers writing to the same index and you should be ok. Even in
> that situation Lucene will just throw an Exception. I've been playing
> with Lucene for a long time and I've never corrupted an index yet,
> even when I do stupid things.
> 
> Thanks,
> Phil
> 
> On Fri, Jul 31, 2009 at 9:42 AM, <ohaya@cox.net> wrote:
> > Hi,
> >
> > I still am new to Lucene, but I think I have an initial indexer app (based on the
demo IndexFiles app) working, and also have a web app, based on the demo luceneweb web app
working.
> >
> > I'm still busy tweaking both, but am starting to think ahead, about operational
type issues, esp. updating indexes.
> >
> > The situation I have is a little specific.  In particular, once a document is indexed
via Lucene, we will, theoretically, never need to or want to remove that document.  But,
we will have new documents that will need to be added periodically.
> >
> > In other words, I think the terminology would be that we woud just be "inserting"
documents (and updating the Lucene index), never "updating" or "deleting" documents.
> >
> > From some research I've done, it seems like the way to accomplish this would be
to just add the new documents, using Document.add(), as I did with the initial indexer, but
having a new "update" app that makes sure that it is only adding documents that have not been
added previously.
> >
> > Is this correct?
> >
> > Assuming that the above is correct, is it going to be possible to keep the search
web app running while the new update app is doing its job?
> >
> > Are there things that I need to worry about in the update app, such as locking,
etc.?   Note that we would only have a single update app running, i.e., we won't have any
situations where we'd have multiple updates running simultaneously.
> >
> > If so, what are they?
> >
> > Specifically, what I'm looking for is, other than ensuring not to add previously-added
documents, what is different between the original indexer code and the update indexer code?
> >
> > Thanks,
> > Jim
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message