lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject RE: Indexing
Date Wed, 22 Aug 2007 14:46:20 GMT
10 updates per minute is not very much? Why not invalidate your used reader after every commit,
and reopen it? If your index is really big, you might want to reopen it fewer times, but this
is very simple to do (reopen every x updated times)

Also the RAM and FS solution Erick suggests is possible, but this is only really neccessary
for *big* indices. At [1] you can find an example of this RAM/FS methodoly working together
(see VolatileIndex, PersitentIndex and SearchIndex)

Ard

[1] http://svn.apache.org/repos/asf/jackrabbit/trunk/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/query/lucene/


I'm not reindexing the entire index. I'm just commiting the updated. But I'm
not sure how it would affect performance to commit in real time. I think
right now I have like 10 updated per minute.

On 8/22/07, Erick Erickson <erickerickson@gmail.com> wrote:
>
> There are several approaches. First, is your index small
> enough to fit in RAM? You might consider just putting it all in
> RAM and searching that.
>
> A more complex solution would be to keep the increments
> in a separate RAMDir AND your FSDir, search both and
> keep things coordinated. Something like
>
> open FSDIr
> create RAMDir
> while (whatever) {
>    get request
>    if (modification) {
>        write to FSDir and RAMDir
>   }
>    if (search) {
>      search FSDir
>      open RAMDir reader
>      search RAMDir
>      close RAMDir reader (but not writer!)
>   }
> }
>
> close FSDIr
> close RAMDir
> start again from the top.
>
>
>
> Warning: I haven't done this, but it *should* work. The sticky
> part seems to me to be coordinating deletes since the
> open FSDir may contain documents also in the RAMDir,
> but that's "an exercise for the reader"<G>,
>
> You could also define the problem away and just live
> with a 5 minute latency.
>
> Best
> Erick
>
> On 8/22/07, Jonathan Ariel <ionathan@gmail.com> wrote:
> >
> > Hi,
> > I'm new to this list. So first of all Hello to everyone!
> >
> > So right now I have a little issue I would like to discuss with you.
> > Suppose that your are in a really big application where the data in your
> > database is updated really fast. I reindex lucene every 5 min but since
> my
> > application lists everything from lucene there are like 5 minutes (in
> the
> > worse case) where I don't see new staff.
> > What do you think would be the best aproach to this problem?
> >
> > Thanks!
> >
> > Jonathan
> >
>





Mime
View raw message