lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Indexing
Date Wed, 22 Aug 2007 14:43:40 GMT
There's nothing in the "complex" solution that
has anything to do with recreating the index..... The RAMDir
is really the delta between the FSDIr and the updates.

Keeping it all in RAM really consists of reading the FSDIR
into RAM and updating both.

Erick

On 8/22/07, Jonathan Ariel <ionathan@gmail.com> wrote:
>
> I'm not reindexing the entire index. I'm just commiting the updated. But
> I'm
> not sure how it would affect performance to commit in real time. I think
> right now I have like 10 updated per minute.
>
> On 8/22/07, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> > There are several approaches. First, is your index small
> > enough to fit in RAM? You might consider just putting it all in
> > RAM and searching that.
> >
> > A more complex solution would be to keep the increments
> > in a separate RAMDir AND your FSDir, search both and
> > keep things coordinated. Something like
> >
> > open FSDIr
> > create RAMDir
> > while (whatever) {
> >    get request
> >    if (modification) {
> >        write to FSDir and RAMDir
> >   }
> >    if (search) {
> >      search FSDir
> >      open RAMDir reader
> >      search RAMDir
> >      close RAMDir reader (but not writer!)
> >   }
> > }
> >
> > close FSDIr
> > close RAMDir
> > start again from the top.
> >
> >
> >
> > Warning: I haven't done this, but it *should* work. The sticky
> > part seems to me to be coordinating deletes since the
> > open FSDir may contain documents also in the RAMDir,
> > but that's "an exercise for the reader"<G>,
> >
> > You could also define the problem away and just live
> > with a 5 minute latency.
> >
> > Best
> > Erick
> >
> > On 8/22/07, Jonathan Ariel <ionathan@gmail.com> wrote:
> > >
> > > Hi,
> > > I'm new to this list. So first of all Hello to everyone!
> > >
> > > So right now I have a little issue I would like to discuss with you.
> > > Suppose that your are in a really big application where the data in
> your
> > > database is updated really fast. I reindex lucene every 5 min but
> since
> > my
> > > application lists everything from lucene there are like 5 minutes (in
> > the
> > > worse case) where I don't see new staff.
> > > What do you think would be the best aproach to this problem?
> > >
> > > Thanks!
> > >
> > > Jonathan
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message