lucene-dev mailing list archives

From mark harwood <>
Subject Re: Re Indexing
Date Thu, 23 Feb 2006 15:40:53 GMT
The approach I am currently using is (pseudo code):

  countChangedOrNew = select count(*) from docs
      where date_modified > lastIndexRunDate

  if ((countChangedOrNew/reader.numDocs) > 50%)
  {
         //quicker to rebuild the whole index
         select * from docs
         for (each record)
           writer.addDoc(new Doc(record));
  }
  else
  {
         //patch the data

         //first delete any changed docs from the index
         select id from docs where
             date_modified > lastIndexRunDate
         for (each id)
             reader.delete(new Term("dbkey", id));

         //now add docs
         select * from docs where
             date_modified > lastIndexRunDate
         for (each record)
           writer.addDoc(new Doc(record));
  }
  save lastIndexRunDate;
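
The rebuild-or-patch decision in the pseudocode above can be sketched in plain Java. This is only an illustration under stated assumptions: an in-memory Map keyed by "dbkey" stands in for the Lucene index, a List of rows stands in for the docs table, and the names Row, shouldRebuild and refresh are hypothetical. No Lucene or JDBC calls are made. It also shows two details that are easy to miss when translating the pseudocode: the ratio needs floating-point division (integer division in Java would truncate it to 0), and an empty index needs a guard against dividing by zero.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IncrementalReindex {

    /** Hypothetical stand-in for one row of the docs table. */
    public static final class Row {
        public final String id;
        public final String text;
        public final long modified;

        public Row(String id, String text, long modified) {
            this.id = id;
            this.text = text;
            this.modified = modified;
        }
    }

    /** True when the share of changed or new rows exceeds the threshold (e.g. 0.5 for 50%). */
    public static boolean shouldRebuild(long changedOrNew, long indexedDocs, double threshold) {
        if (indexedDocs == 0) {
            return true; // nothing indexed yet, so build from scratch
        }
        // Cast to double: long / long would truncate the ratio to 0.
        return (double) changedOrNew / indexedDocs > threshold;
    }

    /** Brings the in-memory "index" up to date and reports which path was taken. */
    public static String refresh(Map<String, String> index, List<Row> table,
                                 long lastRun, double threshold) {
        // Equivalent of: select ... from docs where date_modified > lastIndexRunDate
        List<Row> changed = new ArrayList<>();
        for (Row r : table) {
            if (r.modified > lastRun) {
                changed.add(r);
            }
        }

        if (shouldRebuild(changed.size(), index.size(), threshold)) {
            // Quicker to rebuild the whole index.
            index.clear();
            for (Row r : table) {
                index.put(r.id, r.text);
            }
            return "rebuild";
        }

        // Patch: first delete the stale entries by key, then add the fresh ones.
        for (Row r : changed) {
            index.remove(r.id);
        }
        for (Row r : changed) {
            index.put(r.id, r.text);
        }
        return "patch";
    }

    public static void main(String[] args) {
        Map<String, String> index = new LinkedHashMap<>();
        for (int i = 1; i <= 4; i++) {
            index.put("" + i, "old text " + i);
        }

        List<Row> table = new ArrayList<>();
        table.add(new Row("1", "old text 1", 50));
        table.add(new Row("2", "new text 2", 150)); // modified after the last run
        table.add(new Row("3", "old text 3", 50));
        table.add(new Row("4", "old text 4", 50));

        String mode = refresh(index, table, 100, 0.5);
        System.out.println(mode + " -> " + index);
    }
}
```

With one changed row out of four indexed, the ratio is 0.25, so the sketch takes the patch path. The delete and add passes are kept separate here only to mirror the pseudocode, where deletes go through the reader and adds through the writer.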

We've found there are database-specific JDBC streaming
settings that help when reading huge volumes of

--- N <> wrote:

> Hi,
> I am indexing database tables that hold a huge amount
> of data with Lucene. Do I need to reindex the whole
> table(s) as changes are made to keep the search up to
> date? It is time consuming to create a new index from
> scratch every time the data in the tables is modified.
> Can anybody suggest a more efficient method or
> workaround?
> Thanks in advance,
> Noon