lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Indexing in pieces?
Date Mon, 03 Sep 2007 20:14:08 GMT
See below.

On 8/31/07, Berlin Brown <berlin.brown@gmail.com> wrote:
>
> So I am assuming that is not just a matter of "indexing" to that same
> directory as you "indexed" before.


No, that's all it is. When you open an index for writing, there
is a flag indicating "overwrite or append". So if you can select
just the records from your database that aren't already in your
index, you can easily add the new ones. This assumes that each
message is a Lucene document.
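
To make the "select only what isn't indexed yet" step concrete, here is a
minimal sketch in plain Java. It is not Lucene code: the class, its fields,
and indexNextBatch are all hypothetical names, the TreeMap stands in for the
forum's message table, and the List stands in for an index opened in append
mode. It assumes message ids only ever increase, so remembering the highest
id already indexed is enough to find the new records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from Lucene or from the thread.
class IncrementalIndexSketch {
    // Stand-in for the forum's message table, keyed by ascending message id.
    final TreeMap<Integer, String> database = new TreeMap<>();
    // Stand-in for the existing index; add() plays the role of appending a document.
    final List<String> index = new ArrayList<>();
    // Highest message id already indexed; a real setup would persist this between runs.
    int lastIndexedId = 0;

    // Appends up to batchSize not-yet-indexed messages and returns how many were added.
    int indexNextBatch(int batchSize) {
        int added = 0;
        // tailMap(key, false) yields only ids strictly greater than lastIndexedId.
        for (Map.Entry<Integer, String> e : database.tailMap(lastIndexedId, false).entrySet()) {
            if (added == batchSize) {
                break;
            }
            index.add(e.getValue());
            lastIndexedId = e.getKey();
            added++;
        }
        return added;
    }

    public static void main(String[] args) {
        IncrementalIndexSketch s = new IncrementalIndexSketch();
        for (int id = 1; id <= 250; id++) {
            s.database.put(id, "message " + id);
        }
        // Work through the backlog in small batches instead of one long run.
        while (s.indexNextBatch(100) > 0) {
            System.out.println("indexed so far: " + s.index.size());
        }
        // A later run picks up only what arrived since the previous one.
        s.database.put(251, "message 251");
        System.out.println("new this run: " + s.indexNextBatch(100));
    }
}
```

With a real database the tailMap call would presumably become a query along
the lines of "WHERE id > ?" with a row limit, and index.add would become a
document add on a writer opened with the append flag described above.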


> So, based on what you are saying, you would have to reload the
> previous index (e.g., INDEX_DIR_OLD) and then index the new content.
> When I say "index", I am talking about actually invoking Lucene to
> merge the content.
>
> For example, it isn't just a matter of indexing to index_dir_old and
> then to index_dir_new and then copying the Lucene index files into
> another directory index_dir_cur.


You don't need to do any of that copying. Just open the current
index and append more records. You can still search the index
even as you are adding new documents, although you'll have to
close and reopen your *reader* to see the new content.


> On 8/31/07, Chris Lu <chris.lu@gmail.com> wrote:
> > I think you can simply change your SQL to select only the recently
> > updated messages, and add them to your existing index. Although adding
> > to an existing large index also takes a long time, it should be quicker
> > than rebuilding the whole index.
> >
> > If your index continues to grow, you may need to have a dedicated
> > server for indexing and searching.
> >
> > --
> > Chris Lu
> > -------------------------
> > Instant Scalable Full-Text Search On Any Database/Application
> > site: http://www.dbsight.net
> > demo: http://search.dbsight.com
> > Lucene Database Search in 3 minutes:
> > http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
>
> >
> > On 8/31/07, bbrown <bbrown@botspiritcompany.com> wrote:
> > >
> > > I have been fine going from my database (a discussion forum) to
> > > Lucene. I am taking the simplest approach: the discussion forum is
> > > just text messages, so I take those out of the database and then
> > > index the content.
> > >
> > > I am having trouble because I have hundreds of thousands of messages
> > > and it takes a while, eating my server CPU. I was thinking I would
> > > just index a portion of the database at a time. For example, index
> > > records 1-100 and then 101-200. Can I just index to that index
> > > directory without deleting the existing index segment files that are
> > > already there? Or is it more complicated than that?
> > >
> > > --
> > > Berlin Brown
> > > [berlin dot brown at gmail dot com]
> > > http://botspiritcompany.com/botlist/?
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
>
>
> --
> Berlin Brown
> http://www.newspiritcompany.com - newspirit technologies
