lucene-java-user mailing list archives

From Jong Kim <jong.luc...@gmail.com>
Subject Re: Re-indexing a particular field only without re-indexing the entire enclosing document in the index
Date Mon, 23 Apr 2012 20:25:59 GMT
Thanks for the reply.

Our metadata is not stored in a single field, but is rather a collection of
fields. So, it requires a boolean search that spans multiple fields. My
understanding is that it is not possible to iterate over the matching
documents efficiently using termDocs() when the search involves multiple
terms and/or multiple fields, right?
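
To make the question concrete: a single termDocs() enumeration covers one term in one field, so a boolean AND across several fields amounts to intersecting several sorted doc-id lists. Below is a minimal plain-Java sketch of that intersection, assuming each list has already been read from a per-term enumeration. The class and method names (PostingIntersection, intersect) are hypothetical and are not Lucene API, and this is not a claim about how Lucene implements boolean queries internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lucene: AND-intersects sorted doc-id lists. */
final class PostingIntersection {

    /**
     * Returns the doc ids that appear in every list.
     * Assumption: each list is sorted ascending without duplicates,
     * as doc ids from a per-term enumeration would be.
     */
    static int[] intersect(int[]... postings) {
        if (postings.length == 0) {
            return new int[0];
        }
        int[] pos = new int[postings.length]; // one cursor per list
        List<Integer> hits = new ArrayList<>();

        outer:
        while (true) {
            // The largest current head is the smallest id that could still match.
            int candidate = Integer.MIN_VALUE;
            for (int i = 0; i < postings.length; i++) {
                if (pos[i] >= postings[i].length) {
                    break outer; // one list is exhausted, no more matches
                }
                candidate = Math.max(candidate, postings[i][pos[i]]);
            }

            // Advance every cursor to the first id >= candidate.
            boolean allMatch = true;
            for (int i = 0; i < postings.length; i++) {
                while (pos[i] < postings[i].length && postings[i][pos[i]] < candidate) {
                    pos[i]++;
                }
                if (pos[i] == postings[i].length) {
                    break outer;
                }
                if (postings[i][pos[i]] != candidate) {
                    allMatch = false;
                }
            }

            if (allMatch) {
                hits.add(candidate);
                for (int i = 0; i < postings.length; i++) {
                    pos[i]++; // step past the match in every list
                }
            }
        }

        int[] result = new int[hits.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = hits.get(i);
        }
        return result;
    }
}
```

Each cursor only moves forward, so the cost is bounded by the combined length of the lists. That is the concern here: with large per-field lists and mostly disjoint data, most of the work is skipping non-matches.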

/Jong

On Mon, Apr 23, 2012 at 11:58 AM, Earl Hood <earl@earlhood.com> wrote:

> On Mon, Apr 23, 2012 at 10:31 AM, Jong Kim wrote:
>
> > Is there any good way to solve this design problem? Obviously, an
> > alternative design would be to split the index into two, and maintain
> > static (and large) data in one index and the other dynamic part in the
> > other index. However, this approach is not acceptable due to our data
> > pattern where the match on the first index yields a very large result set,
> > and filtering it against the second index is very inefficient due to the
> > high ratio of disjoint data. In other words, while the alternate approach
> > significantly reduces the indexing-time overhead, the resulting search is
> > unacceptably expensive.
>
> Have you tested to verify it is expensive?  If the meta document is
> identified with a unique ID (that can be stored with the main document
> so you know which meta document to retrieve), accessing the meta
> document should be fairly efficient.
>
> In the project I'm on (we are using Lucene 3.0.3), we just use
> IndexReader.termDocs() to retrieve a document based on a unique ID we
> store in one of the document's fields.
>
> --ewh
