lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Extracting a subset of an index
Date Tue, 03 Apr 2007 15:24:39 GMT
In the immortal words of Erik H.  ...it depends...

The big issue is whether your index has fields that are NOT
stored (i.e. Field.Store.NO). If it does, the documents you get
back will not be complete, and adding them to the fresh index will
not carry over the un-stored data.

It's actually pretty common to index a field without storing it, as in
...Field.Store.NO, Field.Index.(UN_)TOKENIZED

Field.Store.COMPRESSED should be OK, since those fields are still stored.

From the Document API doc:

"note that fields which are *not* stored are *not* available in
documents retrieved from the index, e.g. with Hits.doc(int),
Searcher.doc(int) or IndexReader.document(int)."

Otherwise, it would *probably* work, but I haven't tried it. At worst,
you could create a new document and add the fields from the old
document to it.
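
To make the stored/un-stored caveat concrete, here is a toy model in plain
Java. It is deliberately NOT the Lucene API: ToyIndex, ToyField and
copyViaStoredFields are made-up names, and the "analyzer" is just a
lowercase-and-split-on-whitespace assumption. It only sketches the idea that
a retrieved document carries stored values and nothing else, so a copy built
from retrieved documents cannot be searched on fields that were indexed but
not stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only, not Lucene. Every name in this block is hypothetical.
class ToyField {
    final String name;
    final String value;
    final boolean stored;   // plays the role of Field.Store.YES vs. NO
    final boolean indexed;  // plays the role of an indexed (searchable) field

    ToyField(String name, String value, boolean stored, boolean indexed) {
        this.name = name;
        this.value = value;
        this.stored = stored;
        this.indexed = indexed;
    }
}

class ToyIndex {
    // Stored values per document: all that a doc(i)-style call can return.
    private final List<Map<String, String>> stored = new ArrayList<>();
    // Inverted index: "field:term" -> ids of documents containing the term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    int addDocument(List<ToyField> fields) {
        int id = stored.size();
        Map<String, String> kept = new LinkedHashMap<>();
        for (ToyField f : fields) {
            if (f.stored) {
                kept.put(f.name, f.value);
            }
            if (f.indexed) {
                for (String term : f.value.toLowerCase().split("\\s+")) {
                    postings.computeIfAbsent(f.name + ":" + term,
                            k -> new HashSet<>()).add(id);
                }
            }
        }
        stored.add(kept);
        return id;
    }

    Map<String, String> doc(int id) {
        return stored.get(id);
    }

    Set<Integer> search(String field, String term) {
        Set<Integer> hits = postings.get(field + ":" + term.toLowerCase());
        return hits == null ? Collections.<Integer>emptySet() : hits;
    }

    int size() {
        return stored.size();
    }
}

class StoredFieldDemo {
    // Rebuilds an index from retrieved documents, i.e. from stored values only.
    static ToyIndex copyViaStoredFields(ToyIndex src) {
        ToyIndex dst = new ToyIndex();
        for (int i = 0; i < src.size(); i++) {
            List<ToyField> rebuilt = new ArrayList<>();
            for (Map.Entry<String, String> e : src.doc(i).entrySet()) {
                rebuilt.add(new ToyField(e.getKey(), e.getValue(), true, true));
            }
            dst.addDocument(rebuilt);
        }
        return dst;
    }

    public static void main(String[] args) {
        ToyIndex src = new ToyIndex();
        src.addDocument(Arrays.asList(
                new ToyField("title", "Lucene in Action", true, true),
                new ToyField("body", "searching subsets of an index", false, true)));

        ToyIndex dst = copyViaStoredFields(src);
        System.out.println(src.search("body", "subsets"));  // prints [0]
        System.out.println(dst.search("title", "lucene"));  // prints [0]
        System.out.println(dst.search("body", "subsets"));  // prints []
    }
}
```

The stored "title" field survives the copy and stays searchable, while the
un-stored "body" field is gone from the new index. Against a real index the
same loop would presumably go through IndexReader.document(int) and
IndexWriter.addDocument, but as I said, I haven't tried that.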

Best
Erick

On 4/3/07, jafarim <jafarim@gmail.com> wrote:
>
> Hi folks,
> I need to extract a subset of an index so that I can move some documents to
> another isolated machine to be searched locally. I'm not sure whether the
> following scenario is correct:
> - extracting the documents from the index by using one of the doc(i) methods
> - adding the same Document objects to a fresh index.
>
> Am I right?
>
> --Jaf
>
