lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: How to properly use updatedocument in lucene.
Date Thu, 31 Jan 2013 20:48:34 GMT
On Thu, Jan 31, 2013 at 7:56 AM, Trejkaz <trejkaz@trypticon.org> wrote:
> On Thu, Jan 31, 2013 at 11:05 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>> It's confusing, but you should never try to re-index a document you
>> retrieved from a searcher, because certain index-time details (eg,
>> whether a field was tokenized) are not preserved in the stored
>> document.
>>
>> Instead, you should re-build the document yourself, setting the right
>> details per-Field, and then re-index that.
>
> Just to check about this - if you use the same PerFieldAnalyzerWrapper
> when indexing the new document, doing it this way is safe, right?
> (With the exception of fields you originally indexed which weren't
> stored at all, of course... which would have to be reconstructed from
> some other location.)

No, it's not safe: the loaded Document will have lost details about
even the stored fields, such as whether they were Tokenized or not,
their boost, whether term vectors were indexed, etc.
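
To make the trap concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: ReindexSketch, IndexField, buildFromSource, loadStored and rebuildFromStored are all hypothetical names, and the "tokenized by default" guess in rebuildFromStored is an assumption of the model. It only shows that a round trip through stored values cannot recover an index-time flag, while rebuilding from the application's own data can.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model only: none of these types are Lucene classes.
final class ReindexSketch {

    /** A field as the application hands it to the indexer: value plus index-time options. */
    static final class IndexField {
        final String name;
        final String value;
        final boolean tokenized; // index-time detail, not kept with the stored value
        final boolean stored;

        IndexField(String name, String value, boolean tokenized, boolean stored) {
            this.name = name;
            this.value = value;
            this.tokenized = tokenized;
            this.stored = stored;
        }
    }

    /** Safe path: build the fields from application data, stating each option explicitly. */
    static List<IndexField> buildFromSource(String id, String body) {
        List<IndexField> fields = new ArrayList<>();
        fields.add(new IndexField("id", id, false, true));    // exact-match key, not tokenized
        fields.add(new IndexField("body", body, true, true)); // full text, tokenized
        return fields;
    }

    /** What loading a hit returns in this model: names and values of stored fields only. */
    static Map<String, String> loadStored(List<IndexField> indexed) {
        Map<String, String> storedValues = new LinkedHashMap<>();
        for (IndexField f : indexed) {
            if (f.stored) {
                storedValues.put(f.name, f.value);
            }
        }
        return storedValues;
    }

    /** The trap: rebuild from stored values. The options must be guessed; this model guesses "tokenized". */
    static List<IndexField> rebuildFromStored(Map<String, String> storedValues) {
        List<IndexField> fields = new ArrayList<>();
        for (Map.Entry<String, String> e : storedValues.entrySet()) {
            fields.add(new IndexField(e.getKey(), e.getValue(), true, true));
        }
        return fields;
    }

    public static void main(String[] args) {
        List<IndexField> original = buildFromSource("doc-1", "some body text");
        List<IndexField> naive = rebuildFromStored(loadStored(original));
        for (int i = 0; i < original.size(); i++) {
            System.out.println(original.get(i).name
                    + ": tokenized originally=" + original.get(i).tokenized
                    + ", after round trip=" + naive.get(i).tokenized);
        }
    }
}
```

In real Lucene 4.x code the safe path would roughly correspond to creating a fresh Document from your own data, adding fields such as StringField for the key and TextField for the body, and passing that to IndexWriter.updateDocument with a Term on the key field.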

In Lucene 5.0 we've changed the loaded document to have a new type
(StoredDocument) to prevent this trap ...

Mike McCandless

http://blog.mikemccandless.com


