lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: update/re-add an existing document with numeric fields
Date Thu, 10 May 2012 09:58:30 GMT
This is actually due to a bug:

    https://issues.apache.org/jira/browse/LUCENE-3065

which was fixed in 3.2.  The bug is that, prior to Lucene 3.2, a
stored NumericField comes back as an ordinary Field (no longer
numeric) when you later load the document, so if you then re-index
that retrieved document the field loses its numeric-ness.

That said, retrieving a doc and reindexing it is dangerous because in
general Lucene does not ensure all details are preserved.  For
example, the boost is never returned correctly, and whether a field
was indexed and whether term vectors were indexed are not preserved
either.  So in general you shouldn't assume you can just load a
document, modify it a bit, re-index it, and not lose something...
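
To make the point concrete, here is a toy model in plain Java.  It is
not Lucene code: ReindexSketch, ToyField, loadStored and rebuild are
made-up names, and the "index" is just a map.  It assumes the
application knows each field's intended type, reads only the stored
values from the loaded document, and builds a fresh document for the
update.  In real code the numeric field would be recreated with
NumericField and setIntValue, as in the original example quoted
below.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, NOT Lucene API: every name in this class is hypothetical.
final class ReindexSketch {

    enum Kind { TEXT, NUMERIC }

    // A field as the application builds it: a value plus how to index it.
    static final class ToyField {
        final String value;
        final Kind kind;

        ToyField(String value, Kind kind) {
            this.value = value;
            this.kind = kind;
        }
    }

    // Mimics the pre-3.2 behaviour described above: loading a stored
    // document returns every field as plain text, so NUMERIC is lost.
    static Map<String, ToyField> loadStored(Map<String, ToyField> indexed) {
        Map<String, ToyField> loaded = new HashMap<>();
        for (Map.Entry<String, ToyField> e : indexed.entrySet()) {
            loaded.put(e.getKey(), new ToyField(e.getValue().value, Kind.TEXT));
        }
        return loaded;
    }

    // Safer update: take only the values from the loaded document and
    // re-declare each field's type explicitly in a fresh document.
    static Map<String, ToyField> rebuild(Map<String, ToyField> loaded,
                                         String newString) {
        Map<String, ToyField> fresh = new HashMap<>();
        fresh.put("string", new ToyField(newString, Kind.TEXT));
        int n = Integer.parseInt(loaded.get("numeric").value);
        fresh.put("numeric", new ToyField(Integer.toString(n), Kind.NUMERIC));
        return fresh;
    }

    public static void main(String[] args) {
        Map<String, ToyField> original = new HashMap<>();
        original.put("string", new ToyField("value", Kind.TEXT));
        original.put("numeric", new ToyField("42", Kind.NUMERIC));

        Map<String, ToyField> loaded = loadStored(original);
        System.out.println(loaded.get("numeric").kind);  // prints TEXT

        Map<String, ToyField> fresh = rebuild(loaded, "value2");
        System.out.println(fresh.get("numeric").kind);   // prints NUMERIC
    }
}
```

Re-adding the loaded map as-is would index "numeric" as text, which
is the failure in the quoted code; the rebuilt map restores the type
because the application, not the index, supplies it.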

Mike McCandless

http://blog.mikemccandless.com

On Wed, May 9, 2012 at 1:33 PM, Tim Eck <timeck@gmail.com> wrote:
> Note: I'm bound to lucene 3.0.3 for the context of this question, but
> I would be interested to know if newer versions would help me here.
>
> I have an existing document in my directory that has one regular
> String field and one numeric field. I naively thought I could update
> that document to change the String field with code like this:
>
>  FSDirectory dir = FSDirectory.open(...);
>  IndexWriter writer = new IndexWriter(dir, new
>      StandardAnalyzer(Version.LUCENE_30), MaxFieldLength.UNLIMITED);
>
>  // doc has 2 fields, one String and the other numeric
>  Document doc = new Document();
>  doc.add(new Field("string", "value", Store.YES,
>      Index.ANALYZED_NO_NORMS));
>  NumericField nf = new NumericField("numeric", Field.Store.YES,
>      true);
>  nf.setIntValue(42);
>  doc.add(nf);
>  writer.addDocument(doc);
>  writer.commit();
>
>  // make sure we can query on the numeric field
>  IndexSearcher searcher = new IndexSearcher(dir);
>  TopDocs docs = searcher.search(new TermQuery(new Term("numeric",
>      NumericUtils.intToPrefixCoded(42))), 1);
>  if (docs.totalHits != 1) {
>      throw new AssertionError();
>  }
>  doc = searcher.doc(docs.scoreDocs[0].doc);
>  searcher.close();
>
>  // update document with new value for string field
>  doc.removeField("string");
>  doc.add(new Field("string", "value2", Store.YES,
>      Index.ANALYZED_NO_NORMS));
>  writer.updateDocument(new Term("string", "value"), doc);
>  writer.commit();
>
>  // search again
>  searcher = new IndexSearcher(dir);
>  docs = searcher.search(new TermQuery(new Term("numeric",
>      NumericUtils.intToPrefixCoded(42))), 1);
>  if (docs.totalHits != 1) {
>      throw new AssertionError(docs.totalHits);
>  }
>
>
> That doesn't seem to work however. It seems I need to get the
> NumericField rematerialized in the document passed to
> updateDocument(). I was hoping to avoid that if possible so
> I'm looking for any suggestions someone might offer.
