lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bruno Dery" <>
Subject RE: Document boost, is it working?
Date Wed, 31 Oct 2007 16:27:38 GMT
Thanks! I also noticed there is a mention of this in the documentation
of Document.getBoost():

"Note: This value is not stored directly with the document in the index.
Documents returned from IndexReader.document(int) and Hits.doc(int) may
thus not have the same value present as when this document was indexed."

Then maybe it should simply be removed from Luke's display as you

-----Original Message-----
From: Andrzej Bialecki [] 
Sent: Wednesday, October 31, 2007 4:13 AM
Subject: Re: Document boost, is it working?

Bruno Dery wrote:
> Thanks for the help, you're right your example works. However looking 
> in Luke I also see only ones (1 1 1) as the document boost.

Then perhaps this value should be removed from the Luke's display ... 
because it will always read 1, and it's a correct value (see below).

> I imagine Luke use's Lucene's Document.getBoost() function. Shouldn't 
> this be considered a bug, as I'd expect to retrieve the same boost 
> number (or at least some factor of it) when I look at my documents 
> once indexed? Or perhaps there is something I'm not understanding 
> about this Document.getBoost function, normally you expect a getter to

> return you the exact value you entered with the setter.

Document.setBoost / getBoost is _only_ guaranteed to return the same 
value if a document is newly created, i.e. it hasn't been stored & 
retrieved. This is related to the way Lucene stores boost values in the 
index - in order to save storage space the boost value is never stored 
explicitly, instead during the storing process it's multiplied by every 
field's lengthNorm, and stored together as a single-byte float.

(Actually, it's a function of the current index format - perhaps in the 
future Lucene will be able to store these values separately using 
another type of storage. So far there was no pressing need to do this).

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web ___|||__||
\|  ||  |  Embedded Unix, System Integration
Contact: info at sigram dot com

To unsubscribe, e-mail:
For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message