lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Very high fieldNorm for a field resulting in bad results
Date Wed, 27 Sep 2006 18:41:12 GMT

: 1. Can I do away with index-time boosting for fields & tweak
: query-time boosting for them? I understand that doc-level boosting is
: very useful while indexing.
: But for fields, both index-boost & query-boost are multipliers that feed
: into the score, so would it be safe to say that I can replace the
: index-time boost with query-time boosting? This allows me a lot of
: freedom to test different values without re-indexing, which takes me
: about 6 hours.

it depends on your goal.  index-time field boosts are a way to express
things like "this document's title is worth twice as much as the title of
most documents."  query-time boosts are a way to express "i care about
matches on this clause of my query twice as much as i do about matches on
other clauses of my query."  so a query-time boost can stand in for an
index-time boost only when every document was given the same boost for
that field.
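
to make the distinction concrete, here is a rough sketch in plain Java.  it is a deliberately simplified model, not Lucene's actual Similarity: it ignores queryNorm, coord, and the lossy encoding of norms, and the class and method names (BoostSketch, score) are made up for illustration.  it only shows that a query-time boost scales every document's score for a clause by the same factor, while an index-time field boost can differ from document to document.

```java
// Simplified model of how the two kinds of boost multiply into a score.
// Assumption: score = tfIdf * (indexBoost * lengthNorm) * queryBoost.
// Hypothetical names; this is not Lucene's API.
final class BoostSketch {

    static double score(double tfIdf, double lengthNorm,
                        double indexBoost, double queryBoost) {
        // The field norm is fixed when the document is indexed.
        double fieldNorm = indexBoost * lengthNorm;
        // The query boost is chosen at search time and is the same for every document.
        return tfIdf * fieldNorm * queryBoost;
    }

    public static void main(String[] args) {
        // Document A's title was indexed with boost 2.0, document B's with 1.0.
        double a1 = score(1.0, 0.5, 2.0, 1.0);
        double b1 = score(1.0, 0.5, 1.0, 1.0);
        // Raising the query boost to 3.0 scales both scores equally.
        double a3 = score(1.0, 0.5, 2.0, 3.0);
        double b3 = score(1.0, 0.5, 1.0, 3.0);
        System.out.println("A/B with query boost 1.0: " + (a1 / b1));
        System.out.println("A/B with query boost 3.0: " + (a3 / b3));
    }
}
```

in this model the ratio between A and B comes entirely from the index-time boost; changing the query boost leaves it untouched.  if every document had been indexed with the same field boost, that constant could be folded into the query boost without re-indexing.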

: 2. When searching through the archive I read a post by you saying
: it's possible to give exact matches much higher weight by indexing
: the START & END
: from :

the context was that even if you turn off field norms you can still get
some of the score benefits/restrictions of matches on shorter fields vs
longer fields by indexing marker tokens (things you wouldn't expect to be
regular tokens; i used START and END just for convenience) at the
beginning and end of the field, and then including them in your phrase or
span near query with lots of slop ... so values like...

	Duke Ellington
	The Duke Ellington Band

get indexed as the tokens...

	{START} {duke} {ellington} {END}
	{START} {the} {duke} {ellington} {band} {END}

...when doing a sloppy phrase or span near search for [ START, duke,
ellington, END ] both of those values will match, but the first will have
a higher score.
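
the reason the first value wins can be sketched in plain Java.  this is an illustrative model, not Lucene code: the names (MarkerTokenSketch, tokenize, phraseDistance, sloppyFreq) are hypothetical, the analyzer is assumed to be a lowercasing whitespace splitter, each query term is assumed to occur once in the field, and the distance calculation is a simplification of what a sloppy phrase scorer does.  the 1/(distance + 1) factor mirrors the shape of DefaultSimilarity's sloppyFreq.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative model of marker tokens plus a sloppy phrase match.
// Hypothetical names; not Lucene's API.
final class MarkerTokenSketch {

    // Wrap the field's tokens in START/END markers (assumed whitespace analyzer).
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        tokens.add("START");
        for (String t : value.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        tokens.add("END");
        return tokens;
    }

    // Simplified slop distance: spread of (field position - phrase position).
    // Returns -1 if any phrase term is missing from the field.
    static int phraseDistance(List<String> tokens, List<String> phrase) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < phrase.size(); i++) {
            int pos = tokens.indexOf(phrase.get(i));
            if (pos < 0) {
                return -1;
            }
            int offset = pos - i;
            min = Math.min(min, offset);
            max = Math.max(max, offset);
        }
        return max - min;
    }

    // Closer matches contribute more: 1 / (distance + 1).
    static double sloppyFreq(int distance) {
        return 1.0 / (distance + 1);
    }

    public static void main(String[] args) {
        List<String> phrase = Arrays.asList("START", "duke", "ellington", "END");
        for (String value : new String[] {"Duke Ellington", "The Duke Ellington Band"}) {
            int d = phraseDistance(tokenize(value), phrase);
            System.out.println(value + ": distance=" + d + " sloppyFreq=" + sloppyFreq(d));
        }
    }
}
```

under these assumptions "Duke Ellington" matches with distance 0 and "The Duke Ellington Band" with distance 2, so the exact value gets the larger phrase frequency.  the slop on the real query just has to be big enough to cover the longest field you still want to match.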

