lucene-solr-user mailing list archives

From Rob Brown <>
Subject Re: negative boosts for docs with common field value
Date Tue, 11 Oct 2011 22:56:07 GMT
The setup in this question was a simplification of the actual environment;
we're not actually demoting popular authors.

Perhaps index-time (negative) boosts are indeed the only way.


Web Design and Online Marketing

-----Original Message-----
From: Chris Hostetter <>
Subject: Re: negative boosts for docs with common field value
Date: Tue, 11 Oct 2011 15:37:03 -0700 (PDT)

: Some searches will obviously be saturated by docs from any given author if
: they've simply written more.
: I'd like to give a negative boost to these matches, thereby making sure that
: one author doesn't saturate the results just because they've written 500
: documents, compared to others who may have only written 2-3 documents.
: The actual author value doesn't matter, I just want to bring down the score of
: docs by any common author to give more varied results.
: What's the easiest approach for this, and is it even possible at query time?
: I could do this at index time but would prefer a Solr solution.

w/o a custom plugin, the only way i know of to do something like this 
would be to index a numeric "author_prolificness" field in each doc and 
use that as the basis of a function query.
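
a rough sketch of that idea, in Python, might look like the following. it 
only counts documents per author before indexing and builds the request 
parameters; it does not talk to Solr. the sample docs, the "title text" 
field list, and the choice of dismax with an additive bf using 
recip(x,m,a,b) = a/(m*x+b) are all assumptions made for illustration, not 
something taken from the original setup.

```python
from collections import Counter
from urllib.parse import urlencode


def add_prolificness(docs):
    """Return copies of docs with an author_prolificness count added.

    The count is simply how many docs in this batch share the author.
    """
    counts = Counter(doc["author"] for doc in docs)
    return [dict(doc, author_prolificness=counts[doc["author"]]) for doc in docs]


def recip(x, m, a, b):
    """Python mirror of Solr's recip(x,m,a,b) function: a / (m*x + b)."""
    return a / (m * x + b)


def build_query_params(user_query):
    """Build request parameters for a dismax query with a function boost.

    Assumption: the dismax parser is used and the boost is additive (bf).
    The qf field names are hypothetical.
    """
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": "title text",
        # Docs by less prolific authors get a larger additive bump.
        "bf": "recip(author_prolificness,1,1,1)",
    }


# Hypothetical sample data, for illustration only.
docs = [
    {"id": "1", "author": "A", "title": "first"},
    {"id": "2", "author": "A", "title": "second"},
    {"id": "3", "author": "B", "title": "third"},
]

enriched = add_prolificness(docs)
query_string = urlencode(build_query_params("some user query"))
```

the counts go stale as new documents arrive, so every doc by an author 
would need reindexing whenever that author's total changes.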

but honestly: i *really* don't think you want to do this - not if you are 
dealing with real user queries (maybe if this is for some synthetically 
generated "related documents" or "interesting documents" query)

Imagine a user is searching for a *very* specific title (e.g.: "Nightfall") 
by a very prolific author ("Isaac Asimov").  What you're describing would 
penalize the desired match just because the author is prolific -- even if 
the user types in the exact title of a document, so that some much more 
esoteric document with the same title by an author who has written nothing 
else ("Stephen Leather") would likely score higher.

I mean: if someone types in "Romeo and Juliet" do you really want to score 
documents by "Shakespeare" lower than documents by "Stanley W. Wells" just 
because Wells has written fewer total books?

