Query-independent means that the threshold should carry the same meaning for all queries, and
found docs scoring below it are discarded. The current scoring implementation gives no guarantee that,
say, two documents found by two different queries, each with a score of 0.5, are of the same quality.
I don't want to keep docs out of the index, no. But I want to be sure that two docs with
the same score in two different queries have the same quality (they contain the same set of
found terms, have the same length, etc.).
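
To make the idea concrete, here is a rough plain-Java sketch of the kind of comparable measure I mean. It is not Lucene's scoring and uses no Lucene classes; the class and method names (MatchedFraction, matchedFraction) are made up for illustration, and the measure (share of distinct query terms found in the doc) is only one assumed candidate.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MatchedFraction {
    // Hypothetical measure: share of distinct query terms that occur in the document.
    // The value means the same thing for every query: 1.0 = all terms found.
    static double matchedFraction(Set<String> queryTerms, Set<String> docTerms) {
        if (queryTerms.isEmpty()) {
            return 0.0; // nothing to match against
        }
        Set<String> found = new HashSet<>(queryTerms);
        found.retainAll(docTerms); // keep only the query terms present in the doc
        return (double) found.size() / queryTerms.size();
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("lucene", "scoring", "threshold"));
        Set<String> q1 = new HashSet<>(Arrays.asList("lucene", "scoring"));
        Set<String> q2 = new HashSet<>(Arrays.asList("lucene", "ranking", "bm25", "scoring"));
        System.out.println(matchedFraction(q1, doc)); // 2 of 2 terms found
        System.out.println(matchedFraction(q2, doc)); // 2 of 4 terms found
    }
}
```

With a measure like this, a cut-off of, say, 0.5 would mean "at least half of the query terms were found" regardless of which query produced the hit.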
Alexander
-----Original Message-----
From: Andrzej Bialecki <ab@getopt.org>
To: java-dev@lucene.apache.org
Date: Thu, 07 Aug 2008 22:44:46 +0200
Subject: Re: lucene scoring
Александр Аристов wrote:
> I want to implement searching with the ability to set a so-called confidence
> level below which I would treat documents as garbage. I cannot define
> the level per query, as the level should be relevant for all
> documents.
Hmm .. I'm not sure if I understand it properly - if the level is
query-independent, then it's a constant factor, which you can put in a
field during the index creation - and then you could use a Filter or
FunctionQuery to exclude documents with this factor below the threshold.
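
A minimal plain-Java sketch of that idea, in case it helps. It deliberately uses no Lucene classes: QualityThreshold, Doc, quality and keepAtOrAbove are hypothetical names, and the stored per-document factor is assumed to have been computed once at index time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class QualityThreshold {
    // Stand-in for an indexed document carrying a stored, query-independent factor.
    static final class Doc {
        final String id;
        final float quality; // hypothetical value written at index time

        Doc(String id, float quality) {
            this.id = id;
            this.quality = quality;
        }
    }

    // Keep only the hits whose stored factor reaches the threshold.
    static List<String> keepAtOrAbove(List<Doc> hits, float threshold) {
        List<String> kept = new ArrayList<>();
        for (Doc d : hits) {
            if (d.quality >= threshold) {
                kept.add(d.id);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Doc> hits = Arrays.asList(
                new Doc("a", 0.9f), new Doc("b", 0.2f), new Doc("c", 0.5f));
        System.out.println(keepAtOrAbove(hits, 0.5f)); // prints [a, c]
    }
}
```

Because the factor does not depend on the query, the same threshold can be applied to the results of any search.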
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org