lucene-java-user mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: best way to interest two queries?
Date Wed, 12 May 2010 08:55:14 GMT


>>two terminology questions:

>>- is multiplier in the mail mentioned there the same as boost?

This factor controls how many decimal places of precision are retained in the adjusted scores.
Pick too low a multiplier and scores that differ only by a very small value will
appear equal. Pick too high a multiplier and you start to lose the most significant parts
of the score. This trade-off is summarised here for various settings of "multiplier":

multiplier   max score   fraction precision
==========   =========   ==================
10           838860      0.x
100          83886       0.xx
1000         8388        0.xxx
10000        838         0.xxxx

The default setting of 1000 seems like a safe setting for the typical scores generated by
Lucene.
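
To make the trade-off concrete, here is a minimal Java sketch of one way a scaled score and 8 match flags could share the 32 bits of a float. It is an illustration only, not the code attached to LUCENE-1999: the class ScoreFlagPacking, its method names, the bit layout (scaled score in the upper bits, flags in the lowest 8) and the clamping limit are all assumptions made for this example.

```java
final class ScoreFlagPacking {
    static final int FLAG_BITS = 8;                      // one byte of match flags
    static final int FLAG_MASK = (1 << FLAG_BITS) - 1;   // 0xFF
    // Assumption of this sketch: cap the scaled score so the packed bit
    // pattern never reaches the infinity/NaN range of a float.
    static final int MAX_SCALED = 0x7F7FFF;

    /** Packs a truncated, scaled score and up to 8 flags into one float. */
    static float pack(float score, int multiplier, int flags) {
        int scaled = (int) (score * multiplier);         // digits beyond the multiplier are dropped
        scaled = Math.max(0, Math.min(scaled, MAX_SCALED));
        int bits = (scaled << FLAG_BITS) | (flags & FLAG_MASK);
        return Float.intBitsToFloat(bits);
    }

    /** Recovers the 8 match flags from a packed value. */
    static int flags(float packed) {
        return Float.floatToRawIntBits(packed) & FLAG_MASK;
    }

    /** Recovers the score, reduced to the precision the multiplier kept. */
    static float score(float packed, int multiplier) {
        return (Float.floatToRawIntBits(packed) >>> FLAG_BITS) / (float) multiplier;
    }

    public static void main(String[] args) {
        float packed = pack(1.2345f, 1000, 0b00000101);
        System.out.println("score=" + score(packed, 1000)
                + " flags=" + Integer.toBinaryString(flags(packed)));
        // With multiplier 100 these two scores collapse to the same packed value.
        System.out.println(pack(1.2312f, 100, 0) == pack(1.2398f, 100, 0));
    }
}
```

Non-negative floats sort in the same order as their bit patterns, so in this layout the packed value still ranks by the truncated score first. Widening the flag field beyond 8 bits would take bits away from the scaled score.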

>>- I intended to use prefix and fuzzy queries. I believe this is incompatible with that approach, or is it?

You can wrap any query with this class. The only limitation is that it hides all match info
in a single byte encoded into the score, which allows for only 8 bits, i.e. 8 match flags,
so it can report on at most 8 clauses. You could try using more than 8 bits encoded into the
score, but then you lose more score precision again (see above).

Some thoughts on a less bit-twiddly, more robust approach:
Having played with the new Attribute stuff in the 2.9/3.0 Analyzers recently, I am intrigued by
the idea of using a similar approach to capture low-level match metadata, i.e. clients decide what
types of MatchAttributes are of interest, and Query objects record match metadata in singleton
MatchAttribute objects as they stream their way through result sets.
Result set streaming and tokenisation streams are similar problems, and the Attribute design
seems like it can apply here.
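
A rough Java sketch of how that could look, assuming nothing about any real Lucene API: MatchAttribute, MatchedClausesAttribute, MatchAttributeSource and the visit() helper are all hypothetical names invented here, and visit() merely stands in for a Query visiting one document.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class MatchAttributeSketch {

    /** Hypothetical base type for match metadata; reset before each document. */
    interface MatchAttribute {
        void clear();
    }

    /** Hypothetical attribute recording which clauses matched the current document. */
    static final class MatchedClausesAttribute implements MatchAttribute {
        private final BitSet clauses = new BitSet();
        void record(int clause) { clauses.set(clause); }
        boolean matched(int clause) { return clauses.get(clause); }
        @Override public void clear() { clauses.clear(); }
    }

    /** Keeps a single shared instance per attribute type. */
    static final class MatchAttributeSource {
        private final Map<Class<?>, MatchAttribute> attributes = new HashMap<>();

        /** Returns the one instance for this type, creating it on first request. */
        <T extends MatchAttribute> T addAttribute(Class<T> type, Supplier<T> factory) {
            return type.cast(attributes.computeIfAbsent(type, k -> factory.get()));
        }

        void clearAll() {
            attributes.values().forEach(MatchAttribute::clear);
        }
    }

    /** Stand-in for a query visiting one document: records each clause term found in it. */
    static void visit(String docText, String[] clauseTerms, MatchAttributeSource source) {
        MatchedClausesAttribute att =
                source.addAttribute(MatchedClausesAttribute.class, MatchedClausesAttribute::new);
        source.clearAll();                       // metadata describes the current document only
        List<String> tokens = Arrays.asList(docText.split("\\s+"));
        for (int i = 0; i < clauseTerms.length; i++) {
            if (tokens.contains(clauseTerms[i])) {
                att.record(i);
            }
        }
    }

    public static void main(String[] args) {
        MatchAttributeSource source = new MatchAttributeSource();
        // The client declares interest once and keeps the shared instance.
        MatchedClausesAttribute att =
                source.addAttribute(MatchedClausesAttribute.class, MatchedClausesAttribute::new);
        String[] clauses = {"a", "b", "c"};
        for (String doc : new String[] {"a c", "b"}) {
            visit(doc, clauses, source);
            System.out.println(doc + " -> a=" + att.matched(0)
                    + " b=" + att.matched(1) + " c=" + att.matched(2));
        }
    }
}
```

The client reads the same attribute instance after each document, much as a TokenStream consumer reads its attributes after each token, and nothing limits the metadata to 8 flags.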

Cheers
Mark

On 11 May 2010, at 12:02, mark harwood wrote:

> See https://issues.apache.org/jira/browse/LUCENE-1999
> 
> 
> 
> ----- Original Message ----
> From: Paul Libbrecht <paul@activemath.org>
> To: java-user@lucene.apache.org
> Sent: Tue, 11 May, 2010 10:52:14
> Subject: Re: best way to interest two queries?
> 
> Dear lucene experts,
> 
> Let me try to make this precise since there was no answer.
> 
> I have a query that is, roughly,
>  a & b & c
> and I have a good search result.
> Now I want to know:
> 
> a) for the first page, which matches are matches for a, b, or c
> b) for the remaining results (for the "tail"), are there matches of a, b, or c
> 
> Thus far, I only know how to use the highlighter to get at the fields; it's not exactly
> the same and it's slow.
> I know I could use termDocs or another search result for a, b, and c, probably to annotate
> my initial results list; that could work well for a).
> 
> I still don't know what to do for b).
> 
> thanks for hints.
> 
> paul
> 
> On 31 March 2010, at 23:00, Paul Libbrecht wrote:
>> I've been wandering around but I see no solution yet: I would like to intersect two
>> query results: going through the list of one query and indicating which ones actually match
>> the other query or, even better, indicating that "past this point, nothing matches that query
>> anymore".
>> 
>> What should be the strategy?
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org