lucene-java-user mailing list archives

From Rajesh Munavalli <findm...@gmail.com>
Subject query formulation
Date Fri, 10 Feb 2006 16:31:29 GMT
Does anyone know a good way to formulate the following query, both for
performance and for the ordering of the retrieved documents?

Query: "field1:t1 t2 t3 t4 AND field2:t5 t6 t7"

I want to achieve the following:
* The document that matches the query exactly in both fields gets rank 1.
* Documents containing the query terms in a different order get the
subsequent ranks, depending on their edit distance.

This can be achieved with two sloppy phrase queries ANDed together.
Modified Query: BooleanQuery(PhraseQuery("field1", "t1 t2 t3 t4", SLOP:m)
AND PhraseQuery("field2", "t5 t6 t7", SLOP:n))
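
As a rough illustration, the modified query could be written in the classic
query parser syntax, where +clause marks a required clause and "..."~k sets
the phrase slop. The sketch below only builds that query string and does not
call Lucene itself. The class and method names are hypothetical, and the slop
values 2 and 1 are placeholders for m and n.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a query string only, no Lucene dependency.
final class PhraseQuerySketch {

    // One required sloppy-phrase clause, e.g. +field1:"t1 t2"~2
    static String requiredPhrase(String field, List<String> terms, int slop) {
        return "+" + field + ":\"" + String.join(" ", terms) + "\"~" + slop;
    }

    // Both phrase clauses are required, which corresponds to ANDing them.
    static String bothFields(List<String> field1Terms, int m,
                             List<String> field2Terms, int n) {
        return requiredPhrase("field1", field1Terms, m)
                + " " + requiredPhrase("field2", field2Terms, n);
    }

    public static void main(String[] args) {
        // Placeholder slop values: m = 2, n = 1.
        System.out.println(bothFields(
                Arrays.asList("t1", "t2", "t3", "t4"), 2,
                Arrays.asList("t5", "t6", "t7"), 1));
    }
}
```

With these placeholder values the string is
+field1:"t1 t2 t3 t4"~2 +field2:"t5 t6 t7"~1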

However, I also want to retrieve (in order) those documents where one or
more of the terms is missing from either of the fields, i.e.,

Rank 1: All terms exist in both fields, within a certain slop factor

Rank 2: One term missing from one of the fields
            field1:t1 t2 t3 t4 AND field2:t5 t6
            field1:t1 t2 t3 t4 AND field2:t6 t7
            field1:t1 t2 t3 t4 AND field2:t5 t7
            ...
            ...

Rank 3: Two terms missing from either field

...


Rank n: Only one term exists in each of field1 and field2
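
To make the tiers above concrete, here is a small sketch that enumerates the
query variants for a given number of missing terms, assuming each field must
keep at least one term (as in Rank n). It is plain Java with no Lucene
calls. The class and method names are hypothetical, and it only lists the
variants; it says nothing about how they should be scored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: lists the query variants that belong to one rank tier.
final class RankTierSketch {

    // Returns every variant with exactly `missing` terms dropped in total,
    // keeping at least one term in each field.
    static List<String> variants(List<String> field1Terms,
                                 List<String> field2Terms,
                                 int missing) {
        List<String> all = new ArrayList<>(field1Terms);
        all.addAll(field2Terms);
        int total = all.size();
        List<String> out = new ArrayList<>();
        // Each bit of mask says whether the term at that position is kept.
        for (int mask = 0; mask < (1 << total); mask++) {
            if (total - Integer.bitCount(mask) != missing) continue;
            List<String> keep1 = new ArrayList<>();
            List<String> keep2 = new ArrayList<>();
            for (int i = 0; i < total; i++) {
                if ((mask & (1 << i)) == 0) continue;
                if (i < field1Terms.size()) keep1.add(all.get(i));
                else keep2.add(all.get(i));
            }
            if (keep1.isEmpty() || keep2.isEmpty()) continue;
            out.add("field1:" + String.join(" ", keep1)
                    + " AND field2:" + String.join(" ", keep2));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> f1 = Arrays.asList("t1", "t2", "t3", "t4");
        List<String> f2 = Arrays.asList("t5", "t6", "t7");
        // Tier with one missing term (Rank 2 in the list above).
        for (String v : variants(f1, f2, 1)) System.out.println(v);
    }
}
```

Calling variants with missing = 0 gives the single Rank 1 query, and
missing = 1 gives the Rank 2 variants.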

Thanks,

Rajesh Munavalli
