lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Test new query parser?
Date Thu, 24 Aug 2006 12:59:33 GMT
It is interesting to note that Lucene itself also seems to suffer from 
bugs when using spans if the index holds only a single document. 
At least with SpanNotQuery, the spans can wrap around the document 
from end to beginning. That is unexpected, but it goes away once you 
add a second doc. I suspect this is a limitation (if you could call it 
that) of the spans implementation rather than a bug, though I'm 
speculating on all of this.

Hopefully my new idea will sort this problem out. This is a non-work 
project so I don't have much time for it, but it is much more 
interesting than my work so I will probably keep plugging away on off hours.

The new idea is to collect the terms in a composite object that will 
maintain order of operations as well as the tokens.

I have much to do on this composite class, but this query seems to 
generate an acceptable composite:

(me | cop & her) ~2 (old & women) :

cdistrib(cdistrib(distrib(allFields:me)' 
'cdistrib(distrib(allFields:cop)'+'distrib(allFields:her)))) ~3 
cdistrib(distrib(allFields:old)'+'distrib(allFields:women))

Then I will need a function that will distribute one composite across 
another, creating the proximity query (I would guess a boolean at its 
root). I am confident I can make the composite class work, but I have 
not investigated how hard it will be to do the distribution. All of the 
information is there though, so I am assuming it won't be too difficult 
(don't I wish).
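
A minimal sketch of what that distribution step might look like, assuming 
the composite is a small tree of terms and AND/OR groups. The class names 
(Node, Term, Group), the distribute() method, and the near(a, b, slop) 
text rendering are all hypothetical; the real composite class and the 
Lucene queries it would emit are not shown here. The assumed semantics are 
that a group on either side of the proximity operator is expanded 
child by child, keeping the group's own boolean operator, until only 
term-to-term proximity pairs remain:

```java
import java.util.ArrayList;
import java.util.List;

public class ProximityDistribution {

    /** Hypothetical composite node: a single term or a boolean group. */
    static abstract class Node {}

    static final class Term extends Node {
        final String text;
        Term(String text) { this.text = text; }
    }

    static final class Group extends Node {
        final String op;                       // "AND" or "OR"
        final List<Node> children = new ArrayList<>();
        Group(String op, Node... kids) {
            this.op = op;
            for (Node k : kids) children.add(k);
        }
    }

    /**
     * Distributes the left composite across the right one and renders the
     * resulting boolean tree as text. Groups on the left are expanded first,
     * then groups on the right; two plain terms become one proximity pair.
     */
    static String distribute(Node left, Node right, int slop) {
        if (left instanceof Group) {
            Group g = (Group) left;
            List<String> parts = new ArrayList<>();
            for (Node child : g.children) {
                parts.add(distribute(child, right, slop));
            }
            return "(" + String.join(" " + g.op + " ", parts) + ")";
        }
        if (right instanceof Group) {
            Group g = (Group) right;
            List<String> parts = new ArrayList<>();
            for (Node child : g.children) {
                parts.add(distribute(left, child, slop));
            }
            return "(" + String.join(" " + g.op + " ", parts) + ")";
        }
        return "near(" + ((Term) left).text + ", " + ((Term) right).text
                + ", " + slop + ")";
    }

    public static void main(String[] args) {
        // (me | cop & her) ~2 (old & women), with & binding tighter than |
        Node left = new Group("OR", new Term("me"),
                new Group("AND", new Term("cop"), new Term("her")));
        Node right = new Group("AND", new Term("old"), new Term("women"));
        System.out.println(distribute(left, right, 2));
    }
}
```

Under these assumptions the root of the result is a boolean (here an OR), 
with each leaf a two-term proximity clause that could in principle map to 
a span query.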

- Mark
> Mark --
>
> Don't lose hope! We are migrating from Verity to Lucene, and I know for
> a fact that we will have to support the kind of complex queries you talk
> about; maybe not /quite/ as complex as your magnificent:
>
>> cop | fowl & (fowl | priest & man) ! helicopter ~8 (hillary | tom)
>
> but certainly more complex than I have been able to solve.
>
> We can also take heart in that Verity hasn't quite cracked it either. In
> trying to see exactly what I needed to support I was doing some
> experiments against known data and discovered that Verity has some
> parsing bugs that do not reveal themselves easily. For example:
>
>   title:  "get me out of here, please"
>   queryA: title:((here OR there) NEAR/2 please)
>   queryB: title:((there OR here) NEAR/2 please)
>
> In theory both queries should find the example title, but only queryB
> works with Verity, so something is wrong.
>
> -- Robert
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

