lucene-dev mailing list archives

From Erik Hatcher <erik.hatc...@gmail.com>
Subject Re: how to write my own query?
Date Fri, 04 Jun 2010 18:59:38 GMT
That's why I recommended building a boolean OR'd query out of these:
the normal query OR the phrase query OR the span-first query.
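
To illustrate why OR'ing the clauses addresses the concern in the quoted message, here is a toy model in plain Java. It is only a sketch and does not use Lucene: the class name, method names, and clause weights are all hypothetical, and Lucene computes real scores with its own similarity formula. The point it tries to show is that the normal clause keeps every document containing a term in the result set, while the phrase-like and span-first-like clauses only add to the score when they happen to match.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: this is NOT Lucene's API or its scoring formula.
// The class name, method names, and weights below are hypothetical.
class OrQuerySketch {
    static final double TERM_WEIGHT = 1.0;   // per query term found anywhere in the doc
    static final double PHRASE_BONUS = 2.0;  // query terms adjacent and in order
    static final double FIRST_BONUS = 2.0;   // a query term occurs at a position < end

    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }

    // "Normal" query: every query term present in the doc contributes.
    static double termClause(List<String> doc, List<String> query) {
        double score = 0;
        for (String term : query) {
            if (doc.contains(term)) score += TERM_WEIGHT;
        }
        return score;
    }

    // Phrase-like clause: contributes only when the terms are adjacent, in order.
    static double phraseClause(List<String> doc, List<String> query) {
        return Collections.indexOfSubList(doc, query) >= 0 ? PHRASE_BONUS : 0;
    }

    // Span-first-like clause: contributes only when some term sits before `end`.
    static double firstClause(List<String> doc, List<String> query, int end) {
        for (String term : query) {
            int pos = doc.indexOf(term);
            if (pos >= 0 && pos < end) return FIRST_BONUS;
        }
        return 0;
    }

    // OR of the three clauses: a doc matches whenever the total is above zero.
    static double score(String docText, String queryText, int end) {
        List<String> doc = tokens(docText);
        List<String> query = tokens(queryText);
        return termClause(doc, query) + phraseClause(doc, query)
                + firstClause(doc, query, end);
    }

    public static void main(String[] args) {
        String early = "apache lucene is a open source project some other text";
        String late = "some other text apache lucene is a open source project";
        System.out.println("early: " + score(early, "apache", 3));
        System.out.println("late:  " + score(late, "apache", 3));
    }
}
```

In this sketch both documents get a score above zero, so both would be returned, and the one where the term appears near the start scores higher. With real Lucene queries the relative influence of each clause would depend on the boosts and similarity in use.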

	Erik


On Jun 4, 2010, at 11:52 AM, Li Li wrote:

> Thank you. But I don't think SpanFirstQuery is what I need, because I
> want to get all documents that contain any of the terms, but give a
> boost to those where a term appears near the top. The same goes for
> the terms' relative positions.
> e.g.
> doc1         apache lucene is a open source project
> doc2         apache is a http server and many many other words  ...
>        lucene ...
> If a user searches for apache lucene, I want both docs to be presented
> to the user, but doc1 should get a higher score. I don't want to use a
> phrase query because it's slow (compared to a boolean query), and
> setting the slop to 10000 seems strange.
> e.g.
>  doc1         some other text ... apache lucene is a open source project
>  doc2         apache lucene is a open source project some other text
>
> SpanFirstQuery is not what I need. If a user searches for apache, I
> want to show both docs but give a higher score to doc2, because the
> matched term's position is earlier than in doc1. If I use
> SpanFirstQuery sfq = new SpanFirstQuery(apache, 100); I will fail to
> find docs that contain apache only at positions larger than 100.
>
>
> 2010/6/4 Erik Hatcher <erik.hatcher@gmail.com>:
>> This is perhaps best discussed on the java-user list instead.
>> Here's some thoughts...
>>
>> On Jun 4, 2010, at 2:36 AM, Li Li wrote:
>>
>>> hi all,
>>>  I want to implement a query that takes term positions and terms'
>>> relative positions into consideration. It only supports multi-term
>>> queries, like a boolean OR query, but I want it to consider term
>>> position and the terms' relative positions.
>>>  e.g. there are two docs
>>>  doc1         apache lucene is a open source project
>>>  doc2         apache is a http server and lucene ...
>>>  if a user searches for "apache lucene", doc1 will win because apache
>>> and lucene appear closer together than in doc2
>>
>> A PhraseQuery will do that.  It's commonplace to OR in a (sloppy) phrase
>> query for the user's query in order to get proximity to boost things.  No
>> custom query is needed to accomplish this.
>>
>>>  e.g.
>>>  doc1        some other text apache lucene is a open source project
>>>  doc2         apache lucene is a open source project some other text
>>>  doc2 wins because "apache lucene" appears at the first position
>>
>> And here, SpanFirstQuery is your friend.  So OR'ing a PhraseQuery and a
>> SpanFirstQuery (with nested SpanNearQuery, or whatever is appropriate)
>> seems to accomplish your goals.
>>
>> Give those a try and report back if things still aren't quite what
>> you're after.
>>
>>        Erik
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>
>

