lucene-dev mailing list archives

From Li Li <fancye...@gmail.com>
Subject Re: how to write my own query?
Date Sat, 05 Jun 2010 02:14:16 GMT
It may work, but won't it be very slow? Also, I want a score function
like 1/(1+pos), and SpanFirstQuery seems to give every match the same score.
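
A minimal sketch of the 1/(1+pos) idea, written in plain Java rather than against the Lucene API. The class and method names are hypothetical, whitespace tokenization and 0-based positions are assumptions, and only the first occurrence of each term is scored. It illustrates the ranking being asked for (any matching term keeps the document in the results, earlier matches score higher); it is not how Lucene scores documents.

```java
import java.util.Arrays;
import java.util.List;

public class PositionDecayScore {

    // Score of one term in one document: 1/(1+pos) for the first
    // occurrence, or 0 if the term is absent. pos is a 0-based token offset.
    public static double termScore(List<String> docTokens, String term) {
        int pos = docTokens.indexOf(term);
        return pos < 0 ? 0.0 : 1.0 / (1 + pos);
    }

    // OR semantics: every query term found in the document adds its own
    // position-decayed score; a document matching no term scores 0.
    public static double score(String doc, String... queryTerms) {
        // Assumption: lowercase + whitespace split stands in for an analyzer.
        List<String> tokens = Arrays.asList(doc.toLowerCase().split("\\s+"));
        double total = 0.0;
        for (String term : queryTerms) {
            total += termScore(tokens, term.toLowerCase());
        }
        return total;
    }

    public static void main(String[] args) {
        String doc1 = "some other text apache lucene is a open source project";
        String doc2 = "apache lucene is a open source project some other text";
        System.out.println("doc1: " + score(doc1, "apache"));
        System.out.println("doc2: " + score(doc2, "apache"));
    }
}
```

With the two example documents from this thread, a search for apache matches both, and doc2 (term at position 0) scores higher than doc1 (term at position 3).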

2010/6/5 Erik Hatcher <erik.hatcher@gmail.com>:
> That's why I recommended building a boolean OR'd query out of this.  The
> normal query OR the phrase query OR the span first query.
>
>        Erik
>
>
> On Jun 4, 2010, at 11:52 AM, Li Li wrote:
>
>> Thank you, but I don't think SpanFirstQuery is what I need. I want to
>> retrieve all documents that contain any of the terms, and boost the
>> ones where the match occurs near the top. The same goes for the terms'
>> relative positions.
>> e.g.
>> doc1         apache lucene is a open source project
>> doc2         apache is a http server and many many other words  ...
>>       lucene ...
>> If the user searches for apache lucene, I want both docs to be
>> returned, but doc1 should get a higher score. I don't want to use a
>> phrase query because it is slow (compared to a boolean query), and
>> setting the slop to 10000 seems strange.
>> e.g.
>>  doc1        some other text   ...                     apache lucene
>> is a open source project
>>  doc2         apache lucene is a open source project some other text
>>
>> SpanFirstQuery is not what I need. If the user searches for apache, I
>> want to show both docs but give a higher score to doc2, because the
>> matched term's position is smaller than in doc1. If I use
>> SpanFirstQuery, e.g. SpanFirstQuery sfq = new SpanFirstQuery(apache, 100);
>> I will fail to find docs that contain apache only at positions larger
>> than 100.
>>
>>
>> 2010/6/4 Erik Hatcher <erik.hatcher@gmail.com>:
>>>
>>> This is perhaps best discussed on the java-user list instead.  Here are
>>> some thoughts...
>>>
>>> On Jun 4, 2010, at 2:36 AM, Li Li wrote:
>>>
>>>> Hi all,
>>>>  I want to implement a query that takes term position and the terms'
>>>> relative positions into consideration. It only needs to support
>>>> multi-term queries, like a boolean OR query, but the score should
>>>> depend on term position and on the terms' relative positions.
>>>>  e.g. there are two docs
>>>>  doc1         apache lucene is a open source project
>>>>  doc2         apache is a http server and lucene ...
>>>>  If the user searches for "apache lucene", doc1 wins because apache
>>>> and lucene appear closer together than in doc2.
>>>
>>> A PhraseQuery will do that.  It's commonplace to OR in a (sloppy) phrase
>>> query for the user's query so that proximity boosts the score.  No
>>> custom query is needed to accomplish this.
>>>
>>>>  e.g.
>>>>  doc1        some other text apache lucene is a open source project
>>>>  doc2         apache lucene is a open source project some other text
>>>>  doc2 wins because "apache lucene" appears at the first position
>>>
>>> And here, SpanFirstQuery is your friend.  So OR'ing a PhraseQuery and a
>>> SpanFirstQuery (with nested SpanNearQuery, or whatever is appropriate)
>>> seems to accomplish your goals.
>>>
>>> Give those a try and report back if things still aren't quite what you're
>>> after.
>>>
>>>       Erik
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>
>>


