lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Query ReRanking question
Date Sat, 06 Sep 2014 19:07:08 GMT
Ravi:

bq: It is as if the sort is applied after the docs are collected

Exactly. The primary query gets the top 1,000 documents ranked by
relevance, then sends those through the reranking query, i.e. sorts
them by date. I question whether you really want 1,000 docs re-ranked
by date; a smaller number of docs might give better results, but
that's for you to decide.

If I understand it correctly, reranking conceptually goes like this in
your example:

1> Execute the first query with rows=1,000, as:

q=malaysian airline crash&rows=1000&fl=id

2> Now form a big OR clause in a filter query of all the docs returned
in <1>, like fq=id:(id1 OR id2 OR id45 OR.....), and append it to the
reranking query, as:

q=*:*&sort=publish_date desc&fl=headline,publish_date,score&fq=id:(id1 OR id2 OR id45 OR.....)

I'm sure Joel will correct me if I'm wrong here. And of course the
code is much more efficient than this, but that's the idea I think.
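
The two conceptual steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Solr implements reranking: the in-memory DOCS list, the scores and dates in it, and the helper names are all made up for the example.

```python
from datetime import date

# Hypothetical documents with made-up relevance scores and publish dates.
DOCS = [
    {"id": "id1", "score": 9.1, "publish_date": date(2014, 3, 8)},
    {"id": "id2", "score": 8.7, "publish_date": date(2014, 7, 17)},
    {"id": "id45", "score": 7.9, "publish_date": date(2014, 3, 24)},
    {"id": "id7", "score": 1.2, "publish_date": date(2014, 9, 1)},
]


def top_by_relevance(docs, rows):
    """Step 1: keep only the `rows` best-scoring documents."""
    return sorted(docs, key=lambda d: d["score"], reverse=True)[:rows]


def build_id_filter(docs):
    """Step 2a: build the big OR clause used as the filter query."""
    return "id:(" + " OR ".join(d["id"] for d in docs) + ")"


def sort_by_date_desc(docs):
    """Step 2b: order the filtered documents newest first."""
    return sorted(docs, key=lambda d: d["publish_date"], reverse=True)


window = top_by_relevance(DOCS, rows=3)
fq = build_id_filter(window)
reranked = sort_by_date_desc(window)

print(fq)
print([d["id"] for d in reranked])
```

Note that id7 is the newest document but never shows up, because it did not make the relevance cut in step 1. The size of that first window decides which documents are eligible for the date sort at all.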

Best,
Erick

On Sat, Sep 6, 2014 at 11:33 AM, Ravi Solr <ravisolr@gmail.com> wrote:
> Erick,
>         Your idea about reversing Joel's suggestion seems to give the best
> results of all the options I tried...but I can't seem to understand why. I
> thought the query shown below should give irrelevant results as sorting by
> date would throw relevancy off...but somehow it's getting relevant results
> with fair enough reverse chronology. It is as if the sort is applied after
> the docs are collected and reranked (which is what I wanted). One more
> thing that baffled me was, if I change reRankDocs from 1000 to 100 the
> results become irrelevant, which doesn't make sense.
>
> So can you kindly explain what's going on in the following query?
>
> http://localhost:8080/solr/select?q=malaysian airline crash&rq={!rerank
> reRankQuery=$rqq reRankDocs=1000}&rqq=*:*&sort=publish_date
> desc&fl=headline,publish_date,score
>
> I love the solr community, so much to learn from so many knowledgeable
> people.
>
> Thanks
>
> Ravi Kiran Bhaskar
>
>
>
> On Fri, Sep 5, 2014 at 1:23 PM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
>> OK, why can't you switch the clauses from Joel's suggestion?
>>
>> Something like:
>> q=Malaysia plane crash&rq={!rerank reRankDocs=1000
>> reRankQuery=$myquery}&myquery=*:*&sort=date+desc
>>
>> (haven't tried this yet, but you get the idea....).
>>
>> Best,
>> Erick
>>
>> On Fri, Sep 5, 2014 at 9:33 AM, Markus Jelsma
>> <markus.jelsma@openindex.io> wrote:
>> > Hi - You can already achieve this by boosting on the document's recency.
>> > The result set won't be exactly ordered by date but you will get the most
>> > relevant and recent documents on top.
>> >
>> > Markus
>> >
>> > -----Original message-----
>> >> From: Ravi Solr <ravisolr@gmail.com>
>> >> Sent: Friday 5th September 2014 18:06
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Re: Query ReRanking question
>> >>
>> >> Thank you very much for responding. I want to do exactly the opposite of
>> >> what you said. I want to sort the relevant docs in reverse chronology. If
>> >> you sort by date beforehand then the relevancy is lost. So I want to get
>> >> the Top N relevant results and then rerank those Top N to achieve relevant
>> >> reverse chronological results.
>> >>
>> >> If you ask why I would want to do that:
>> >>
>> >> Let's take an example about the Malaysian airline crash. Several articles might
>> >> have been published over a period of time. When I search for - malaysia
>> >> airline crash blackbox - I would want to see "relevant" results but would
>> >> also like to see the recent developments on top, i.e. effectively a
>> >> reverse chronological order within the relevant results, like telling a
>> >> story over a period of time.
>> >>
>> >> Hope I am clear. Thanks for your help.
>> >>
>> >> Thanks
>> >>
>> >> Ravi Kiran Bhaskar
>> >>
>> >>
>> >> On Thu, Sep 4, 2014 at 5:08 PM, Joel Bernstein <joelsolr@gmail.com> wrote:
>> >>
>> >> > If you want the main query to be sorted by date then the top N docs
>> >> > reranked by a query, that should work. Try something like this:
>> >> >
>> >> > q=foo&sort=date+desc&rq={!rerank reRankDocs=1000
>> >> > reRankQuery=$myquery}&myquery=blah
>> >> >
>> >> >
>> >> > Joel Bernstein
>> >> > Search Engineer at Heliosearch
>> >> >
>> >> >
>> >> > On Thu, Sep 4, 2014 at 4:25 PM, Ravi Solr <ravisolr@gmail.com> wrote:
>> >> >
>> >> > > Can the ReRanking API be used to sort within docs retrieved by a
>> >> > > date field? Can somebody help me understand how to write such a query?
>> >> > >
>> >> > > Thanks
>> >> > >
>> >> > > Ravi Kiran Bhaskar
>> >> > >
>> >> >
>> >>
>> >
>>
