lucene-solr-user mailing list archives

From Bill Au <bill.w...@gmail.com>
Subject Re: Payloads with Phrase queries
Date Tue, 15 Dec 2009 16:51:54 GMT
Lucene 2.9.1 comes with a PayloadTermQuery:
http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/search/payloads/PayloadTermQuery.html

I have been using it to factor the payload into the score without any
problems.

Bill


On Tue, Dec 15, 2009 at 6:31 AM, Raghuveer Kancherla <
raghuveer.kancherla@aplopio.com> wrote:

> The interesting thing I am noticing is that the scoring works fine for a
> phrase query like "solr rocks".
> This led me to look at which query I am using for a single term.
> It turns out that I am using PayloadTermQuery, taking a cue from the
> SOLR-1485 patch.
>
> I changed this to BoostingTermQuery (I read somewhere that it is
> deprecated, but I was just experimenting) and the scoring now seems to
> work as expected for a single term.
>
> Now, the important question is: what is the payload version of a TermQuery?
>
> Regards
> Raghu
>
>
> On Tue, Dec 15, 2009 at 12:45 PM, Raghuveer Kancherla <
> raghuveer.kancherla@aplopio.com> wrote:
>
> > Hi,
> > Thanks everyone for the responses. I am now able to get both phrase
> > queries and term queries to use payloads.
> >
> > However, the score for each document (and consequently the ordering of
> > documents) is coming out wrong.
> >
> > In the Solr output appended below, document 4 has a higher score than
> > document 2 (see the debug part). The results section shows a wrong score
> > (the payload value I am returning from my custom similarity class), and
> > the ordering is wrong because of this. Can someone explain this?
> >
> > My custom query parser is pasted here http://pastebin.com/m9f21565
> >
> > In the similarity class, I return 10.0 if the payload is 1 and 20.0 if
> > the payload is 2. For everything else I return 1.0.
> >
> > {
> >  'responseHeader':{
> >   'status':0,
> >   'QTime':2,
> >   'params':{
> >       'fl':'*,score',
> >       'debugQuery':'on',
> >       'indent':'on',
> >       'start':'0',
> >       'q':'solr',
> >       'qt':'aplopio',
> >       'wt':'python',
> >       'fq':'',
> >       'rows':'10'}},
> >  'response':{'numFound':5,'start':0,'maxScore':20.0,'docs':[
> >       {
> >        'payloadTest':'solr|2 rocks|1',
> >        'id':'2',
> >        'score':20.0},
> >       {
> >        'payloadTest':'solr|2',
> >        'id':'4',
> >        'score':20.0},
> >       {
> >        'payloadTest':'solr|1 rocks|2',
> >        'id':'1',
> >        'score':10.0},
> >       {
> >        'payloadTest':'solr|1 rocks|1',
> >        'id':'3',
> >        'score':10.0},
> >       {
> >        'payloadTest':'solr',
> >        'id':'5',
> >        'score':1.0}]
> >  },
> >  'debug':{
> >   'rawquerystring':'solr',
> >   'querystring':'solr',
> >   'parsedquery':'PayloadTermQuery(payloadTest:solr)',
> >   'parsedquery_toString':'payloadTest:solr',
> >   'explain':{
> >       '2':'\n7.227325 = (MATCH) fieldWeight(payloadTest:solr in 1), product of:\n  14.142136 = (MATCH) btq, product of:\n    0.70710677 = tf(phraseFreq=0.5)\n    20.0 = scorePayload(...)\n  0.81767845 = idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=1)\n',
> >       '4':'\n11.56372 = (MATCH) fieldWeight(payloadTest:solr in 3), product of:\n  14.142136 = (MATCH) btq, product of:\n    0.70710677 = tf(phraseFreq=0.5)\n    20.0 = scorePayload(...)\n  0.81767845 = idf(payloadTest:  solr=5)\n  1.0 = fieldNorm(field=payloadTest, doc=3)\n',
> >       '1':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 0), product of:\n  7.071068 = (MATCH) btq, product of:\n    0.70710677 = tf(phraseFreq=0.5)\n    10.0 = scorePayload(...)\n  0.81767845 = idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=0)\n',
> >       '3':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 2), product of:\n  7.071068 = (MATCH) btq, product of:\n    0.70710677 = tf(phraseFreq=0.5)\n    10.0 = scorePayload(...)\n  0.81767845 = idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=2)\n',
> >       '5':'\n0.578186 = (MATCH) fieldWeight(payloadTest:solr in 4), product of:\n  0.70710677 = (MATCH) btq, product of:\n    0.70710677 = tf(phraseFreq=0.5)\n    1.0 = scorePayload(...)\n  0.81767845 = idf(payloadTest:  solr=5)\n  1.0 = fieldNorm(field=payloadTest, doc=4)\n'},
> >   'QParser':'BoostingTermQParser',
> >   'filter_queries':[''],
> >   'parsed_filter_queries':[],
> >   'timing':{
> >       'time':2.0,
> >       'prepare':{
> >        'time':1.0,
> >        'org.apache.solr.handler.component.QueryComponent':{
> >         'time':1.0},
> >        'org.apache.solr.handler.component.FacetComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.MoreLikeThisComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.HighlightComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.StatsComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.DebugComponent':{
> >         'time':0.0}},
> >       'process':{
> >        'time':1.0,
> >        'org.apache.solr.handler.component.QueryComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.FacetComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.MoreLikeThisComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.HighlightComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.StatsComponent':{
> >         'time':0.0},
> >        'org.apache.solr.handler.component.DebugComponent':{
> >         'time':1.0}}}}}
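
For readers following the numbers: each explain total in the debug section above is the product tf × scorePayload × idf × fieldNorm, whereas the 'score' values in the results section equal the raw payload score alone (20.0, 10.0, 1.0). The sketch below only redoes that arithmetic in plain Java, without Lucene. The class and method names are hypothetical, the payload is assumed to be already decoded to an int, and document 5 (no payload) is modelled as payload 0.

```java
public class ExplainArithmetic {
    // Factors copied from the explain output above.
    static final double TF = 0.70710677;   // tf(phraseFreq=0.5)
    static final double IDF = 0.81767845;  // idf(payloadTest:  solr=5)

    // Mapping described in the message: payload 1 -> 10.0, payload 2 -> 20.0,
    // anything else -> 1.0. (Hypothetical helper, not the poster's class.)
    static double payloadScore(int payload) {
        switch (payload) {
            case 1: return 10.0;
            case 2: return 20.0;
            default: return 1.0;
        }
    }

    // Product shown in each explain entry: (tf * scorePayload) * idf * fieldNorm.
    static double explainScore(int payload, double fieldNorm) {
        double btq = TF * payloadScore(payload);
        return btq * IDF * fieldNorm;
    }

    public static void main(String[] args) {
        System.out.printf("doc 2: %.6f%n", explainScore(2, 0.625));
        System.out.printf("doc 4: %.6f%n", explainScore(2, 1.0));
        System.out.printf("doc 1: %.6f%n", explainScore(1, 0.625));
        System.out.printf("doc 5: %.6f%n", explainScore(0, 1.0));
    }
}
```

Up to float rounding, these products agree with the explain totals (7.227325, 11.56372, 3.6136625, 0.578186). Documents 2 and 4 share the same payload score and differ only in fieldNorm, which is why the debug section ranks document 4 above document 2.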
> >
> > On Thu, Dec 10, 2009 at 5:48 PM, AHMET ARSLAN <iorixxx@yahoo.com> wrote:
> >
> >>
> >> > I was looking through the Lucene source code and found the
> >> > following class:
> >> > org.apache.lucene.search.payloads.PayloadSpanUtil
> >> >
> >> > There is a function named queryToSpanQuery in this class.
> >> > Is this the preferred way to convert a PhraseQuery to a
> >> > PayloadNearQuery?
> >>
> >> The queryToSpanQuery method does not return a PayloadNearQuery.
> >>
> >> You need to override getFieldQuery(String field, String queryText, int
> >> slop) of SolrQueryParser or QueryParser.
> >>
> >> This code is adapted from Lucene in Action (2nd edition), section
> >> 6.3.4, "Allowing ordered phrase queries":
> >>
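> >> // Imports needed by this override (Lucene 2.9.x package names):
> >> import org.apache.lucene.index.Term;
> >> import org.apache.lucene.queryParser.ParseException;
> >> import org.apache.lucene.search.PhraseQuery;
> >> import org.apache.lucene.search.Query;
> >> import org.apache.lucene.search.payloads.AveragePayloadFunction;
> >> import org.apache.lucene.search.payloads.PayloadNearQuery;
> >> import org.apache.lucene.search.payloads.PayloadTermQuery;
> >> import org.apache.lucene.search.spans.SpanQuery;
> >>
> >> protected Query getFieldQuery(String field, String queryText, int slop)
> >>         throws ParseException {
> >>     Query orig = super.getFieldQuery(field, queryText, slop);
> >>
> >>     // Only phrase queries are rewritten; everything else passes through.
> >>     if (!(orig instanceof PhraseQuery)) {
> >>         return orig;
> >>     }
> >>
> >>     PhraseQuery pq = (PhraseQuery) orig;
> >>     Term[] terms = pq.getTerms();
> >>     SpanQuery[] clauses = new SpanQuery[terms.length];
> >>
> >>     // Wrap each phrase term in a payload-aware term query.
> >>     for (int i = 0; i < terms.length; i++) {
> >>         clauses[i] = new PayloadTermQuery(terms[i], new AveragePayloadFunction());
> >>     }
> >>     // inOrder = true keeps the original phrase order.
> >>     return new PayloadNearQuery(clauses, slop, true);
> >> }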
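
The override above hands an AveragePayloadFunction to each PayloadTermQuery. As a rough, Lucene-free model of what "averaging" means there, the sketch below averages the per-occurrence payload scores of a term within one document. It is an illustration only, not the library's source: the class and method names are hypothetical, and returning 1.0 when no payload was seen is an assumption.

```java
public class AveragePayloadSketch {
    // Average of the payload scores collected for one term in one document.
    // Assumption: fall back to a neutral 1.0 when no payloads were seen.
    static float average(float[] payloadScores) {
        if (payloadScores.length == 0) {
            return 1.0f;
        }
        float sum = 0f;
        for (float score : payloadScores) {
            sum += score;
        }
        return sum / payloadScores.length;
    }

    public static void main(String[] args) {
        // A term occurring twice, with payload scores 20.0 and 10.0.
        System.out.println(average(new float[] {20.0f, 10.0f}));
    }
}
```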
> >>
> >>
> >> > Also, are there any performance considerations while using
> >> > a PayloadNearQuery instead of a PhraseQuery?
> >>
> >> I don't think there will be a significant performance difference.
> >>
> >>
> >>
> >>
> >
>
