lucene-solr-user mailing list archives

From Susmit <shukla.sus...@gmail.com>
Subject Re: [External] Re: per-field count of documents matched?
Date Thu, 13 Feb 2020 04:24:44 GMT
I used the JSON Facet API for a similar requirement. It can ignore filters from the main query
if needed and roll up the hit counts to any field.
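
A minimal sketch of what such a JSON Facet request might look like, assuming one query facet per field. The field names LastName and Street come from the thread below; the helper name, the optional filter tag, and the unescaped search term are illustrative assumptions, not something taken from the poster's setup.

```python
import json


def build_json_facet_request(term, fields, exclude_tag=None):
    """Build a Solr JSON request body with one query facet per field.

    Assumption: `term` is a plain word that needs no query escaping.
    """
    facets = {}
    for field in fields:
        facet = {"type": "query", "q": f"{field}:{term}"}
        if exclude_tag is not None:
            # Ignore main-query filters tagged with this name for this facet.
            facet["domain"] = {"excludeTags": exclude_tag}
        facets[field] = facet
    # limit=0: only the facet counts are wanted, not the documents.
    return {"query": term, "limit": 0, "facet": facets}


if __name__ == "__main__":
    body = build_json_facet_request("stone", ["LastName", "Street"])
    print(json.dumps(body, indent=2))
```

The resulting body would be POSTed to the collection's query handler; each query facet should come back with its own count, so the per-field numbers arrive in a single request.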


> On Feb 11, 2020, at 6:19 PM, Fischer, Stephen <sfischer@pennmedicine.upenn.edu> wrote:
> 
> Thanks very much! By the way, we are using eDisMax, and the queries our UI supports
> don't include fancy Booleans, so your ideas just might work.
> 
> Thanks again,
> Steve
> 
> -----Original Message-----
> From: Erick Erickson <erickerickson@gmail.com> 
> Sent: Tuesday, February 11, 2020 7:16 PM
> To: solr-user@lucene.apache.org
> Subject: [External] Re: per-field count of documents matched?
> 
> Hmmm, you could do a facet query (or a series of them):
> facet.query=LastName:stone&facet.query=Street:stone etc.
> That’d automatically only tally for the docs that match.
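
The facet.query suggestion above can be sketched as a small request builder. This is only an illustration under assumptions: the helper names are hypothetical, rows=0 is added because only counts are wanted, and the search term is assumed to need no escaping.

```python
from urllib.parse import urlencode


def facet_query_params(term, fields):
    """Encode one facet.query parameter per field, repeated in the query string."""
    params = [("q", term), ("rows", "0"), ("facet", "true")]
    params += [("facet.query", f"{field}:{term}") for field in fields]
    return urlencode(params)


def counts_by_field(response, term, fields):
    """Pull per-field counts out of a parsed Solr JSON response.

    Solr keys each facet.query result by the query string it was given.
    """
    facet_queries = response["facet_counts"]["facet_queries"]
    return {field: facet_queries[f"{field}:{term}"] for field in fields}


if __name__ == "__main__":
    print(facet_query_params("stone", ["LastName", "Street"]))
```

Because each facet.query is evaluated against the documents matched by the main query, the counts are already restricted to the result set.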
> 
> You could also consider a custom search component. For the exact case you describe,
> it’s actually fairly simple. The postings list has, for each term, the list of docs
> that contain it (internal Lucene doc ID). So I might have for field LastName:
> stone -> 1,73,100…
> 
> for field Street:
> stone -> 264,933…
> 
> So it’s simply a matter of going down each term’s list of docs, in each field, and
> counting the docs that the overall query matches.
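
The tallying idea above can be illustrated with a toy model in plain Python. This is not Lucene code and not a search component: the postings are ordinary lists holding only the doc IDs listed in the message (the elided remainder is left out), and the set of matching docs is a made-up example.

```python
def per_field_counts(postings, term, matched_docs):
    """For each field, count docs in the term's postings list that the query matched.

    postings: {field: {term: [doc_id, ...]}}  (toy stand-in for an inverted index)
    matched_docs: set of doc IDs matched by the overall query
    """
    counts = {}
    for field, terms in postings.items():
        doc_ids = terms.get(term, [])
        counts[field] = sum(1 for doc_id in doc_ids if doc_id in matched_docs)
    return counts


if __name__ == "__main__":
    postings = {
        "LastName": {"stone": [1, 73, 100]},
        "Street": {"stone": [264, 933]},
    }
    matched = {1, 100, 264, 500}  # hypothetical result set of the overall query
    print(per_field_counts(postings, "stone", matched))
```

The work is proportional to the length of the postings lists for the query terms, which is why this can be cheaper than highlighting every returned document.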
> 
> However… I’m not sure you’d get what you want in either case. Consider a query
> (A AND B) OR (C AND D). And let’s say doc1 contains A in LastName, and C,D in Street.
> Should A be counted in the LastName tally for this doc?
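
The ambiguity in that example can be made concrete with a few lines of Python. This is a toy model of the single document described above, with hypothetical helper names; it only shows that a naive per-term tally credits LastName even though the clause containing A did not match.

```python
# doc1 from the example: A appears in LastName; C and D appear in Street.
doc1 = {"LastName": {"A"}, "Street": {"C", "D"}}


def matches(doc):
    """Evaluate (A AND B) OR (C AND D) over all terms in the doc, ignoring fields."""
    terms = set().union(*doc.values())
    return {"A", "B"} <= terms or {"C", "D"} <= terms


def naive_field_tally(doc, query_terms):
    """Mark a field as matched if it contains any query term at all."""
    return {field: bool(terms & query_terms) for field, terms in doc.items()}


if __name__ == "__main__":
    print(matches(doc1))
    print(naive_field_tally(doc1, {"A", "B", "C", "D"}))
```

Here doc1 matches only through (C AND D), yet the naive tally marks both LastName and Street. Whether LastName should count depends on what the application means by a field matching.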
> 
> I suppose you could put the full query in the facet.query above. I’m still not sure
> it’s what you need, since I’m not sure what "per-field count of documents that match"
> means in your application…
> 
> Best,
> Erick
> 
>> On Feb 11, 2020, at 6:15 PM, Fischer, Stephen <sfischer@pennmedicine.upenn.edu> wrote:
>> 
>> Hi wise Solr experts,
>> 
>> For our scientific use-case we want to show users a per-field count of documents
>> that match that field.
>> 
>> We'd like to do this efficiently because we might return up to a million documents.
>> 
>> For example, if we had documents describing People, and a query of,
>> say, "Stone", we might want to show
>> 
>> Fields matched:
>> Last name:  145
>> Street: 431
>> Favorite rock band:  13
>> Home exterior: 2340
>> 
>> Is there an efficient way to do this?
>> 
>> So far, we're trying to leverage highlighting.   But it seems very slow.
>> 
>> Thanks
> 
