lucene-java-user mailing list archives

From John Byrne <john.by...@propylon.com>
Subject Re: number of hits per document
Date Tue, 10 Jun 2008 13:28:52 GMT
Hi,
 
I could do it that way, but counting the spans per document is specific 
to SpanQuery. I would still have to count hits for TermQuery instances 
separately. I was looking for a generic way to count hits within a 
document for any instance of Query.
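
For reference, the span-counting approach (tally each span against whatever doc() currently reports) might look roughly like the sketch below. SpanCursor and fromDocIds are hypothetical stand-ins, loosely modelled on the next()/doc() style of a span enumeration so that the example runs without Lucene on the classpath; they are not Lucene classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SpanCountSketch {
    /** Hypothetical stand-in for a span enumeration: next() advances, doc() reports the current document. */
    interface SpanCursor {
        boolean next();
        int doc();
    }

    /** Walks the cursor once and tallies how many spans fall in each document. */
    static Map<Integer, Integer> countPerDoc(SpanCursor spans) {
        Map<Integer, Integer> counts = new LinkedHashMap<Integer, Integer>();
        while (spans.next()) {
            int doc = spans.doc();
            Integer seen = counts.get(doc);
            counts.put(doc, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    /** Toy cursor over a fixed array of document ids, one entry per span (illustration only). */
    static SpanCursor fromDocIds(final int[] docIds) {
        return new SpanCursor() {
            private int pos = -1;
            public boolean next() { return ++pos < docIds.length; }
            public int doc() { return docIds[pos]; }
        };
    }
}
```

This only covers span-style queries, which is exactly the limitation described above.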

To put it another way, the ability to find the Term frequency in a 
single document seems incomplete, since a Term does not equate to a hit. 
For instance, sticking with my previous example, if my document 
contained a thousand occurrences of "cats" but only one of them is near 
"dogs", then the frequency of the Term "cats" in that document is 
irrelevant to me.
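
To make the difference concrete, here is a small plain-Java sketch run against the example sentence from my earlier mail. It does not use Lucene at all: NearHitSketch, termFrequency and nearHits are hypothetical names, and "near" is simplified to mean that the second word follows the first with at most `slop` tokens in between, which is only an approximation of how a SpanNearQuery matches.

```java
import java.util.Arrays;
import java.util.List;

class NearHitSketch {
    /** Plain term frequency: how many tokens equal the term. */
    static int termFrequency(List<String> tokens, String term) {
        int count = 0;
        for (String t : tokens) {
            if (t.equals(term)) count++;
        }
        return count;
    }

    /** Counts occurrences of `first` that are followed by `second` with at most `slop` tokens in between. */
    static int nearHits(List<String> tokens, String first, String second, int slop) {
        int hits = 0;
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) continue;
            // Only the next slop + 1 positions can hold a close-enough partner.
            int limit = Math.min(tokens.size() - 1, i + slop + 1);
            for (int j = i + 1; j <= limit; j++) {
                if (tokens.get(j).equals(second)) {
                    hits++;
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "here is cats near dogs and here is cats a long long way from dogs".split(" "));
        System.out.println(termFrequency(tokens, "cats"));       // 2
        System.out.println(nearHits(tokens, "cats", "dogs", 1)); // 1
    }
}
```

The term frequency of "cats" is 2, but only one of those occurrences is a hit for the proximity query, and that second number is the one I am after.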

In general, my queries will consist of a BooleanQuery containing any 
number of sub-queries of any implementation. What I actually need to 
know is how many hits there are for that BooleanQuery in each 
document. Maybe I will expand the BooleanQuery into all its sub-queries 
recursively and then handle them by type, counting spans per document 
for each SpanQuery and using the Term frequency for each TermQuery. I was 
just hoping there would be an existing (and fast) way to do this.
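
A rough sketch of that fallback, in plain Java so that it runs on its own. ToyQuery, ToyTerm, ToySpanNear and ToyBoolean are hypothetical toy types standing in for the real query classes, and two things are assumptions on my part: the hit count of a boolean query is taken to be the sum of its clauses' counts (required and prohibited clauses are ignored), and "near" again means at most `slop` tokens in between.

```java
import java.util.Arrays;
import java.util.List;

class QueryHitSketch {
    /** Hypothetical toy query types; none of these are Lucene classes. */
    interface ToyQuery {}

    static final class ToyTerm implements ToyQuery {
        final String term;
        ToyTerm(String term) { this.term = term; }
    }

    static final class ToySpanNear implements ToyQuery {
        final String first;
        final String second;
        final int slop;
        ToySpanNear(String first, String second, int slop) {
            this.first = first;
            this.second = second;
            this.slop = slop;
        }
    }

    static final class ToyBoolean implements ToyQuery {
        final List<ToyQuery> clauses;
        ToyBoolean(ToyQuery... clauses) { this.clauses = Arrays.asList(clauses); }
    }

    /** Expands boolean queries recursively and dispatches on the type of each leaf. */
    static int countHits(ToyQuery q, List<String> tokens) {
        if (q instanceof ToyBoolean) {
            int total = 0;
            for (ToyQuery sub : ((ToyBoolean) q).clauses) {
                total += countHits(sub, tokens); // assumption: hits simply add up
            }
            return total;
        }
        if (q instanceof ToyTerm) {
            // Term leaf: the hit count is the term frequency in this document.
            int count = 0;
            for (String t : tokens) {
                if (t.equals(((ToyTerm) q).term)) count++;
            }
            return count;
        }
        if (q instanceof ToySpanNear) {
            // Span leaf: count `first` followed by `second` within slop + 1 positions.
            ToySpanNear near = (ToySpanNear) q;
            int hits = 0;
            for (int i = 0; i < tokens.size(); i++) {
                if (!tokens.get(i).equals(near.first)) continue;
                int limit = Math.min(tokens.size() - 1, i + near.slop + 1);
                for (int j = i + 1; j <= limit; j++) {
                    if (tokens.get(j).equals(near.second)) {
                        hits++;
                        break;
                    }
                }
            }
            return hits;
        }
        throw new IllegalArgumentException("unsupported query type: " + q.getClass());
    }
}
```

The type dispatch is the part I would rather avoid, since every new Query implementation needs its own branch.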

Thanks,
John

Grant Ingersoll wrote:
> A SpanQuery is just a Query, so the traditional way of Querying still 
> applies, i.e. you get back a list of matching documents.  Beyond that, 
> if you just want to operate on the spans, just keep track of how often 
> the doc() method changes.
>
> HTH,
> Grant
> On Jun 9, 2008, at 11:21 AM, John Byrne wrote:
>
>> Hi,
>>
>> Is there an easy way to find out the number of hits per document for 
>> a Query, rather than just for a Term?
>>
>> Let's say, for example, I have a document like this:
>>
>> "here is cats near dogs and here is cats a long long way from dogs"
>>
>> and I use a SpanNearQuery to find "cats" near "dogs" with a slop of 1 
>> - I need to be able to find out that there was 1 hit, even though 
>> there are 2 occurrences of "cats" and 2 of "dogs" - there is still 
>> only 1 hit that matches my Query.
>>
>> Is this possible?
>>
>> Thanks,
>> JB.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>

