lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Spencer Tickner" <spencertick...@gmail.com>
Subject Re: number of hits per document
Date Tue, 10 Jun 2008 20:42:25 GMT
Hi John,

Sorry I don't have a solution for you but I'm trying to do the same
thing. I would love to hear from you if you have any success with
this.

Cheers,

Spencer
spencertickner@gmail.com

On Tue, Jun 10, 2008 at 6:28 AM, John Byrne <john.byrne@propylon.com> wrote:
> Hi,
>
> I could do it that way, but couting the spans per document is specific to
> SpanQuerys. I would still have to count hits for TermQuerys separately. I
> was looking for a generic way to count hits for any instance of Query within
> a document.
>
> To put it another way, the ability to find the Term frequency in a single
> document seems incomplete, since a Term does not equate to a hit. For
> instance, sticking with my previous example, if my document contained a
> thousand occurrences of "cats" but only one of them is near "dogs", then the
> frequency of the Term "cats" in that document is irrelevant to me.
>
> In general, my queries will consist of a BooleanQuery containing any number
> of sub-queries of any implementation - what I actually need to know is how
> many hits there are for that BooleanQuery query in each document. Maybe I
> will expand the BooleanQuery into all it's sub-queries recursively, and then
> handle them by type - counting spans per document for SpanQuerys and using
> the Term frequency for TermQuerys. I was just hoping there would be an
> existing (and fast)  way to do this.
>
> Thanks,
> John
>
> Grant Ingersoll wrote:
>>
>> A SpanQuery is just a Query, so the traditional way of Querying still
>> applies, i.e. you get back a list of matching documents.  Beyond that, if
>> you just want to operate on the spans, just keep track of how often the
>> doc() method changes.
>>
>> HTH,
>> Grant
>> On Jun 9, 2008, at 11:21 AM, John Byrne wrote:
>>
>>> Hi,
>>>
>>> Is there an easy way to find out the number of hits per document for a
>>> Query, rather than just for a Term?
>>>
>>> Let's say, for example, I have a document like this:
>>>
>>> "here is cats near dogs and here is cats a long long way from dogs"
>>>
>>> and I use a SpanNearQuery to find "cats" near "dogs" with a slop of 1 - I
>>> need to be able to find out that there was 1 hit, even though there are 2
>>> occurrences of "cats" and 2 of "dogs" - there is still only 1 hit that
>>> matches my Query.
>>>
>>> Is this possible?
>>>
>>> Thanks,
>>> JB.
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>> --------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com
>>
>> Lucene Helpful Hints:
>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> http://wiki.apache.org/lucene-java/LuceneFAQ
>>
>>
>>
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message