lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Word co-occurrences counts
Date Thu, 23 Dec 2004 08:35:01 GMT

On Dec 23, 2004, at 1:50 AM, <Andrew.Cunningham@csiro.au> wrote:
> 2.	To be able to return the number of word co-occurrences within
> the document set (ie. How many times does "computer" appear within 50
> words of  "dog")
>
>
>
> Is the second point possible?

SpanNearQuery is your friend!  Like Paul said, this is not currently 
supported by QueryParser, however it is easy to do with the API.

Here's an example with a SpanOrQuery (a SpanNearQuery works 
identically) from the Lucene in Action code SpanQueryTest.java.  Two 
documents are indexed:

         "the quick brown fox jumps over the lazy dog"

         "the quick red fox jumps over the sleepy cat"

This SpanOrQuery is formed (omitting some code details):

     SpanOrQuery or = new SpanOrQuery(new SpanQuery[]{quick, fox});

And the spans are displayed:

spanOr([f:quick, f:fox]):
    the <quick> brown fox jumps over the lazy dog (0.37158427)
    the quick brown <fox> jumps over the lazy dog (0.37158427)
    the <quick> red fox jumps over the sleepy cat (0.37158427)
    the quick red <fox> jumps over the sleepy cat (0.37158427)

Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message