lucene-java-user mailing list archives

From Sven
Subject How to get the terms within 5 words of another term?
Date Wed, 12 Nov 2008 19:08:04 GMT
Hi everyone,

I have a term "foo", and I want to count the occurrences of every term 
that appears within 5 words of "foo", across all documents that 
contain "foo". For simplicity's sake, this is only for a single field. 
So if I have 3 documents (each with a single field) that look like this:

Once upon a time, foo lived far, far away in a magical kingdom.

"The Life and Time of the Hero Called Foo" is, by far, the best novel 
about spam I have ever read.

I theorize that over time, foo will gradually move far away from bar.

I would like to generate a list of terms and hit counts based on their 
proximity to "foo" in all the documents. So I'll end up with something 
like this:

far : 4
time : 3
away : 2
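
The counting rule described above can be sketched as a brute-force pass 
over the raw text in plain Java. This is only an illustration of the 
intended result, not a Lucene-based solution: the class name 
ProximityCounts, its methods, and the tokenization rule (lowercase, split 
on anything that is not a letter or digit) are assumptions made for the 
example. It also assumes that "foo" itself is not counted and that each 
word position is counted at most once per document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ProximityCounts {

    // The three sample documents from the message.
    static final List<String> DOCS = Arrays.asList(
        "Once upon a time, foo lived far, far away in a magical kingdom.",
        "\"The Life and Time of the Hero Called Foo\" is, by far, the best novel "
            + "about spam I have ever read.",
        "I theorize that over time, foo will gradually move far away from bar.");

    // Assumed tokenization: lowercase, split on non-alphanumeric characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Counts every term occurrence lying within `window` positions of `target`.
    static Map<String, Integer> count(List<String> docs, String target, int window) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            List<String> tokens = tokenize(doc);
            // Mark positions near any occurrence of the target, so that
            // overlapping windows do not count the same position twice.
            boolean[] near = new boolean[tokens.size()];
            for (int i = 0; i < tokens.size(); i++) {
                if (!tokens.get(i).equals(target)) {
                    continue;
                }
                int from = Math.max(0, i - window);
                int to = Math.min(tokens.size() - 1, i + window);
                for (int j = from; j <= to; j++) {
                    near[j] = true;
                }
            }
            for (int j = 0; j < tokens.size(); j++) {
                String term = tokens.get(j);
                if (near[j] && !term.equals(target)) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(DOCS, "foo", 5);
        for (String term : Arrays.asList("far", "time", "away")) {
            System.out.println(term + " : " + counts.get(term));
        }
    }
}
```

Run against the three sample documents with a window of 5, this prints 
the counts listed above (far : 4, time : 3, away : 2). The map also holds 
counts for every other nearby term, such as "the" and "lived".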

Any help would be greatly appreciated.

Thanks much!
