lucene-java-user mailing list archives

From dipesh <dipshres...@gmail.com>
Subject Re: How to get the terms within 5 words of another term?
Date Thu, 13 Nov 2008 02:02:34 GMT
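You might want to look at TermPositionVector. For it to work, I think the
term vectors have to be stored with positions, i.e. the field indexed with
Field.TermVector.WITH_POSITIONS; plain TermVector.YES stores only the terms
and their frequencies.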
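
Roughly, the counting step could look like the sketch below. It is plain
Java and does not call Lucene at all: the tokenize() helper is a
hypothetical stand-in for whatever analyzer the field uses, and the token
index stands in for the positions you would read from a
TermPositionVector (getTermPositions). The class and method names are made
up for illustration. Each occurrence of a neighbouring term is counted
once if any "foo" in the same document is within the window.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ProximityCounts {

    // Hypothetical stand-in for an analyzer: lower-case the text and split
    // on anything that is not a letter or a digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Counts, over all documents, the occurrences of every other term that
    // lie within `window` positions of some occurrence of `target`.
    static Map<String, Integer> countNear(String target, int window, List<String> docs) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            List<String> tokens = tokenize(doc);

            // Positions of the target term in this document.
            List<Integer> hits = new ArrayList<>();
            for (int i = 0; i < tokens.size(); i++) {
                if (tokens.get(i).equals(target)) {
                    hits.add(i);
                }
            }

            for (int i = 0; i < tokens.size(); i++) {
                String term = tokens.get(i);
                if (term.equals(target)) {
                    continue; // do not count the target itself
                }
                for (int p : hits) {
                    if (Math.abs(i - p) <= window) {
                        counts.merge(term, 1, Integer::sum);
                        break; // count this occurrence only once
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "Once upon a time, foo lived far, far away in a magical kingdom.",
            "\"The Life and Time of the Hero Called Foo\" is, by far, the best novel"
                + " about spam I have ever read.",
            "I theorize that over time, foo will gradually move far away from bar.");

        Map<String, Integer> counts = countNear("foo", 5, docs);
        for (String term : Arrays.asList("far", "time", "away")) {
            System.out.println(term + " : " + counts.get(term));
        }
    }
}
```

On the three example documents from the question this gives far : 4,
time : 3 and away : 2. Against a real index you would replace the
tokenizing with the per-document term vector and keep the same window
check on the stored positions.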

regards,
Dipesh



On Thu, Nov 13, 2008 at 4:26 AM, Sven <sven.carlberg@gmail.com> wrote:

> Hi everyone,
>
> I have a term "foo" and I want to count all the occurrences of all the
> terms that are within 5 words of "foo" in all the documents which
> contain "foo".  For simplicity's sake, this is only for a single field.
> So if I have 3 documents (each with a single field) that look like this:
>
> Once upon a time, foo lived far, far away in a magical kingdom.
>
> "The Life and Time of the Hero Called Foo" is, by far, the best novel
> about spam I have ever read.
>
> I theorize that over time, foo will gradually move far away from bar.
>
> I would like to generate a list of terms and hits based on their
> proximity to "foo" in all the documents.  So I'll end up with something
> like:
>
> far : 4
> time : 3
> away : 2
>
> Any help would be greatly appreciated.
>
> Thanks much!
> -Sven


-- 
----------------------------------------
"Help Ever Hurt Never"- Baba
