lucene-general mailing list archives

From "Madhu Sasidhar, MD" <madhusasid...@gmail.com>
Subject Re: N-gram
Date Mon, 18 Jul 2005 20:56:25 GMT
Rajesh
I am not sure what your eventual goal is, but it looks like you are using 
Lucene in some sort of Natural Language Processing environment. I am doing 
something similar with dotLucene. SpanQuery is possibly what you want: it 
lets you specify the span, and hence 1-gram, 2-gram, etc. Email me if 
you want samples (C#).
Madhu
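
To make the span idea concrete, here is a minimal sketch in plain Python. It is not Lucene or dotLucene code and it uses no library API. The function name ordered_span_matches and its slop parameter are hypothetical, chosen only to illustrate an in-order proximity match where a slop of 0 behaves like a contiguous n-gram.

```python
from itertools import product

def ordered_span_matches(tokens, terms, slop=0):
    """Return (start, end) token positions where `terms` occur in order,
    with at most `slop` extra tokens inside the span.

    Hypothetical helper for illustration only; not a Lucene API.
    """
    # Collect the positions at which each term occurs in the token stream.
    positions = [[i for i, tok in enumerate(tokens) if tok == term]
                 for term in terms]
    matches = []
    for combo in product(*positions):
        # Keep only combinations whose positions are strictly increasing.
        if all(a < b for a, b in zip(combo, combo[1:])):
            width = combo[-1] - combo[0] + 1
            # Extra tokens inside the span must not exceed the slop.
            if width - len(terms) <= slop:
                matches.append((combo[0], combo[-1]))
    return matches

tokens = "the quick brown fox jumps over the lazy dog".split()
print(ordered_span_matches(tokens, ["quick", "brown"]))        # adjacent pair
print(ordered_span_matches(tokens, ["quick", "fox"], slop=1))  # one token between
```

With slop=0 the two terms must be adjacent, which is the 2-gram case. A larger slop widens the span without re-indexing anything.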


On 7/18/05, Rajesh Munavalli <rajeshm@dessci.com> wrote:
> 
> At what point do I add n-grams? Does the order in which I add n-grams
> affect exact phrase queries later? My questions are
> 
> (1) Should I add all the 1-grams, followed by 2-grams, followed by
> 3-grams, etc., sentence by sentence, OR
> (2) Add all the 1-grams of the entire document first, before starting
> 2-grams for the entire document?
> 
> What is the generally accepted way of adding the n-grams of a document?
> 
> thanks,
> 
> Rajesh
> 
>
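
The two orderings in the question can be sketched as follows. This is plain Python, not Lucene code, and every function name in it is hypothetical. It assumes word-level n-grams, a naive sentence splitter, and n-grams that do not cross sentence boundaries. None of these assumptions comes from the thread.

```python
import re

def word_ngrams(tokens, n):
    """All contiguous n-token sequences, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def split_sentences(text):
    """Naive splitter (assumption): break on '.', '!' or '?', then on whitespace."""
    return [s.split() for s in re.split(r"[.!?]+", text) if s.strip()]

def ngrams_sentence_by_sentence(text, max_n):
    """Option (1): for each sentence, emit 1-grams, then 2-grams, and so on."""
    out = []
    for sentence in split_sentences(text):
        for n in range(1, max_n + 1):
            out.extend(word_ngrams(sentence, n))
    return out

def ngrams_by_order(text, max_n):
    """Option (2): emit every 1-gram in the document, then every 2-gram, and so on."""
    sentences = split_sentences(text)
    out = []
    for n in range(1, max_n + 1):
        for sentence in sentences:
            out.extend(word_ngrams(sentence, n))
    return out

text = "Lucene indexes text. Queries find text."
option_1 = ngrams_sentence_by_sentence(text, 2)
option_2 = ngrams_by_order(text, 2)
print(option_1)
print(option_2)
print(sorted(option_1) == sorted(option_2))  # same n-grams, different order
```

The sketch only shows that both options yield the same collection of n-grams in a different sequence. Whether that sequence matters for exact phrase queries depends on how term positions are recorded at indexing time, which this sketch does not model.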
