lucene-java-user mailing list archives

From Andy Roberts
Subject Re: n-gram indexing
Date Mon, 18 Jul 2005 22:55:46 GMT
On Monday 18 Jul 2005 21:27, Rajesh Munavalli wrote:
> At what point do I add n-grams? Does the order in which I add n-grams
> affect exact phrase queries later? My questions are
> (1) Should I add all the 1-grams followed by 2-grams followed by
> 3-grams..etc sentence by sentence OR
> (2) Add all the 1 grams of entire document first before starting 2-grams
> for the entire document?
> What is the general accepted notion of adding n-grams of a document?
> thanks,
> Rajesh

I can't see any real advantage in storing n-grams explicitly. Just index the
document and use phrase queries. Order is significant with phrase queries, if
I recall correctly. You can use SpanNearQuery to look for unordered n-grams,
although I don't know why you would want to!
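
To illustrate why explicit n-grams are usually unnecessary, here is a minimal sketch of a toy positional index. It is not Lucene's API: the class PositionalIndexSketch and its methods are hypothetical names made up for this example, and the tokenizer is a naive lowercase split. The point is only that once token positions are recorded, an ordered phrase match (the PhraseQuery case) and an unordered match within a window (roughly the SpanNearQuery case) can both be answered at query time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index for one document; NOT part of Lucene.
class PositionalIndexSketch {
    // All tokens in document order.
    private final List<String> tokens = new ArrayList<>();
    // term -> positions at which the term occurs.
    private final Map<String, List<Integer>> positions = new HashMap<>();

    PositionalIndexSketch(String text) {
        // Naive tokenizer (an assumption): lowercase, split on non-word chars.
        for (String t : text.toLowerCase().split("\\W+")) {
            if (t.isEmpty()) continue;
            positions.computeIfAbsent(t, k -> new ArrayList<>()).add(tokens.size());
            tokens.add(t);
        }
    }

    // Ordered match: the terms must occur at consecutive positions, in order.
    boolean matchesPhrase(String... terms) {
        if (terms.length == 0) return false;
        List<Integer> starts =
            positions.getOrDefault(terms[0].toLowerCase(), Collections.<Integer>emptyList());
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                List<Integer> p =
                    positions.getOrDefault(terms[i].toLowerCase(), Collections.<Integer>emptyList());
                ok = p.contains(start + i);
            }
            if (ok) return true;
        }
        return false;
    }

    // Unordered match: the terms must fill some window of adjacent
    // positions, in any order.
    boolean matchesUnordered(String... terms) {
        if (terms.length == 0 || terms.length > tokens.size()) return false;
        String[] wanted = new String[terms.length];
        for (int i = 0; i < terms.length; i++) wanted[i] = terms[i].toLowerCase();
        Arrays.sort(wanted);
        for (int w = 0; w + wanted.length <= tokens.size(); w++) {
            String[] window = tokens.subList(w, w + wanted.length).toArray(new String[0]);
            Arrays.sort(window);
            if (Arrays.equals(window, wanted)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        PositionalIndexSketch idx = new PositionalIndexSketch("The quick brown fox jumps");
        System.out.println(idx.matchesPhrase("quick", "brown"));    // true
        System.out.println(idx.matchesPhrase("brown", "quick"));    // false
        System.out.println(idx.matchesUnordered("brown", "quick")); // true
    }
}
```

The document is indexed once, token by token, and no 2-grams or 3-grams are ever stored; the order question from the original post only arises if n-grams are added as separate terms.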

Perhaps if you explain a little more about what you are trying to achieve in
general, we can confirm that you don't need to mess with explicit indexing
of n-grams.

