lucene-dev mailing list archives

From Mathieu Lecarme <math...@garambrogne.net>
Subject handling token created/deleted events in an Index
Date Mon, 16 Jun 2008 14:55:31 GMT
With LUCENE-1297, the SpellChecker will be able to choose how to
estimate the distance between two words.
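
To illustrate what a pluggable distance could look like, here is a
minimal sketch in plain Java. The names WordDistance and
LevenshteinSimilarity are hypothetical and are not taken from the
LUCENE-1297 patch; the edit distance is just one possible strategy,
normalized to a score between 0 and 1.

```java
// Hypothetical strategy interface (an assumption for illustration,
// not the API introduced by LUCENE-1297).
interface WordDistance {
    /** Returns a similarity in [0, 1]; 1 means the words are identical. */
    float similarity(String a, String b);
}

// One possible strategy: Levenshtein edit distance, normalized by the
// length of the longer word.
final class LevenshteinSimilarity implements WordDistance {
    @Override
    public float similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0f; // two empty words are identical
        }
        return 1.0f - (float) editDistance(a, b) / maxLen;
    }

    // Classic dynamic-programming edit distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

A caller would hold a WordDistance reference and swap in another
implementation (n-gram overlap, a phonetic comparison, and so on)
without touching the rest of the suggestion code.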

Here are some other possible enhancements:
  * The ability to keep the main Index and the SpellChecker Index
synchronized. Handling token creation is easy: a simple TokenFilter
can do the work. Token deletion is a bit harder. Lazy deletion can be
used if the token's popularity is checked in the main Index each time.
That is a pull strategy; a push from the Directory should be lighter.
  * Choosing the similarity strategy. For now, it is only an n-gram
computation. Homophony could be nice, for example.
  * The Spell Index can be used for dynamic similarity without
disturbing the main Index. For example, Snowball is nice for grouping
words by their roots, but it disturbs the Index if you want to run a
"starts with" (prefix) query.
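
The push strategy from the first point can be sketched as follows, in
plain Java and under simplifying assumptions. TokenListener,
SpellDictionary and TermCounter are hypothetical names, not Lucene
classes: TermCounter is a toy stand-in for the main Index that counts
how many documents contain each term, and it notifies the listener only
when a term appears for the first time or disappears completely.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical callback for token created/deleted events.
interface TokenListener {
    void tokenCreated(String term);
    void tokenDeleted(String term);
}

// Toy stand-in for the SpellChecker Index: the set of known words.
final class SpellDictionary implements TokenListener {
    private final Set<String> words = new HashSet<>();

    @Override
    public void tokenCreated(String term) {
        words.add(term);
    }

    @Override
    public void tokenDeleted(String term) {
        words.remove(term);
    }

    boolean contains(String term) {
        return words.contains(term);
    }
}

// Toy stand-in for the main Index: per-term document counts.
final class TermCounter {
    private final Map<String, Integer> docFreq = new HashMap<>();
    private final TokenListener listener;

    TermCounter(TokenListener listener) {
        this.listener = listener;
    }

    void addDocument(Collection<String> terms) {
        // Count each distinct term once per document.
        for (String term : new HashSet<>(terms)) {
            int n = docFreq.merge(term, 1, Integer::sum);
            if (n == 1) {
                listener.tokenCreated(term); // first occurrence: push
            }
        }
    }

    void removeDocument(Collection<String> terms) {
        for (String term : new HashSet<>(terms)) {
            Integer n = docFreq.get(term);
            if (n == null) {
                continue; // unknown term, nothing to do
            }
            if (n == 1) {
                docFreq.remove(term);
                listener.tokenDeleted(term); // last occurrence gone: push
            } else {
                docFreq.put(term, n - 1);
            }
        }
    }
}
```

With this shape the spell index never has to query the main Index to
check whether a word is still in use, which is the cost of the pull
(lazy deletion) approach. How such events could actually be raised from
a Directory is left open here.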

Some time ago I suggested a patch, LUCENE-1190, but I guess it is too
monolithic. A more modular approach should be better.

Any comments or suggestions?

M.
