lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shalin Shekhar Mangar" <shalinman...@gmail.com>
Subject Re: How to search
Date Mon, 25 Aug 2008 14:43:00 GMT
On Mon, Aug 25, 2008 at 5:37 PM, Karl Wettin <karl.wettin@gmail.com> wrote:

>
> Is this the specific use case, that you want to handle composite words as
> in javaFieldAndClassNames? There is no native support for that in Lucene to
> my knowledge, but it should not be too hard to implement a TokenStream that
> tokenize such composite words in to single tokens. You probably want to keep
> the original token too though.
>
> Another alternative is creating an ngram index.
>
> Finally you might want to look at the org.apache.lucene.analysis.compound
> package in contrib/analyzers.
>
>
Solr has WordDelimiterFilter which splits on case transition (and many
more). It is exposed through WordDelimiterFilterFactory.

http://lucene.apache.org/solr/api/org/apache/solr/analysis/WordDelimiterFilterFactory.html
http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/analysis/WordDelimiterFilter.java?revision=684908&view=markup

-- 
Regards,
Shalin Shekhar Mangar.

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message