lucene-java-user mailing list archives

From Paul Libbrecht <>
Subject Re: Spliting of words
Date Tue, 13 Sep 2005 10:09:47 GMT

Analyzer is the magic word here.

Lucene's StandardAnalyzer has a whole grammar to split text into
tokens. There are many more analyzers, most of which are language
specific (e.g. based on the Snowball or Porter stemmers; see the
contribs or the javadoc of the core).
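
To give a feel for what an analyzer does, here is a minimal sketch in plain Java. It is not Lucene's code and does not use the Lucene API: the class name SimpleTokenizerSketch and the rule it applies (split wherever a character is not a letter or digit, then lowercase) are assumptions made purely for illustration. StandardAnalyzer's actual grammar is richer than this, so treat the sketch only as a rough picture of the idea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT how StandardAnalyzer is implemented.
class SimpleTokenizerSketch {

    // Splits text into lowercase tokens, breaking at every character
    // that is neither a letter nor a digit (spaces, punctuation, ...).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // Still inside a token: collect the lowercased character.
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Hit a separator: emit the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        // Emit the last token if the text did not end with a separator.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [hello, world, lucene, 1, 4]
        System.out.println(tokenize("Hello, World! Lucene-1.4"));
    }
}
```

Note that even this naive rule splits on more than spaces: punctuation and hyphens also end a token. A real analyzer makes such decisions through its grammar and may additionally drop stop words or apply stemming, which is why the choice of analyzer depends on the language.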

For which language do you wish to use it?


On 13 Sept. 05, at 11:45, Madhu Satyanarayana Panitini wrote:

> Hi all,
> I want to know how Lucene splits text before indexing. Does it
> split wherever there is a space between words, or is there some other
> pattern for splitting the words of a text document? In which class can
> I find the code that splits the words?
> Madhu
> Madhu Satyanarayana. Panitini
> PASS GCA Solution Centre Pvt Ltd.
> 601 Aditya Trade Centre, Ameerpet,
> Hyderabad, India.
