lucene-java-user mailing list archives

From Paul Libbrecht <p...@activemath.org>
Subject Re: Spliting of words
Date Tue, 13 Sep 2005 10:09:47 GMT
Madhu,

Analyzer is the magic word here.

Lucene's StandardAnalyzer has a whole grammar to split text into 
tokens. There are many more analyzers, most of which are language 
specific (e.g. based on the Snowball or Porter stemmers; see the 
contribs or the javadoc of the core).
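
To make the difference concrete, here is a rough sketch in plain Java that contrasts naive splitting on spaces with a very simple token rule. It is not Lucene's code and does not reproduce the StandardAnalyzer grammar: the class name TokenSketch, both method names, and the "run of letters or digits, lowercased" rule are assumptions made up for illustration. In Lucene itself the entry point is Analyzer.tokenStream(fieldName, reader).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenSketch {
    // Hypothetical token rule (an assumption, not Lucene's grammar):
    // a token is any run of Unicode letters or digits.
    private static final Pattern TOKEN = Pattern.compile("[\\p{L}\\p{N}]+");

    // Naive approach: split wherever there is whitespace.
    // Punctuation stays glued to the words.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    // Rule-based approach: keep only letter/digit runs and lowercase them,
    // roughly the kind of normalisation an analyzer performs.
    static List<String> simpleTokens(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            out.add(m.group().toLowerCase(Locale.ROOT));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Hello, World! Lucene-1.4 rocks.";
        System.out.println(whitespaceTokens(text)); // [Hello,, World!, Lucene-1.4, rocks.]
        System.out.println(simpleTokens(text));     // [hello, world, lucene, 1, 4, rocks]
    }
}
```

A real analyzer makes more careful decisions than this (for example about e-mail addresses, acronyms, stop words or stemming), which is why the choice of analyzer depends on the language and the kind of text.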

For which language do you wish to use it?

paul


On 13 Sept. 05, at 11:45, Madhu Satyanarayana Panitini wrote:

> Hi all,
>
> I want to know the pattern Lucene uses to split text before indexing.
> Does it split wherever there is a space between words, or is there some
> other pattern for splitting the words of a text document? In which
> program can I find the code that splits the words?
>
> Madhu
>
> Madhu Satyanarayana. Panitini
> PASS GCA Solution Centre Pvt Ltd.
> 601 Aditya Trade Centre, Ameerpet,
> Hyderabad, India.
> www.pass-consulting.com
>
>
>



