lucene-java-user mailing list archives

From Ziqi Zhang <ziqi.zh...@sheffield.ac.uk>
Subject tokenize into sentences/sentence splitter
Date Wed, 23 Sep 2015 15:18:24 GMT
Hi

I need a special kind of 'token', namely a whole sentence, so I need a
tokenizer that splits text into sentences.

Is there already an implementation of this, or of something similar?

If I have to implement it myself, I suppose I need to write a
subclass of Tokenizer. Having looked at a few existing implementations,
I do not find it obvious how to go about this. A few pointers would
be much appreciated.
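
To make the goal concrete, here is a minimal sketch of the splitting step
alone. It uses the JDK's java.text.BreakIterator rather than any Lucene API,
and it does not show how to wrap the logic in a Tokenizer subclass. The class
name SentenceSplitter and the method split are hypothetical names chosen for
illustration, and the choice of Locale.ENGLISH is an assumption.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene.
public class SentenceSplitter {

    // Splits text into sentences using the JDK's locale-aware sentence rules.
    public static List<String> split(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = it.first();
        // Each call to next() returns the offset of the next sentence boundary.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // Boundaries include trailing whitespace, so trim each span.
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "This is one sentence. Here is another! Is this a third?";
        for (String sentence : split(text, Locale.ENGLISH)) {
            System.out.println(sentence);
        }
    }
}
```

Each element of the returned list is what I would like to see emitted as a
single token, ideally with its start and end offsets preserved.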

Many thanks



