lucene-java-user mailing list archives

From Ziqi Zhang
Subject tokenize into sentences/sentence splitter
Date Wed, 23 Sep 2015 15:18:24 GMT

I need a special kind of 'token': a whole sentence. That means I need a
tokenizer that splits text into sentences.

Is there already an implementation of this, or something similar?

If I have to implement it myself, I suppose I need to write a subclass
of Tokenizer. Having looked at a few existing implementations, I do not
find it obvious how to do this. A few pointers would be highly
appreciated.
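
To make the goal concrete, here is a minimal sketch of the splitting step
alone, using the JDK's java.text.BreakIterator instead of a Lucene class.
The names SentenceSplitter, Span and split are hypothetical and chosen for
illustration only. Wiring the result into a Tokenizer subclass is not shown;
Lucene's analyzers-common module appears to build on the same idea in
SegmentingTokenizerBase, which may be a better starting point.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: splits text into sentences
// with the JDK's rule-based sentence BreakIterator.
final class SentenceSplitter {

    /** One sentence plus its character offsets in the original text. */
    static final class Span {
        public final int start;   // inclusive offset of the segment
        public final int end;     // exclusive offset, may cover trailing whitespace
        public final String text; // sentence text with surrounding whitespace trimmed

        Span(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }
    }

    static List<Span> split(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);

        List<Span> spans = new ArrayList<>();
        int start = it.first();
        // Walk consecutive boundaries; each [start, end) pair is one segment.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                spans.add(new Span(start, end, sentence));
            }
        }
        return spans;
    }
}
```

Calling SentenceSplitter.split(text, Locale.ENGLISH) returns one Span per
sentence. The offsets are kept because a tokenizer would normally need them
to report where each token sits in the input. The JDK rules know nothing
about abbreviations, so text such as "Dr. Smith" may be split too early;
a trained sentence detector would be the usual remedy if that matters.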

Many thanks
