lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Sentence detection/extraction as Tokenizer?
Date Fri, 27 Nov 2009 18:07:36 GMT
Hello,

The contrib/wordnet package contains an AnalyzerUtil class with a method that extracts sentences
from a text String.  It is very simplistic, so probably not very accurate, but I am wondering
whether it would *conceptually* make sense to have a Tokenizer that extracts sentences.  I suppose
that would mean each Token is a complete sentence.

Would you say it makes sense to implement sentence detection/extraction as a Tokenizer?
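
To make the idea concrete, here is a rough sketch of the core such a Tokenizer would need:
split the text into sentences and keep each one's start/end offsets, since that is the
information a Token would carry.  This is plain JDK code, not Lucene's API and not what
AnalyzerUtil does.  The SentenceSpans and Span names are made up for illustration, and I am
assuming java.text.BreakIterator is an acceptable stand-in for a real sentence detector.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: splits text into sentences with character offsets. */
public class SentenceSpans {

    /** One sentence plus its [start, end) offsets in the original text. */
    public static final class Span {
        public final String text;
        public final int start;
        public final int end;

        Span(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns one Span per sentence, with surrounding whitespace trimmed off. */
    public static List<Span> split(String input) {
        List<Span> spans = new ArrayList<>();
        // Assumption: English sentence rules are good enough for this sketch.
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(input);

        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            int s = start;
            int e = end;
            // BreakIterator keeps trailing whitespace with the sentence; trim it
            // so the offsets cover only the sentence itself.
            while (s < e && Character.isWhitespace(input.charAt(s))) {
                s++;
            }
            while (e > s && Character.isWhitespace(input.charAt(e - 1))) {
                e--;
            }
            if (s < e) {
                spans.add(new Span(input.substring(s, e), s, e));
            }
        }
        return spans;
    }
}
```

A real Tokenizer would read from a Reader and emit these spans one at a time as term text
plus offsets, instead of building a List up front.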

Thanks,
Otis
