opennlp-users mailing list archives

From "Jim - FooBar();" <jimpil1...@gmail.com>
Subject Re: Asian Sentence Detector Models
Date Wed, 21 Mar 2012 11:05:33 GMT
Basically, you need to know which punctuation marks indicate the end of a 
sentence, or find someone who does, and then use a regex to split the 
text at those marks. It's not going to be perfect: you may have to 
review the output by eye once or twice to make sure everything is OK 
before training. How well this works depends on the language and how 
ambiguous its punctuation is.

Jim
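
A minimal sketch of the regex approach described above, written in Python
for brevity. The terminator and closing-bracket sets are assumptions, not
a complete inventory for Chinese, Japanese or Korean, and the names
split_sentences, TERMINATORS and CLOSERS are made up for illustration.
The output is one sentence per line, which is the format the OpenNLP
sentence detector trainer expects.

```python
import re

# Assumed sentence-final marks: ideographic full stop, fullwidth full stop,
# fullwidth and ASCII exclamation/question marks. The ASCII "." is left out
# on purpose because it is ambiguous (decimals, abbreviations); add it for
# text, such as Korean, that ends sentences with it.
TERMINATORS = "。．！？!?"

# Assumed closing quotes/brackets that may directly follow a terminator.
CLOSERS = "」』）】”’\")"


def split_sentences(text, terminators=TERMINATORS, closers=CLOSERS):
    """Split text after each run of terminators plus any trailing closers."""
    boundary = re.compile(
        f"([{re.escape(terminators)}]+[{re.escape(closers)}]*)"
    )
    # Insert a newline after every boundary, then split on newlines.
    marked = boundary.sub(r"\1\n", text)
    return [line.strip() for line in marked.split("\n") if line.strip()]


def to_training_text(text):
    """Join the detected sentences one per line."""
    return "\n".join(split_sentences(text))


if __name__ == "__main__":
    sample = "今日は晴れです。明日は雨ですか？はい！"
    print(to_training_text(sample))
```

The split is only as good as the punctuation is unambiguous. For example,
a quoted sentence such as 彼は「行こう。」と言った。 is cut after 。」 even
though the outer sentence continues, which is the kind of case that still
needs a manual pass before training.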

On 20/03/12 18:38, Jairo Sarabia wrote:
> Hi all,
>
> I see there are no Sentence Detector models for Asian languages in the
> OpenNLP repository, and I need them.
> I have to train Sentence Detector models for Chinese, Japanese and Korean,
> but I don't know these languages.
> How could I get training data files for these languages?
>
> Thanks in advance!,
>
> Jairo Sarabia
>

