opennlp-issues mailing list archives

From "William Colen (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OPENNLP-479) Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator
Date Sun, 18 Mar 2012 15:30:41 GMT

    [ https://issues.apache.org/jira/browse/OPENNLP-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232286#comment-13232286 ]

William Colen commented on OPENNLP-479:
---------------------------------------

I changed the DefaultSDContextGenerator on the assumption that the correct form for
abbreviations includes the EOS character, as in "mr.". Please review.
                
> Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator
> ---------------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-479
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-479
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Sentence Detector
>    Affects Versions: tools-1.5.3
>            Reporter: William Colen
>            Assignee: William Colen
>             Fix For: tools-1.5.3
>
>
> The documentation is not clear about whether the entries in the abbreviation dictionary
> should include the EOS character, for example "mr" or "mr.". Also, part of the collector
> code expects the dictionary to include the EOS character, and other parts don't.
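
To illustrate the mismatch described above, here is a minimal Java sketch. It is not the actual DefaultSDContextGenerator code: it assumes the abbreviation dictionary is a plain set of lowercase strings and that lookups ignore case, and the names AbbreviationLookupSketch and isAbbreviation are hypothetical. The lookup key is the token ending at a candidate EOS character, with that character included, so an entry "mr." matches while an entry "mr" does not.

```java
import java.util.Locale;
import java.util.Set;

class AbbreviationLookupSketch {

    // Hypothetical helper, not part of the OpenNLP API.
    // eosIndex is the position of the candidate end-of-sentence character.
    // Assumption: dictionary entries are lowercase and include the trailing
    // EOS character, e.g. "mr.".
    public static boolean isAbbreviation(Set<String> abbreviations, String text, int eosIndex) {
        // Walk back to the start of the whitespace-delimited token.
        int start = eosIndex;
        while (start > 0 && !Character.isWhitespace(text.charAt(start - 1))) {
            start--;
        }
        // Keep the EOS character in the lookup key.
        String token = text.substring(start, eosIndex + 1).toLowerCase(Locale.ROOT);
        return abbreviations.contains(token);
    }

    public static void main(String[] args) {
        String text = "Mr. Smith arrived.";
        int eos = text.indexOf('.'); // first candidate EOS character, index 2

        // Dictionary entry written with the EOS character: found.
        System.out.println(isAbbreviation(Set.of("mr."), text, eos)); // prints "true"

        // Dictionary entry written without it: silently missed.
        System.out.println(isAbbreviation(Set.of("mr"), text, eos));  // prints "false"
    }
}
```

If one part of the collector built its key with the EOS character and another without it, the same dictionary would satisfy only one of them, which is why the entry form needs to be fixed and documented.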

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
