Date: Sun, 18 Mar 2012 15:30:41 +0000 (UTC)
From: "William Colen (Commented) (JIRA)"
To: opennlp-issues@incubator.apache.org
Subject: [jira] [Commented] (OPENNLP-479) Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator

[ https://issues.apache.org/jira/browse/OPENNLP-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232286#comment-13232286 ]

William Colen commented on OPENNLP-479:
---------------------------------------

I changed the DefaultSDContextGenerator, assuming that the correct form for abbreviation entries is "mr.". Please review.

> Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator
> ---------------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-479
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-479
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Sentence Detector
>    Affects Versions: tools-1.5.3
>            Reporter: William Colen
>            Assignee: William Colen
>             Fix For: tools-1.5.3
>
>
> The documentation is not clear about whether entries in the abbreviation dictionary should include the EOS character, for example "mr" or "mr.". Also, part of the collector code expects the dictionary entries to include the EOS character, while other parts do not.
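
To illustrate the ambiguity described in the issue, here is a minimal sketch of a lookup that tolerates both dictionary conventions ("mr" and "mr."). It is not the change committed for OPENNLP-479 and does not use any OpenNLP class: the class name `AbbreviationLookupSketch`, the methods `normalize` and `isAbbreviation`, and the lower-casing step are all assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of OpenNLP.
public class AbbreviationLookupSketch {

    // Bring a dictionary entry or a token into the "mr." form:
    // lower case, ending with the EOS character.
    // Lower-casing is an assumption of this sketch.
    public static String normalize(String text, char eos) {
        String lower = text.toLowerCase(Locale.ROOT);
        return lower.endsWith(String.valueOf(eos)) ? lower : lower + eos;
    }

    // True if the token matches an entry, regardless of whether the
    // entry was stored with or without the trailing EOS character.
    public static boolean isAbbreviation(Set<String> dictionary, String token, char eos) {
        String wanted = normalize(token, eos);
        for (String entry : dictionary) {
            if (normalize(entry, eos).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> withEos = new HashSet<>(Arrays.asList("mr.", "dr."));
        Set<String> withoutEos = new HashSet<>(Arrays.asList("mr", "dr"));

        System.out.println(isAbbreviation(withEos, "Mr.", '.'));    // true
        System.out.println(isAbbreviation(withoutEos, "Mr.", '.')); // true
        System.out.println(isAbbreviation(withEos, "Hello", '.'));  // false
    }
}
```

Normalizing both sides at lookup time is only one possible design. The comment above instead settles on a single expected entry form ("mr."), which keeps the lookup simple but requires the dictionary documentation to state that form explicitly.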