Date: Tue, 6 Mar 2012 22:13:58 +0000 (UTC)
From: "Thomas Neidhart (Updated) (JIRA)"
To: issues@commons.apache.org
Message-ID: <691541489.29955.1331072038294.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <979855655.81977.1327599038395.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (CODEC-132) BeiderMorseEncoder OOM issues

[
https://issues.apache.org/jira/browse/CODEC-132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart updated CODEC-132:
----------------------------------

    Attachment: CODEC-132.patch

Hi,

please find attached a patch for the outlined solution: adding a maximum phoneme parameter to the engine that limits the number of phonemes processed / returned.

For now, I have assumed a default of 20 if the user does not provide a value.
I would like to hear some feedback from the original author on that.

> BeiderMorseEncoder OOM issues
> -----------------------------
>
>                 Key: CODEC-132
>                 URL: https://issues.apache.org/jira/browse/CODEC-132
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.6
>            Reporter: Robert Muir
>         Attachments: CODEC-132.patch, CODEC-132_test.patch
>
>
> In Lucene/Solr, we integrated this encoder into the latest release.
> Our tests use a variety of random strings, and we have recent jenkins failures
> from some input streams (of length <= 10), using huge amounts of memory (e.g. 64MB),
> resulting in OOM.
> I've created a test case (length is 30 here) that will OOM with -Xmx256M.
> I haven't dug into this much as to what's causing it, but I suspect there might be a bug
> revolving around certain punctuation characters: we didn't see this happening until
> we beefed up our random string generation to start producing "html-like" strings.
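For readers without the attachment at hand, the idea in the update at the top of this message (capping the number of phonemes that are processed / returned) can be sketched roughly as below. This is not the contents of CODEC-132.patch and not the Commons Codec engine: the class `CappedPhonemeExpansion`, the method `expand`, and the constant name are hypothetical, and the only value taken from the message is the default of 20. The sketch assumes the memory growth comes from combining every partial phoneme with every alternative at each position, which is an assumption for illustration rather than a diagnosis.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not the Commons Codec PhoneticEngine. */
class CappedPhonemeExpansion {

    /** Default cap; the value 20 is the one proposed in the message. */
    static final int DEFAULT_MAX_PHONEMES = 20;

    /**
     * Combines each partial phoneme with each alternative for the next position,
     * but never keeps more than maxPhonemes candidates at any step.
     * Without the cap, the candidate set can grow multiplicatively per position.
     */
    static Set<String> expand(List<List<String>> alternativesPerPosition, int maxPhonemes) {
        if (maxPhonemes < 1) {
            throw new IllegalArgumentException("maxPhonemes must be at least 1");
        }
        Set<String> current = new LinkedHashSet<>();
        current.add("");
        for (List<String> alternatives : alternativesPerPosition) {
            Set<String> next = new LinkedHashSet<>();
            outer:
            for (String prefix : current) {
                for (String alternative : alternatives) {
                    if (next.size() >= maxPhonemes) {
                        break outer; // cap reached: drop the remaining combinations
                    }
                    next.add(prefix + alternative);
                }
            }
            current = next;
        }
        return current;
    }

    /** Convenience overload using the default cap. */
    static Set<String> expand(List<List<String>> alternativesPerPosition) {
        return expand(alternativesPerPosition, DEFAULT_MAX_PHONEMES);
    }
}
```

With ten positions of three alternatives each, the uncapped expansion would hold 3^10 candidates, while the capped one never holds more than the configured maximum. The trade-off is that some valid encodings are silently dropped, which is why feedback on the default value matters.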