Message-ID: <1839359503.1210745275700.JavaMail.jira@brutus>
Date: Tue, 13 May 2008 23:07:55 -0700 (PDT)
From: "Otis Gospodnetic (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-1227) NGramTokenizer to handle more than 1024 chars
In-Reply-To: <1237878015.1205396626189.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/LUCENE-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596637#action_12596637 ]

Otis Gospodnetic commented on LUCENE-1227:
------------------------------------------

Thanks for the test and for addressing this!

Could you add some examples for NO_OPTIMIZE and QUERY_OPTIMIZE? I can't tell from looking at the patch what those are about.

Also, note that the existing variables use camelCaseLikeThis. It would be good to stick to the same pattern (instead of bufflen, buffpos, etc.), as well as to the existing style (e.g. a space between "if" and the open paren, spaces around == and =, etc.).

I'll commit as soon as you make these changes, assuming you can make them.

Thank you.

> NGramTokenizer to handle more than 1024 chars
> ---------------------------------------------
>
>                 Key: LUCENE-1227
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1227
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/*
>            Reporter: Hiroaki Kawai
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: LUCENE-1227.patch, NGramTokenizer.patch, NGramTokenizer.patch
>
>
> The current NGramTokenizer can't handle a character stream longer than 1024 characters. This is too short for non-whitespace-separated languages.
> I created a patch for this issue.
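
The naming and spacing conventions requested in the comment can be illustrated with a short Java sketch. It is not taken from the attached patch or from Lucene's NGramTokenizer: the class NGramStyleSketch, the method charNGrams, and the variables readBuffer, bufferPos and charsRead are hypothetical names chosen for illustration. It also assumes a single fixed gram size, whereas the real tokenizer works with a range of sizes and its own token ordering. The sketch reads the input in 1024-character chunks and carries a sliding window across chunk boundaries, so the input length is not capped by the buffer size.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Lucene code and not the LUCENE-1227 patch.
final class NGramStyleSketch {

  // Returns every character n-gram of exactly gramSize chars, in input order.
  static List<String> charNGrams(Reader input, int gramSize) {
    if (gramSize < 1) {                      // space between "if" and "("
      throw new IllegalArgumentException("gramSize must be >= 1");
    }
    List<String> grams = new ArrayList<>();
    char[] readBuffer = new char[1024];      // camelCase, not readbuf or buffr
    StringBuilder window = new StringBuilder();
    try {
      int charsRead;                         // camelCase, not bufflen
      // Keep refilling the buffer until the reader is exhausted.
      while ((charsRead = input.read(readBuffer)) != -1) {
        for (int bufferPos = 0; bufferPos < charsRead; bufferPos++) {
          window.append(readBuffer[bufferPos]);
          if (window.length() > gramSize) {  // spaces around operators
            window.deleteCharAt(0);          // slide the window forward
          }
          if (window.length() == gramSize) {
            grams.add(window.toString());
          }
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return grams;
  }

  public static void main(String[] args) {
    // Prints [lu, uc, ce, en, ne]
    System.out.println(charNGrams(new StringReader("lucene"), 2));
  }
}
```

Because the window persists across calls to read, grams that straddle two buffer fills are still emitted, which is the behaviour the issue asks for on inputs longer than 1024 characters.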