Message-ID: <8132051.1185908693240.JavaMail.jira@brutus>
Date: Tue, 31 Jul 2007 12:04:53 -0700 (PDT)
From: "Doug Cutting (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-966) A faster JFlex-based replacement for StandardAnalyzer
In-Reply-To: <7545374.1185455200156.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516776 ]

Doug Cutting commented on LUCENE-966:
-------------------------------------

It is important that the same sequence of token text is produced, but I think we could live with different token types in some cases, if we must. Few applications depend on token types, no?

Provided the token text issues can be resolved, I'd like to see StandardTokenizer replaced with this. Performance is important, and ideally folks shouldn't have to change their applications to see performance improvements.

> A faster JFlex-based replacement for StandardAnalyzer
> -----------------------------------------------------
>
>                 Key: LUCENE-966
>                 URL: https://issues.apache.org/jira/browse/LUCENE-966
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Stanislaw Osinski
>             Fix For: 2.3
>
>         Attachments: AnalyzerBenchmark.java, jflex-analyzer-patch.txt, jflex-analyzer-r560135-patch.txt, jflex-analyzer-r561292-patch.txt
>
>
> JFlex (http://www.jflex.de/) can be used to generate a faster (up to several times) replacement for StandardAnalyzer. Will add a patch and a simple benchmark code in a while.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.