From: "Simon Willnauer (JIRA)"
To: java-dev@lucene.apache.org
Date: Mon, 15 Jun 2009 06:38:07 -0700 (PDT)
Subject: [jira] Updated: (LUCENE-1688) Deprecating StopAnalyzer ENGLISH_STOP_WORDS - General replacement with an immutable Set
Message-ID: <1147142780.1245073087597.JavaMail.jira@brutus>
In-Reply-To: <667949797.1244811609453.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/LUCENE-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer
updated LUCENE-1688:
------------------------------------

    Attachment: StopWords.patch

Attached a patch that marks ENGLISH_STOP_WORDS as deprecated. I cleaned up StopAnalyzer a little bit (it is final anyway).
Added an UnmodifiableCharArraySet impl as a private inner class + testcase

> Deprecating StopAnalyzer ENGLISH_STOP_WORDS - General replacement with an immutable Set
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1688
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1688
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Simon Willnauer
>            Priority: Minor
>             Fix For: 2.9, 3.0
>
>         Attachments: StopWords.patch
>
>
> StopAnalyzer and StandardAnalyzer use the static final array ENGLISH_STOP_WORDS by default in various places. Internally this array is converted into a mutable set, which looks kind of weird to me.
> I think the way to go is to deprecate all use of the static final array and replace it with an immutable implementation of CharArraySet. Inside an analyzer it does not make sense to have a mutable set anyway, and we could avoid creating a new set each time an analyzer is created. With an immutable set we won't have multithreading issues either.
> In essence we get rid of a fair bit of "converting string array to set" code, no longer have a PUBLIC static reference to an array (which is mutable), and reduce the overhead of analyzer creation.
> Let me know what you think and I will create a patch for it.
> simon

-- 
This message is automatically generated by JIRA.
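
To illustrate the trade-off described in the issue, here is a minimal sketch using only java.util. It is not the attached patch and does not use Lucene's CharArraySet; the class StopWordsSketch, its members, and the four-word list are hypothetical and chosen only for illustration. It contrasts a public array that is copied into a fresh mutable set on every use with a single shared read-only set.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; not Lucene's StopAnalyzer or CharArraySet. */
final class StopWordsSketch {

    /** Legacy style: a public array whose elements any caller could overwrite. */
    public static final String[] LEGACY_STOP_WORDS = {"a", "an", "and", "the"};

    /** Proposed style: built once, shared by all users, and read-only. */
    public static final Set<String> STOP_WORDS_SET =
        Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("a", "an", "and", "the")));

    private StopWordsSketch() {}

    /** Legacy path: every call converts the array into a new mutable set. */
    public static Set<String> legacySet() {
        return new HashSet<>(Arrays.asList(LEGACY_STOP_WORDS));
    }

    /** Returns true if the shared set refuses an attempt to add a word. */
    public static boolean rejectsModification() {
        try {
            STOP_WORDS_SET.add("of");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Two legacy calls allocate two separate sets.
        System.out.println(legacySet() == legacySet());
        // The shared set cannot be modified after construction.
        System.out.println(rejectsModification());
    }
}
```

Because the shared set is never modified after construction, it can be read from several threads without extra synchronization, which is the multithreading point made in the description.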