Message-ID: <2146895314.210051268335467221.JavaMail.jira@brutus.apache.org>
Date: Thu, 11 Mar 2010 19:24:27 +0000 (UTC)
From: "Robert Muir (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-2309) Fully decouple IndexWriter from analyzers
In-Reply-To: <1906601074.186321268253387130.JavaMail.jira@brutus.apache.org>

    [ https://issues.apache.org/jira/browse/LUCENE-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844180#action_12844180 ]

Robert Muir commented on LUCENE-2309:
-------------------------------------

{quote}
Or remove them entirely (but, then, core tests will need to use contrib analyzers for their testing)...
{quote}

I agree, let's not get caught up on how our tests run from build.xml! We should decouple analysis from IW as much as possible, at least to support more flexible analysis: e.g. someone who doesn't want to use the TokenStream concept at all.

I don't really have a practical opinion on where all the analyzers go, but I do agree it would be nice if they were in one place. For example, contrib/analyzers now has analyzers organized by language, and in most cases users should really be looking at EnglishAnalyzer rather than StandardAnalyzer as their "default" for English, since it also does Porter stemming.

> Fully decouple IndexWriter from analyzers
> -----------------------------------------
>
>                 Key: LUCENE-2309
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2309
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>
> IndexWriter only needs an AttributeSource to do indexing.
> Yet, today, it interacts with Field instances, holds a private
> analyzer, invokes analyzer.reusableTokenStream, and has to deal with a
> wide variety of cases (it's not analyzed; it is analyzed but it's a Reader,
> String; it's pre-analyzed).
> I'd like to have IW only interact with attr sources that already
> arrived with the fields. This would be a powerful decoupling -- it
> means others are free to make their own attr sources.
> They need not even use any of Lucene's analysis impls; eg they can
> integrate with other things like [OpenPipeline|http://www.openpipeline.org].
> Or make something completely custom.
> LUCENE-2302 is already a big step towards this: it makes IW agnostic
> about which attr is "the term", and only requires that it provide a
> BytesRef (for flex).
> Then I think LUCENE-2308 would get us most of the remaining way -- ie, if the
> FieldType knows the analyzer to use, then we could simply create a
> getAttrSource() method (say) on it and move all the logic IW has today
> onto there. (We'd still need existing IW code for back-compat).
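
For readers skimming the thread, the decoupling proposed in the issue description can be sketched roughly as follows. This is only an illustration under stated assumptions, not Lucene code: `TermSource` is a hypothetical stand-in for an attribute source, `FieldType` and `MiniWriter` are invented for the example, and only the method name `getAttrSource()` comes from the issue text (where it is itself floated as a suggestion). The point is that the writer consumes term streams and never sees an analyzer, a Reader, or a String.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class DecoupledIndexingSketch {

    /** Hypothetical stand-in for an attribute source: a stream of terms. */
    interface TermSource {
        boolean incrementToken(); // advance; returns false when exhausted
        String term();            // the current term
    }

    /** Hypothetical field type that knows how to produce its own term source. */
    interface FieldType {
        TermSource getAttrSource(String value);
    }

    /** "Not analyzed" case: the whole value is a single term. */
    static final FieldType KEYWORD = value -> fromList(List.of(value));

    /** "Analyzed" case: lowercase, then split on whitespace. */
    static final FieldType WHITESPACE_LOWERCASE = value -> {
        List<String> terms = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return fromList(terms);
    };

    /** "Pre-analyzed" or fully custom case: terms produced elsewhere. */
    static TermSource fromList(List<String> terms) {
        Iterator<String> it = terms.iterator();
        return new TermSource() {
            private String current;

            @Override
            public boolean incrementToken() {
                if (!it.hasNext()) {
                    return false;
                }
                current = it.next();
                return true;
            }

            @Override
            public String term() {
                return current;
            }
        };
    }

    /** Toy writer: it only ever sees term sources, never an analyzer. */
    static final class MiniWriter {
        private final Map<String, List<Integer>> postings = new TreeMap<>();
        private int nextDocId = 0;

        int addDocument(List<TermSource> fields) {
            int docId = nextDocId++;
            for (TermSource source : fields) {
                while (source.incrementToken()) {
                    List<Integer> docs =
                        postings.computeIfAbsent(source.term(), t -> new ArrayList<>());
                    // Record each document at most once per term.
                    if (docs.isEmpty() || docs.get(docs.size() - 1) != docId) {
                        docs.add(docId);
                    }
                }
            }
            return docId;
        }

        List<Integer> docsFor(String term) {
            return postings.getOrDefault(term, List.of());
        }
    }

    public static void main(String[] args) {
        MiniWriter writer = new MiniWriter();
        // Document 0: one analyzed field and one keyword field.
        writer.addDocument(List.of(
            WHITESPACE_LOWERCASE.getAttrSource("Apache Lucene"),
            KEYWORD.getAttrSource("LUCENE-2309")));
        // Document 1: terms that never passed through any field type.
        writer.addDocument(List.of(fromList(List.of("lucene", "flex"))));

        System.out.println(writer.docsFor("lucene"));      // [0, 1]
        System.out.println(writer.docsFor("LUCENE-2309")); // [0]
    }
}
```

In this sketch the three cases the description lists (not analyzed, analyzed, pre-analyzed) all reduce to "hand the writer a term source", which is the simplification the issue is after. A real implementation would carry more than the term text (positions, offsets, payloads), which the toy interface leaves out.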