Message-ID: <362198665.184051268248887219.JavaMail.jira@brutus.apache.org>
Date: Wed, 10 Mar 2010 19:21:27 +0000 (UTC)
From: "Shai Erera (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-2294) Create IndexWriterConfiguration and store all of IW configuration there
In-Reply-To: <136471578.62221267695747130.JavaMail.jira@brutus.apache.org>

    [ https://issues.apache.org/jira/browse/LUCENE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843712#action_12843712 ]

Shai Erera
commented on LUCENE-2294:
------------------------------------

I disagree, but obviously I'm on the minority side. It is clearly documented what the default Analyzer is, and that you should change it if you want to get more meaningful analysis.

When I created IWC I really wanted to simplify how IW is created. If we force IWC to accept both a Version AND an Analyzer, instantiating IW will look like this:

    new IndexWriter(dir, new IndexWriterConfig(matchVersion, analyzer));

We don't accomplish anything with it: we took away the MFL argument and replaced it w/ IWC ... Remember - those who used to set all kinds of parameters, using the other ctors, care about how their IW is instantiated anyway. The others, who used IW(dir, analyzer, MFL), don't care about all the other attributes. MFL is just an annoyance, so we removed it.

I just don't feel that a default Analyzer, which is Whitespace, is bad. It's easy to understand what your analysis looks like, and since it's well documented, nobody can say "hey, why didn't you warn me". IW defaults a whole bunch of other settings, so why is Analyzer different? If we say Analyzer is mandatory, what will stop us tomorrow from saying IndexDeletionPolicy is mandatory? And then we'll get into a whole bunch of ctors, only now on IWC?

If we're documenting things clearly, and IWC clearly documents all its defaults, I see no reason to require an Analyzer to be specified up front. At least to me, that will make this entire change useless. When I create my IW for serious indexing, I take care of all its settings. Otherwise I just instantiate it to check something completely unrelated to its defaults. If I test those, I define them (otherwise I cannot test them).
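To make the trade-off in the comment above concrete, here is a minimal sketch of a config object whose every setting, Analyzer included, has a documented default. It is not Lucene code: `SketchConfig` and `SketchWriter` are hypothetical stand-ins for IWC and IW, the analyzer is reduced to a plain string, and the default values are assumptions taken from this thread (whitespace analysis, unlimited field length).

```java
// Hypothetical stand-ins for IWC and IW; none of these names are Lucene's API.
class SketchConfig {
    // Documented defaults, assumed here for illustration only.
    static final String DEFAULT_ANALYZER = "whitespace";
    static final int DEFAULT_MAX_FIELD_LENGTH = Integer.MAX_VALUE; // stands for "unlimited"

    private String analyzer = DEFAULT_ANALYZER;
    private int maxFieldLength = DEFAULT_MAX_FIELD_LENGTH;

    String getAnalyzer() { return analyzer; }
    int getMaxFieldLength() { return maxFieldLength; }

    // Optional override: callers who care about analysis set it explicitly.
    SketchConfig setAnalyzer(String analyzer) {
        if (analyzer == null) {
            throw new IllegalArgumentException("analyzer must not be null");
        }
        this.analyzer = analyzer;
        return this;
    }

    SketchConfig setMaxFieldLength(int maxFieldLength) {
        if (maxFieldLength <= 0) {
            throw new IllegalArgumentException("maxFieldLength must be positive");
        }
        this.maxFieldLength = maxFieldLength;
        return this;
    }
}

class SketchWriter {
    private final String directory;
    private final SketchConfig config;

    // The single constructor: a directory plus a config that carries everything else.
    SketchWriter(String directory, SketchConfig config) {
        if (directory == null || config == null) {
            throw new IllegalArgumentException("directory and config are required");
        }
        this.directory = directory;
        this.config = config;
    }

    String getDirectory() { return directory; }
    SketchConfig getConfig() { return config; }
}

class DefaultsDemo {
    public static void main(String[] args) {
        // Quick experiment: rely on the documented defaults.
        SketchWriter quick = new SketchWriter("dir", new SketchConfig());
        // Serious indexing: override the settings that matter.
        SketchWriter tuned = new SketchWriter("dir",
                new SketchConfig().setAnalyzer("standard").setMaxFieldLength(10000));
        System.out.println(quick.getConfig().getAnalyzer());
        System.out.println(tuned.getConfig().getAnalyzer());
    }
}
```

Making Analyzer mandatory would move it from a setter into the `SketchConfig` constructor, which is exactly the extra argument the comment argues against.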
> Create IndexWriterConfiguration and store all of IW configuration there
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-2294
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2294
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>             Fix For: 3.1
>
>         Attachments: LUCENE-2294.patch, LUCENE-2294.patch, LUCENE-2294.patch
>
>
> I would like to factor out of all IW configuration parameters into a single configuration class, which I propose to name IndexWriterConfiguration (or IndexWriterConfig). I want to store there almost everything besides the Directory, and to reduce all the ctors down to one: IndexWriter(Directory, IndexWriterConfiguration). What I was thinking of storing there are the following parameters:
> * All of ctors parameters, except for Directory.
> * The different setters where it makes sense. For example I still think infoStream should be set on IW directly.
> I'm thinking that IWC should expose everything in a setter/getter methods, and defaults to whatever IW defaults today. Except for Analyzer which will need to be defined in the ctor of IWC and won't have a setter.
> I am not sure why MaxFieldLength is required in all IW ctors, yet IW declares a DEFAULT (which is an int and not MaxFieldLength). Do we still think that 10000 should be the default? Why not default to UNLIMITED and otherwise let the application decide what LIMITED means for it? I would like to make MFL optional on IWC and default to something, and I hope that default will be UNLIMITED. We can document that on IWC, so that if anyone chooses to move to the new API, he should be aware of that ...
> I plan to deprecate all the ctors and getters/setters and replace them by:
> * One ctor as described above
> * getIndexWriterConfiguration, or simply getConfig, which can then be queried for the setting of interest.
> * About the setters, I think maybe we can just introduce a setConfig method which will override everything that is overridable today, except for Analyzer. So someone could do iw.getConfig().setSomething(); iw.setConfig(newConfig);
> ** The setters on IWC can return an IWC to allow chaining set calls ... so the above will turn into iw.setConfig(iw.getConfig().setSomething1().setSomething2());
> BTW, this is needed for Parallel Indexing (see LUCENE-1879), but I think it will greatly simplify IW's API.
> I'll start to work on a patch.
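The getConfig/setConfig idea with chained setters from the quoted description can be sketched as follows. This is an illustration under stated assumptions, not the patch attached to the issue: `ChainedConfig`, `ChainedWriter`, the two settings, and their default values are all hypothetical.

```java
// Hypothetical sketch; ChainedConfig and ChainedWriter are not Lucene classes.
class ChainedConfig {
    // Illustrative settings and default values, assumed for this sketch.
    private int mergeFactor = 10;
    private int ramBufferMb = 16;

    int getMergeFactor() { return mergeFactor; }
    int getRamBufferMb() { return ramBufferMb; }

    // Each setter returns this config, so set calls can be chained.
    ChainedConfig setMergeFactor(int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        this.mergeFactor = mergeFactor;
        return this;
    }

    ChainedConfig setRamBufferMb(int ramBufferMb) {
        if (ramBufferMb <= 0) {
            throw new IllegalArgumentException("ramBufferMb must be positive");
        }
        this.ramBufferMb = ramBufferMb;
        return this;
    }
}

class ChainedWriter {
    private ChainedConfig config;

    ChainedWriter(ChainedConfig config) {
        this.config = requireConfig(config);
    }

    ChainedConfig getConfig() { return config; }

    // Replaces the writer's whole configuration in one call.
    void setConfig(ChainedConfig config) {
        this.config = requireConfig(config);
    }

    private static ChainedConfig requireConfig(ChainedConfig config) {
        if (config == null) {
            throw new IllegalArgumentException("config must not be null");
        }
        return config;
    }
}

class ChainingDemo {
    public static void main(String[] args) {
        ChainedWriter iw = new ChainedWriter(new ChainedConfig());
        // The chained form from the issue description.
        iw.setConfig(iw.getConfig().setMergeFactor(20).setRamBufferMb(64));
        System.out.println(iw.getConfig().getMergeFactor());
        System.out.println(iw.getConfig().getRamBufferMb());
    }
}
```

In this sketch getConfig returns the live object, so the chained setters already take effect before setConfig runs. A design that hands out a copy instead would make the explicit setConfig call the only point at which changes are applied.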