lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Created] (LUCENE-4089) fix or document termsIndexInterval/Divisor for 4.0
Date Wed, 30 May 2012 03:27:23 GMT
Robert Muir created LUCENE-4089:
-----------------------------------

             Summary: fix or document termsIndexInterval/Divisor for 4.0
                 Key: LUCENE-4089
                 URL: https://issues.apache.org/jira/browse/LUCENE-4089
             Project: Lucene - Java
          Issue Type: Bug
          Components: core/index
            Reporter: Robert Muir
             Fix For: 4.0


There are a few parameters on IndexWriterConfig/DirectoryReader that are going to be confusing
unless we do something about them, documentation at the very least:

* IWC.termsIndexInterval: really a codec parameter, actually ignored by 4.0's default impl
(BlockTree)
* IWC.readerDivisor/DirectoryReader.divisor: really two things. If it's -1 it means "don't
load terms index", and this is respected by the current impls. Otherwise, it means "sample
the terms index", and this is also actually ignored by 4.0's default impl (BlockTree).
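
To make the two meanings of the divisor concrete, here is a minimal, self-contained Java sketch.
It is not Lucene code: DivisorSketch, loadTermsIndex, and the sample terms are hypothetical names
for illustration, and it assumes a fixed-gap style terms index where a subset of terms is held
in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DivisorSketch {
    /**
     * Hypothetical model, not Lucene code. Returns the in-memory sample of
     * the on-disk terms index, or null when divisor == -1 ("don't load").
     */
    public static List<String> loadTermsIndex(List<String> indexedTerms, int divisor) {
        if (divisor == -1) {
            // Meaning 1: skip loading the terms index entirely.
            return null;
        }
        if (divisor < 1) {
            throw new IllegalArgumentException("divisor must be -1 or >= 1, got " + divisor);
        }
        // Meaning 2: sample the terms index, keeping every divisor-th entry.
        List<String> sampled = new ArrayList<>();
        for (int i = 0; i < indexedTerms.size(); i += divisor) {
            sampled.add(indexedTerms.get(i));
        }
        return sampled;
    }

    public static void main(String[] args) {
        List<String> onDisk = Arrays.asList("a", "c", "e", "g", "i");
        System.out.println(loadTermsIndex(onDisk, 1));  // every indexed term
        System.out.println(loadTermsIndex(onDisk, 2));  // every second indexed term
        System.out.println(loadTermsIndex(onDisk, -1)); // nothing loaded
    }
}
```

A terms index impl that ignores the sampling value, as BlockTree is described as doing above,
would behave like the divisor == 1 case no matter what value is passed.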

I think people will be confused if they set these things and they do nothing. As far as fixing,
I took a stab at this and it's an annoyingly big change. But this is the rough sketch of one
idea I had so far:
* remove interval: it's only applicable if you customize the codec and select a different terms
index/dict impl anyway, so you can just pass this to FixedGap or whatever yourself.
* divisor: generalize this into something simple like a Map<String,String> of codec
"parameters" that you set on IWC/IR. Split divisor from "don't load terms index". Define these
as constants where they belong. I got unhappy here in the "splitting" part because I wanted
the divisor part in TermsIndexReaderBase, but that doesn't extend FieldsProducer (where I wanted
the "don't load" part) and wrap the terms dict; instead it's backwards and the terms dict wraps
the TermsIndexReaderBase... maybe we should fix that too? I think this is confusing the way it
is, but I didn't look at how difficult this would be.
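
One way the Map<String,String> idea could look, as a minimal sketch only. Everything here is an
assumption for illustration: the class name CodecParamsSketch, the key strings, and the defaults
do not exist in Lucene, and in the proposal each constant would live on whichever class actually
interprets it rather than in one place.

```java
import java.util.HashMap;
import java.util.Map;

public class CodecParamsSketch {
    // Hypothetical keys: "don't load terms index" and "divisor" are split
    // into two independent parameters instead of overloading -1.
    public static final String LOAD_TERMS_INDEX = "termsIndex.load";
    public static final String TERMS_INDEX_DIVISOR = "termsIndex.divisor";

    /** Whether the terms index should be loaded at all; defaults to true. */
    public static boolean loadTermsIndex(Map<String, String> params) {
        return Boolean.parseBoolean(params.getOrDefault(LOAD_TERMS_INDEX, "true"));
    }

    /** Sampling divisor for impls that support it; defaults to 1. */
    public static int termsIndexDivisor(Map<String, String> params) {
        int divisor = Integer.parseInt(params.getOrDefault(TERMS_INDEX_DIVISOR, "1"));
        if (divisor < 1) {
            throw new IllegalArgumentException("divisor must be >= 1, got " + divisor);
        }
        return divisor;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put(TERMS_INDEX_DIVISOR, "4");
        System.out.println(loadTermsIndex(params));    // default applies
        System.out.println(termsIndexDivisor(params)); // explicit value

        params.put(LOAD_TERMS_INDEX, "false");
        System.out.println(loadTermsIndex(params));    // now disabled
    }
}
```

With string keys, a codec that does not understand a parameter can simply leave it unread, which
is the same "set it and nothing happens" risk described above unless unknown keys are reported.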

        
