lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <>
Subject [jira] [Commented] (LUCENE-2308) Separately specify a field's type
Date Wed, 31 Aug 2011 07:45:10 GMT


Simon Willnauer commented on LUCENE-2308:

Hey guys, why don't we put plain old immutable Java objects with a single ctor into core and
add a builder API into modules / sandbox? This keeps things simple in core, and users who want
the builder can grab it from a module.

bq. Can we avoid the builder API? I think we shouldnt invite accidental creation of lots of
FieldType instances during indexing... why not just a single ctor in fieldtype that takes
all the parameters the base class cares about? then it serves double-duty as the 'expert'
fieldtype anyway, subclasses like TextField are just the sugar.

So I haven't seen a single technical argument against a builder here. I personally think that
a builder has many advantages:

* simple to add new fields, doesn't need deprecation if you add another field to a type
* simple to use since lots of people are used to chaining
* provides immutability by design
* represents a small but clear DSL to build a field type. You could do things like providing
setters for TV only if you chain it with a call to indexed(), like:
{code}
builder.indexed().storeTV();
{code}
which would not be visible otherwise.
* a ctor call will require many parameters that you don't want to set, but you're forced to
pass a value for them anyway
* since most of the parameters are booleans, long sequences of identically typed parameters
can cause subtle bugs. If the user accidentally reverses two such parameters, the compiler
won't complain, but the program will misbehave at runtime. That sucks! Especially if you spend
hours indexing and then realize that your TV has not been stored because you forgot to set
indexed = true
* builder code is easy to write and, more importantly, to read.
* a builder simulates named optional parameters, as found in Python and other languages, which
Java is lacking.
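
To make the DSL point above concrete, here is a minimal sketch of a staged builder. Everything in it is hypothetical (the FieldType shape, builder(), IndexedBuilder, storeTV()); it is not the API from any of the attached patches, just one way the "storeTV() only after indexed()" idea could look:

```java
// Hypothetical sketch of a staged builder; none of these names are taken
// from the LUCENE-2308 patches.
final class FieldType {
    private final boolean indexed;
    private final boolean stored;
    private final boolean storeTermVectors;

    private FieldType(boolean indexed, boolean stored, boolean storeTermVectors) {
        this.indexed = indexed;
        this.stored = stored;
        this.storeTermVectors = storeTermVectors;
    }

    boolean isIndexed() { return indexed; }
    boolean isStored() { return stored; }
    boolean storesTermVectors() { return storeTermVectors; }

    static Builder builder() { return new Builder(); }

    // First stage: options that are valid for any field.
    static final class Builder {
        private boolean stored;

        Builder stored() { stored = true; return this; }

        // Moving to the indexed stage unlocks the term vector option.
        IndexedBuilder indexed() { return new IndexedBuilder(stored); }

        FieldType build() { return new FieldType(false, stored, false); }
    }

    // Second stage: only reachable through indexed().
    static final class IndexedBuilder {
        private boolean stored;
        private boolean storeTV;

        private IndexedBuilder(boolean stored) { this.stored = stored; }

        IndexedBuilder stored() { stored = true; return this; }
        IndexedBuilder storeTV() { storeTV = true; return this; }

        FieldType build() { return new FieldType(true, stored, storeTV); }
    }
}

class StagedBuilderDemo {
    public static void main(String[] args) {
        FieldType type = FieldType.builder().stored().indexed().storeTV().build();
        System.out.println(type.isIndexed() + " " + type.storesTermVectors());

        // FieldType.builder().storeTV();  // does not compile: no storeTV() before indexed()
    }
}
```

With this shape the "TV without indexed" mistake is rejected by the compiler instead of showing up after hours of indexing, and the resulting FieldType has only final fields.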

I think the Builder pattern is a good choice when designing classes whose constructors would
have more than a handful of parameters, especially if most of those parameters are optional.
Client code is much easier to read and write with builders than with the traditional telescoping
constructor pattern.
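
As a rough illustration of the readability argument, the sketch below puts a positional boolean ctor next to a builder with named setters. Both classes (CtorFieldType, BuiltFieldType) are made up for this comparison and do not reflect the actual patch:

```java
// Hypothetical classes for comparison only; not Lucene's FieldType.
final class CtorFieldType {
    final boolean indexed;
    final boolean stored;
    final boolean storeTermVectors;

    // Every caller must pass every flag, in exactly this order.
    CtorFieldType(boolean indexed, boolean stored, boolean storeTermVectors) {
        this.indexed = indexed;
        this.stored = stored;
        this.storeTermVectors = storeTermVectors;
    }
}

final class BuiltFieldType {
    final boolean indexed;
    final boolean stored;
    final boolean storeTermVectors;

    private BuiltFieldType(Builder b) {
        this.indexed = b.indexed;
        this.stored = b.stored;
        this.storeTermVectors = b.storeTermVectors;
    }

    static final class Builder {
        // Unset options keep their default of false.
        private boolean indexed;
        private boolean stored;
        private boolean storeTermVectors;

        Builder indexed(boolean v) { indexed = v; return this; }
        Builder stored(boolean v) { stored = v; return this; }
        Builder storeTermVectors(boolean v) { storeTermVectors = v; return this; }

        BuiltFieldType build() { return new BuiltFieldType(this); }
    }
}

class CtorVsBuilderDemo {
    public static void main(String[] args) {
        // Intended: indexed = true, stored = false, term vectors = true.
        // The first two arguments are swapped by mistake and it still compiles.
        CtorFieldType swapped = new CtorFieldType(false, true, true);

        // The builder names each option, so the call site documents itself
        // and options that are not needed are simply left out.
        BuiltFieldType named = new BuiltFieldType.Builder()
            .indexed(true)
            .storeTermVectors(true)
            .build();

        System.out.println("ctor indexed=" + swapped.indexed
            + ", builder indexed=" + named.indexed);
    }
}
```

Adding a fourth option later means one more setter on the builder, while the ctor variant would need a new overload or a deprecation.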

> Separately specify a field's type
> ---------------------------------
>                 Key: LUCENE-2308
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>             Fix For: 4.0
>         Attachments: LUCENE-2308-10.patch, LUCENE-2308-11.patch, LUCENE-2308-12.patch,
LUCENE-2308-13.patch, LUCENE-2308-14.patch, LUCENE-2308-15.patch, LUCENE-2308-16.patch, LUCENE-2308-17.patch,
LUCENE-2308-18.patch, LUCENE-2308-19.patch, LUCENE-2308-2.patch, LUCENE-2308-20.patch, LUCENE-2308-21.patch,
LUCENE-2308-3.patch, LUCENE-2308-4.patch, LUCENE-2308-5.patch, LUCENE-2308-6.patch, LUCENE-2308-7.patch,
LUCENE-2308-8.patch, LUCENE-2308-9.patch, LUCENE-2308-branch.patch, LUCENE-2308-final.patch,
LUCENE-2308-ltc.patch, LUCENE-2308-merge-1.patch, LUCENE-2308-merge-2.patch, LUCENE-2308-merge-3.patch,
LUCENE-2308.branchdiffs, LUCENE-2308.branchdiffs.moved, LUCENE-2308.patch, LUCENE-2308.patch,
LUCENE-2308.patch, LUCENE-2308.patch, LUCENE-2308.patch
> This came up from discussions on IRC.  I'm summarizing here...
> Today when you make a Field to add to a document you can set things like
> indexed or not, stored or not, analyzed or not, details like omitTfAP,
> omitNorms, index term vectors (separately controlling
> offsets/positions), etc.
> I think we should factor these out into a new class (FieldType?).
> Then you could re-use this FieldType instance across multiple fields.
> The Field instance would still hold the actual value.
> We could then do per-field analyzers by adding a setAnalyzer on the
> FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise
> for per-field codecs (with flex), where we now have
> PerFieldCodecWrapper).
> This would NOT be a schema!  It's just refactoring what we already
> specify today.  EG it's not serialized into the index.
> This has been discussed before, and I know Michael Busch opened a more
> ambitious (I think?) issue.  I think this is a good first baby step.  We could
> consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold
> off on that for starters...

This message is automatically generated by JIRA.