Return-Path: X-Original-To: apmail-lucene-dev-archive@www.apache.org Delivered-To: apmail-lucene-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E5A0B7451 for ; Tue, 18 Oct 2011 22:03:12 +0000 (UTC) Received: (qmail 63324 invoked by uid 500); 18 Oct 2011 22:03:11 -0000 Delivered-To: apmail-lucene-dev-archive@lucene.apache.org Received: (qmail 63270 invoked by uid 500); 18 Oct 2011 22:03:11 -0000 Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@lucene.apache.org Delivered-To: mailing list dev@lucene.apache.org Received: (qmail 63263 invoked by uid 99); 18 Oct 2011 22:03:11 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 18 Oct 2011 22:03:11 +0000 X-ASF-Spam-Status: No, hits=1.8 required=5.0 tests=FREEMAIL_FROM,HTML_FONT_FACE_BAD,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of aduric@gmail.com designates 209.85.216.48 as permitted sender) Received: from [209.85.216.48] (HELO mail-qw0-f48.google.com) (209.85.216.48) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 18 Oct 2011 22:03:06 +0000 Received: by qadb14 with SMTP id b14so1247721qad.35 for ; Tue, 18 Oct 2011 15:02:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=iiDC7xDkPAY0cTemxiCWQF4j4OTU3Oh5P3M5u1OM1is=; b=cAVukQ8KktolQj1To7SKZR10aZ+9ZWo6iDXErMGV+LVvUk0ysDCZ8ZX5egUGu7npBL wr6DQIE/CIU3cQatB6NDaD3WFEV+0My93k4CJwDaI8yTZNA5MleKrOGuiYHKTY0JZmNU P5drWRsKuGc5386A0hVPwn1gkSA5sP5CdSwnE= MIME-Version: 1.0 Received: by 10.224.181.76 with SMTP id bx12mr3618530qab.30.1318975365450; Tue, 18 Oct 2011 15:02:45 -0700 (PDT) Received: by 10.224.80.195 with HTTP; Tue, 18 Oct 2011 15:02:45 -0700 (PDT) In-Reply-To: References: Date: Tue, 18 Oct 2011 18:02:45 -0400 Message-ID: Subject: Re: FieldType refactoring? From: Adnan Duric To: dev@lucene.apache.org Content-Type: multipart/alternative; boundary=20cf303b40ed200c7e04af99e27e --20cf303b40ed200c7e04af99e27e Content-Type: text/plain; charset=ISO-8859-1 Hi Mike, Thanks for the response. I was actually thinking about that as well. It seems to me that the IndexableFieldType types only exist within the IndexableField types anyway. We would just like the user to be able to specify a custom IFT but also have a few of them available to the user and wrap them in friendly sounding classes like TextField and StringField, etc... If we have a FieldType behave as you suggested, namely something like: IndexableFieldType ift = new FieldType(FieldType.INDEXED | FieldType.STORED | ...); then the user would be able to create a custom field type for their use. IndexableField ifield = new Field("fieldname", text, ift); I don't really think there are too many problems with this as it is, but since (and correct me if I'm wrong) an IndexableFieldType only exists within the context of an IndexableField, we can force the user to only be able to instantiate an IndexableField through the Field concrete class: IndexableField ifield = new Field("fieldname", text, new StoredFieldType()); where now StoredFieldType only creates FieldTypes: final class StoredFieldType extends FieldType { public StoredFieldType() { super(true, true, true...); } } final class UnstoredFieldType extends FieldType { public UnstoredFieldType() { super(true, false, true...); } } This means that the IndexableField classes such as TextField, NumericField and the like would not be needed at all, and would be replaced by FieldType classes like TextType and NumericType for example; with only one constructor instead of several. Also, at that point, Field must be declared final. Best regards, Adnan On Tue, Oct 18, 2011 at 2:47 PM, Michael McCandless < lucene@mikemccandless.com> wrote: > Hi Adnan, > > I'm glad to hear you are using trunk; this is a good feedback! > > I agree about creating both StoredTextField and TextField (at the edge > of the API): that's much easier to consume than what you now must do > for the stored case (new Field("name", "value", > TextField.STORED_TYPE). I also really don't like how you don't know > if StringField / TextField is stored by default. Having XField and > StoredXField would make that clear. > > So, regardless of how we create FTs/IFTs behind the scenes, I agree we > should have separate sugar classes for StoredXField and XField. > > The expectation is the vast majority of users ("normal" users) will > use the sugar field classes (Stored/StringField, Stored/TextField, > NumericField, BinaryField, etc.) and never have to work with FieldType > classes. > > Then, expert users are able to create arbitrary FieldType instances or > way-expert can just make their own impls of IndexableFieldType (and > IndexableField), ie bypassing Document/Field entirely. > > I agree it'd be nice to have immutable IFT instances, as you suggest, > and then have no FieldType class at all. The challenge is the API for > creating such IFT instances (see the [long] discussion towards the end > of LUCENE-2308)... basically the committers have strong disagreements > on how to instantiate them (builder APIs vs > ctor-taking-tons-of-boolean args APIs vs mutable + freeze). > > That said, I think the int bit flags approach (suggested on > LUCENE-2308 but nobody has followed up on / worked out patch yet) is a > good compromise, since nearly all of these settings are booleans. So > we could have something like: > > new FieldType(INDEXED | STORED) > > So then FieldType would no longer be mutable, and it'd be a concrete > IFT impl that uses bit flags under the hood to record the details for > the type. > > Mike McCandless > > http://blog.mikemccandless.com > > On Tue, Oct 18, 2011 at 11:43 AM, Adnan Duric wrote: > > Hi guys, > > First of all, this is a fantastic project. I'm having a great time > playing > > around with it. > > I downloaded and compiled the trunk and I initially started trying to get > > the examples working but quickly realized that the trunk version was > > different than the previous versions. Anyway, long story short, I managed > to > > get it working. > > However, I'd just like to comment on the functionality where the user > adds > > Field instances to a Document. Each Field has a name, value and a > FieldType. > > I got a little confused when I began to create a FieldType. > > The FieldType class implements an IndexableFieldType, so it's effectively > an > > IndexableFieldType type but it also uses the same type to construct the > > actual FieldType. This confused me because when I created my own > FieldType, > > I had to create a concrete implementation of IndexableFieldType and then > > pass it to FieldType. This also means that there are two ways to create > > IndexableFieldType types: > > IndexableFieldType ift = new FieldType(new MyFieldType()); > > IndexableFieldType ift2 = new MyFieldType(); > > Then I noticed that FieldType just contains mutators to set the various > > members of IndexableFieldType plus a toString() method. After seeing the > > freeze() method, I began to realize that when a concrete FieldType is > > created, some dependent objects have a chance to mutate the FieldType, > and > > then it should be immutable. These objects seem to be the various > instances > > of type IndexableField such as TextField, StringField, NumericField, > etc... > > Upon examination, I realized that when these classes are instantiated, > they > > create a FieldType and mutate some of its fields before freeze()'ing it. > I > > tried to see if I could somehow mutate after freezing but I couldn't find > a > > way to do it. > > I instantiated a TextField and added to a Document: > > IndexableField txtfield = new TextField(name, value); > > doc.add(txtfield); > > However, my test failed since the IndexableFieldType in txtfield was set > to > > TYPE_UNSTORED by default, and I guess I needed a TYPE_STORED. The only > way > > to have a "stored" version of TextField was to create a Field and set it > to > > a "stored" TextField like this: > > doc.add(new Field("fieldname", text, TextField.TYPE_STORED)); > > The STORED versions for each class have to be created explicitly by going > to > > the Field directly. This got me thinking about how to consolidate the two > > FieldTypes. Then I thought about getting rid of FieldType altogether > since > > it's just a mutator for IndexableFieldTypes, and having each concrete > > IndexableField type (TextField, NumericField) constructed by directly > > calling an IndexableFieldType. Since there seems to be "two" main > > FieldTypes, maybe we could have a StoredFieldType and UnstoredFieldType > that > > implement IndexableFieldType: > > final class StoredFieldType implements IndexableFieldType { > > public boolean indexed() { return true; } > > public boolean stored() { return true; } > > public boolean tokenized() { return true; } > > > > . > > . > > . > > > > } > > final class UnstoredFieldType implements IndexableFieldType { ... } > > And then have each IndexableField that is used, itself instantiate one of > > the above IndexableFieldType. For example the TextField(String, String) > > constructot: > > /** Creates a new un-stored TextField */ > > public TextField(String name, String value) { > > > > super(name, value, new UnstoredFieldType()); > > > > } > > This way, the FieldType class is no longer needed, the IndexableField > type > > classes don't need to mutate anything, and, in the end, are instantiated > the > > same way. Of course, for the "stored" TextField, there needs to be a > > StoredTextField class that instantiates a StoredFieldType, but I'm > willing > > to argue that it is better to have this duality appear closer to the > 'edge' > > of the API. Another advantage is that an entire hierarchy can be built > with > > IndexableFieldTypes by using a factory that creates them based on their > > parameters. > > Anyway, I'm probably missing some details and I'd appreciate some > feedback. > > > > Thanks for your time and keep up the great work! > > > > Adnan > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org > For additional commands, e-mail: dev-help@lucene.apache.org > > --20cf303b40ed200c7e04af99e27e Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi Mike,

Thanks for the response.

I was actually thinking about that as well. It seems to me that the Inde= xableFieldType types only exist within the IndexableField types anyway. We = would just like the user to be able to specify a custom IFT but also have a= few of them available to the user and wrap them in friendly sounding class= es like TextField and StringField, etc...

If we have a FieldType behave as you suggested, namely = something like:

IndexableFieldType ift =3D new Fie= ldType(FieldType.INDEXED | FieldType.STORED | ...);

then the user would be able to create a custom field type for their use= .=A0

IndexableField ifield =3D new Field("fie= ldname", text, ift);

I don't really think there are too many prob= lems with this as it is, but since (and correct me if I'm wrong) an Ind= exableFieldType only exists within the context of an IndexableField, we can= force the user to only be able to instantiate an IndexableField through th= e Field concrete class:

IndexableField ifield =3D new Field("fieldname&quo= t;, text, new StoredFieldType());

= where now StoredFieldType only creates FieldTypes:

final class StoredFieldType extends FieldType {
= =A0 =A0 public StoredFieldType() {
<= font class=3D"Apple-style-span" face=3D"'courier new', monospace"><= span class=3D"Apple-tab-span" style=3D"white-space:pre"> super(true,= true, true...);
=A0 =A0 }
}

= final class UnstoredFieldType extends FieldType {
=A0 =A0 public UnstoredFieldType() {
super(true, false, true...);
= =A0 =A0 }
}

This means = that the IndexableField classes such as TextField, NumericField and the lik= e would not be needed at all, and would be replaced by FieldType classes li= ke TextType and NumericType for example; with only one constructor instead = of several. Also, at that point, Field must be declared final.


<= div class=3D"gmail_quote">Best regards,

Adna= n


O= n Tue, Oct 18, 2011 at 2:47 PM, Michael McCandless <lucene@mikemccandless.com>= ; wrote:
Hi Adnan,

I'm glad to hear you are using trunk; this is a good feedback!

I agree about creating both StoredTextField and TextField (at the edge
of the API): that's much easier to consume than what you now must do for the stored case (new Field("name", "value",
TextField.STORED_TYPE). =A0I also really don't like how you don't k= now
if StringField / TextField is stored by default. =A0Having XField and
StoredXField would make that clear.

So, regardless of how we create FTs/IFTs behind the scenes, I agree we
should have separate sugar classes for StoredXField and XField.

The expectation is the vast majority of users ("normal" users) wi= ll
use the sugar field classes (Stored/StringField, Stored/TextField,
NumericField, BinaryField, etc.) and never have to work with FieldType
classes.

Then, expert users are able to create arbitrary FieldType instances or
way-expert can just make their own impls of IndexableFieldType (and
IndexableField), ie bypassing Document/Field entirely.

I agree it'd be nice to have immutable IFT instances, as you suggest, and then have no FieldType class at all. =A0The challenge is the API for creating such IFT instances (see the [long] discussion towards the end
of LUCENE-2308)... basically the committers have strong disagreements
on how to instantiate them (builder APIs vs
ctor-taking-tons-of-boolean args APIs vs mutable + freeze).

That said, I think the int bit flags approach (suggested on
LUCENE-2308 but nobody has followed up on / worked out patch yet) is a
good compromise, since nearly all of these settings are booleans. =A0So
we could have something like:

=A0new FieldType(INDEXED | STORED)

So then FieldType would no longer be mutable, and it'd be a concrete IFT impl that uses bit flags under the hood to record the details for
the type.

Mike McCandless

http://blog.mi= kemccandless.com

On Tue, Oct 18, 2011 at 11:43 AM, Adnan Duric <aduric@gmail.com> wrote:
> Hi guys,
> First of all, this is a fantastic project. I'm having a great time= playing
> around with it.
> I downloaded and compiled the trunk and I initially started trying to = get
> the examples working but quickly realized that the trunk version was > different than the previous versions. Anyway, long story short, I mana= ged to
> get it working.
> However, I'd just like to comment on the functionality where the u= ser adds
> Field instances to a Document. Each Field has a name, value and a Fiel= dType.
> I got a little confused when I began to create a FieldType.
> The FieldType class implements an IndexableFieldType, so it's effe= ctively an
> IndexableFieldType type but it also uses the same type to construct th= e
> actual FieldType. This confused me because when I created my own Field= Type,
> I had to create a concrete implementation of IndexableFieldType and th= en
> pass it to FieldType. This also means that there are two ways to creat= e
> IndexableFieldType types:
> IndexableFieldType ift =3D new FieldType(new MyFieldType());
> IndexableFieldType ift2 =3D new MyFieldType();
> Then I noticed that FieldType just contains mutators to set the variou= s
> members of IndexableFieldType plus a toString() method. After seeing t= he
> freeze() method, I began to realize that when a concrete FieldType is<= br> > created, some dependent objects have a chance to mutate the FieldType,= and
> then it should be immutable. These objects seem to be the various inst= ances
> of type IndexableField such as TextField, StringField, NumericField, e= tc...
> Upon examination, I realized that when these classes are instantiated,= they
> create a FieldType and mutate some of its fields before freeze()'i= ng it. I
> tried to see if I could somehow mutate after freezing but I couldn'= ;t find a
> way to do it.
> I instantiated a TextField and added to a Document:
> IndexableField txtfield =3D new TextField(name, value);
> doc.add(txtfield);
> However, my test failed since the IndexableFieldType in txtfield was s= et to
> TYPE_UNSTORED by default, and I guess I needed a TYPE_STORED. The only= way
> to have a "stored" version of TextField was to create a Fiel= d and set it to
> a "stored" TextField like this:
> doc.add(new Field("fieldname", text, TextField.TYPE_STORED))= ;
> The STORED versions for each class have to be created explicitly by go= ing to
> the Field directly. This got me thinking about how to consolidate the = two
> FieldTypes. Then I thought about getting rid of FieldType altogether s= ince
> it's just a mutator for IndexableFieldTypes, and having each concr= ete
> IndexableField type (TextField, NumericField) constructed by directly<= br> > calling an IndexableFieldType. Since there seems to be "two"= main
> FieldTypes, maybe we could have a StoredFieldType and UnstoredFieldTyp= e that
> implement IndexableFieldType:
> final class StoredFieldType implements IndexableFieldType {
> =A0 =A0 public boolean indexed() { return true; }
> =A0 =A0 public boolean stored() { return true; }
> =A0 =A0 public boolean tokenized() { return true; }
>
> .
> .
> .
>
> }
> final class UnstoredFieldType implements IndexableFieldType { ... } > And then have each IndexableField that is used, itself instantiate one= of
> the above IndexableFieldType. For example the TextField(String, String= )
> constructot:
> /** Creates a new un-stored TextField */
> public TextField(String name, String value) {
>
> super(name, value, new UnstoredFieldType());
>
> }
> This way, the FieldType class is no longer needed, the IndexableField = type
> classes don't need to mutate anything, and, in the end, are instan= tiated the
> same way. Of course, for the "stored" TextField, there needs= to be a
> StoredTextField class that instantiates a StoredFieldType, but I'm= willing
> to argue that it is better to have this duality appear closer to the &= #39;edge'
> of the API. Another advantage is that an entire hierarchy can be built= with
> IndexableFieldTypes by using a factory that creates them based on thei= r
> parameters.
> Anyway, I'm probably missing some details and I'd appreciate s= ome feedback.
>
> Thanks for your time and keep up the great work!
>
> Adnan

---------------------------------------------------------------= ------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


--20cf303b40ed200c7e04af99e27e--