lucene-solr-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject RE: SOLR-1131 - Multiple Fields per Field Type
Date Tue, 01 Dec 2009 06:58:00 GMT
Hey Hoss,

> So rather than try to make it entirely magical and behind the scenes, and
> still require them to know about it if a collision happens and they get an
> error, let's put it right out in front of them so they know about it and
> think it through.

+1 to that -- was never trying to make anything magical, just to point out that there were
a number of different solutions here, not all of which are orthogonal (as you pointed out
above, SOLR may use a combination of intuitive log messages + explicit collision handling
in code, not just one or the other).

> if people feel that something like this...

>   <fieldType name="latlon" type="LatLonFieldType" />
>   <dynamicField name="location*" type="latlon" />
>
> ...where an end user can deal with these fields...
> 
>    location
>    location_home
>    location_work
>
> ...and under the covers the field type uses...
>
>    location__LAT + location__LON
>    location_home__LAT + location_home__LON
>    location_work__LAT + location_work__LON
>
> ...is an abuse of the <dynamicField/> syntax, then we could accomplish the
> same thing with something like...
> 
>  <fieldType name="latlon" type="LatLonFieldType" />
>  <field name="location" type="latlon" pattern="location__*" />
>  <field name="location_home" type="latlon" pattern="location_home__*" />
>  <field name="location_work" type="latlon" pattern="location_work__*" />

Now you're talking. I like this option, with the following updates:

<fieldType name="latlon" type="LatLonFieldType" pattern="location__*" />
<fieldType name="latlon_home" type="LatLonFieldType" pattern="location_home_*"/>
<fieldType name="latlon_work" type="LatLonFieldType" pattern="location_work_*"/>

<field name="location" type="latlon"/>
<field name="location_home" type="latlon_home"/>
<field name="location_work" type="latlon_work"/>

I think it makes more sense to define the heterogeneity at the fieldType level because:

(a) it's a bit more consistent with the existing solr schema examples, where many of the
field types differ only in their configuration (e.g., int and tint, which are both
solr.TrieIntFields; date and tdate, both instances of solr.TrieDateField; etc.)

(b) isolation of change: <fieldType> defs will change less often than <field>
defs, where names and indexed/stored/etc. debugging are likely to occur more frequently
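
To illustrate point (a): the example schema that ships with Solr already defines several
configured instances of one class under different names. The attribute values below are
recalled from the 1.4-era example schema.xml, so treat them as approximate:

```xml
<!-- two names, one class, differing only in precisionStep -->
<fieldType name="int"  class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
```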

>...but that would be more verbose, and would be somewhat confusing to try
>and use as a true dynamicField (ie: we want to support "home", "work" and
>anything else picked at run time)...
>
>  <fieldType name="latlon" type="LatLonFieldType" />
>  <field name="location" type="latlon" pattern="location*" />
>  <dynamicField name="location_*" type="latlon" pattern="??whatgoeshere??" />

I don't think the above hybrid approach will lead to anything other than confusion, as you
indicated above. Let's stick to the pattern defs at the <fieldType> level, and then
let the fieldType handle the internal "dynamicity" with e.g., a dynamicField, and then notify
the schema user by providing: (1) a nice intuitive set of documentation with the poly field
types that says: don't use these reserved field names in your schema if you are using this
field type in any of your field instances (the concept is the same as in programming languages:
you can't declare variables named "for" or "int", etc.); and (2) intuitive error msgs and exceptions
if the schema user insists on ignoring the poly field documentation.
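
As a rough sketch of the collision case that the docs and error msgs would need to cover
(the pattern attribute is only the proposal in this thread, not an existing Solr feature, and
the "double" type and the location__LAT declaration are hypothetical):

```xml
<fieldType name="latlon" type="LatLonFieldType" pattern="location__*" />
<field name="location" type="latlon" />

<!-- hypothetical user-declared field: its name matches the reserved
     pattern location__*, so schema loading should fail with a clear error -->
<field name="location__LAT" type="double" />
```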

>so why not just leverage the existing dynamicField syntax/mechanism where
>schema creators already expect fields to be created at runtime, and
>already have to think about possible name collisions?

I think we should leverage dynamicFields, but maybe not explicitly. If it's explicit, you have to
maintain the poly field def as both a dynamicField and a fieldType, which IMHO is not as elegant as
multiple fieldType defs (configured instances of the same field type) with the pattern param you
suggested, coupled with field declarations that use those configured fieldType instances.

Cheers,
Chris

