lucene-solr-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: SOLR-1131 - Multiple Fields per Field Type
Date Tue, 01 Dec 2009 01:42:46 GMT

It feels like something we've overlooked in this discussion is whether we 
need to worry about any FieldType API changes needed to make these new 
"PolyField" classes aware of when they are multivalued.

The API suggestions Grant made give the FieldType the ability to return a 
Field[] from a single field value input -- but they don't provide any 
information about whether that field value is one of many values we're 
indexing for this field name.
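
To make the gap concrete, here is a minimal sketch of the "one value in, several fields out" shape. It is plain Java, not Solr code: `PolyFieldSketch`, `SimpleField` and `createFields` are hypothetical names invented for illustration, and the `__LAT` / `__LON` suffixes are borrowed from the example later in this message.

```java
import java.util.Arrays;

public class PolyFieldSketch {
    /** Hypothetical stand-in for an indexed field: just a name and a value. */
    static final class SimpleField {
        final String name;
        final String value;

        SimpleField(String name, String value) {
            this.name = name;
            this.value = value;
        }

        @Override
        public String toString() {
            return name + "=" + value;
        }
    }

    /**
     * One input value in, several fields out. Note what is missing from the
     * signature: nothing says whether this is the only value for fieldName
     * or one of many.
     */
    static SimpleField[] createFields(String fieldName, String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"lat,lon\" but got: " + value);
        }
        return new SimpleField[] {
            new SimpleField(fieldName + "__LAT", parts[0].trim()),
            new SimpleField(fieldName + "__LON", parts[1].trim())
        };
    }

    public static void main(String[] args) {
        // Two values of the same multivalued field produce the same field
        // names, and neither call can tell that the other one happened.
        System.out.println(Arrays.toString(createFields("locations", "10.0,20.0")));
        System.out.println(Arrays.toString(createFields("locations", "30.0,40.0")));
    }
}
```

Both calls emit the same two field names, so any per-value bookkeeping would have to come from somewhere outside this method.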

Imagine that i want to make an index of people i know.  Each person also 
has multiple locations where they can frequently be found (home, work, 
gym, girlfriend's house, favorite coffee shop, etc.).  My common case is 
to search for people, not locations, so it doesn't make sense to flatten 
out and have a doc for each person+location, i just want a single doc per 
person, but that means i need a "locations" field that's multivalued.

If i'm using a simple "LatLonFieldType" that splits my comma separated 
coordinate string into a "locations__LAT" and a "locations__LON" field 
then i assume it needs to do something special in the multiValued case to 
make sure later "near" searches don't get confused and think that the lat 
from my "work" and the lon from my "home" are actually a third location.
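
A small sketch of that failure mode, in plain Java rather than against a real index. `CrossMatchDemo` and its methods are hypothetical; the two lists stand in for the flattened "locations__LAT" and "locations__LON" fields of a single person document, and the coordinates are made up.

```java
import java.util.ArrayList;
import java.util.List;

public class CrossMatchDemo {
    // Stand-ins for the two flattened multivalued fields of one document.
    private final List<Double> lats = new ArrayList<>(); // "locations__LAT"
    private final List<Double> lons = new ArrayList<>(); // "locations__LON"

    /** Splits "lat,lon" and appends each half to its own list. */
    void addLocation(String latLon) {
        String[] parts = latLon.split(",");
        lats.add(Double.parseDouble(parts[0].trim()));
        lons.add(Double.parseDouble(parts[1].trim()));
    }

    private static boolean anyInRange(List<Double> values, double min, double max) {
        for (double v : values) {
            if (v >= min && v <= max) {
                return true;
            }
        }
        return false;
    }

    /** Checks each field independently, so the lat and lon may come from different values. */
    boolean naiveBoxMatch(double minLat, double maxLat, double minLon, double maxLon) {
        return anyInRange(lats, minLat, maxLat) && anyInRange(lons, minLon, maxLon);
    }

    /** Requires the lat and lon to come from the same original value. */
    boolean pairedBoxMatch(double minLat, double maxLat, double minLon, double maxLon) {
        for (int i = 0; i < lats.size(); i++) {
            double lat = lats.get(i);
            double lon = lons.get(i);
            if (lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        CrossMatchDemo person = new CrossMatchDemo();
        person.addLocation("10.0,20.0"); // home
        person.addLocation("30.0,40.0"); // work

        // Box around (30, 20): the lat of "work" combined with the lon of "home".
        System.out.println(person.naiveBoxMatch(29, 31, 19, 21));  // prints "true"
        System.out.println(person.pairedBoxMatch(29, 31, 19, 21)); // prints "false"
    }
}
```

The naive check reports a hit at a place this person never is. The paired check only works here because the sketch keeps the two lists index-aligned, which is exactly the information the flattened fields don't obviously preserve.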

how do we solve this?

I suppose we could just rely on matching termPosition information, but that 
means the FieldType needs a way to specify the Analyzer for all of the 
field names it creates on the fly (another argument for reusing 
dynamicFields i guess) to specify matching increments -- but that seems 
somewhat brittle: what about complex PolyFieldTypes that want to create a 
variable number of Fields based on the input?

ie: as i recall, if you want to index coordinates of polygon bounding 
boxes using Cartesian grid fields, you need more field names for big 
polygons than you do for small polygons -- so what if someone wants a 
multivalued PolyField and indexes very big and very small polygons? ... 
termPositions doesn't seem like it really cuts it here.
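
A rough illustration of why position matching gets shaky once the number of generated fields varies per value. This is a toy model, not Lucene: `PositionDriftDemo`, the "shape__N" field names and the per-value field counts are all invented, and a list index stands in for a term position that advances by one for each value written to that sub-field.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PositionDriftDemo {
    // Sub-field name -> tokens in the order they were indexed.
    // The list index plays the role of the term position.
    private final Map<String, List<String>> subFields = new LinkedHashMap<>();

    /** Indexes one value that expands into subFieldCount generated sub-fields. */
    void addValue(String label, int subFieldCount) {
        for (int i = 0; i < subFieldCount; i++) {
            subFields.computeIfAbsent("shape__" + i, k -> new ArrayList<>()).add(label);
        }
    }

    /** Returns the token at the given position of a sub-field, or null if there is none. */
    String tokenAt(String subField, int position) {
        List<String> tokens = subFields.get(subField);
        if (tokens == null || position < 0 || position >= tokens.size()) {
            return null;
        }
        return tokens.get(position);
    }

    public static void main(String[] args) {
        PositionDriftDemo doc = new PositionDriftDemo();
        doc.addValue("big", 3);    // writes shape__0, shape__1, shape__2
        doc.addValue("small", 1);  // writes shape__0 only
        doc.addValue("medium", 2); // writes shape__0, shape__1

        // The same position now refers to different values in different sub-fields.
        System.out.println(doc.tokenAt("shape__0", 1)); // prints "small"
        System.out.println(doc.tokenAt("shape__1", 1)); // prints "medium"
    }
}
```

Once "small" skips a sub-field, position 1 means one value in "shape__0" and a different value in "shape__1", so lining positions up across sub-fields would pair pieces of different polygons unless every value somehow advanced the position in every sub-field.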



-Hoss

