lucene-solr-user mailing list archives

From Chris Hostetter <>
Subject Re: copyField limitation
Date Fri, 18 Jan 2008 02:10:57 GMT

: But, the <copyField> directive in the schema has a limitation. It will only
: copy data between fields with the same type. If the two fields are a
: different type, the copy is ignored. This example would require <copyField>
: to translate 'sint' to 'integer'. 

i can't reproduce this problem. with the following additions to the 
example schema...

   <field name="popularityI" type="integer" indexed="true" stored="true" default="0"/>
   <copyField source="popularity" dest="popularityI"/>

...i was able to see, sort, and search on the popularityI field with no 
problems.

: Another case is days (not times):
: This would express the date as a string 2008-xx-xxT00:00:00Z and store that
: into the day field. It is not as optimal as using '2008-xx-xx' but is still
: useful for wildcards.

I'm not entirely sure i understand what you are asking ... but i believe 
your point is that there is no easy way to do a copyField that reformats 
the data (ie: changing date formats, or converting the date to an int) 

In my opinion, this class of situations isn't a limitation of copyField as 
much as it is a silly restriction in the way FieldTypes are handled by 
IndexSchema ... currently "TextField" is a special case because it's the 
only FieldType that can have an analyzer (i'm not even sure where this 
special case logic is ... i thought it was when the IndexSchema is 
initialized, but i can't find it now)

It would be nice if any FieldType could have an analyzer, and as long as 
the token(s) produced by that analyzer met the necessary conditions for 
the data type, things would go on their merry way ... DateReFormatFilters 
could be used to convert from any arbitrary date format to the one Solr 
expects, etc.... you could have a detailedDate field and <copyField> 
from that to a justDate string field that used a PatternReplaceFilter to 
strip off the time.
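
As a rough sketch of that last idea: since TextField is the one type that 
already accepts an analyzer, something close to it can be approximated with 
a TextField-based type instead of a true string field. The type name 
"justDateText", the field definitions, and the regex below are assumptions 
made up for illustration, not part of the example schema.

```xml
<!-- Sketch only: "justDateText", "detailedDate" and "justDate" are hypothetical names. -->
<fieldType name="justDateText" class="solr.TextField">
  <analyzer>
    <!-- keep the whole incoming value as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- drop everything from the "T" onward, so 2008-01-18T02:10:57Z
         is indexed as 2008-01-18 -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="T.*$" replacement="" replace="all"/>
  </analyzer>
</fieldType>

<field name="detailedDate" type="date" indexed="true" stored="true"/>
<field name="justDate" type="justDateText" indexed="true" stored="false"/>
<copyField source="detailedDate" dest="justDate"/>
```

Only the indexed term is shortened this way; if justDate were stored, its 
stored value would still be the full timestamp that was copied in.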

This still wouldn't help change the "stored" value of those fields though 
so that the data would look right when retrieving stored values.

Perhaps we should add an optional hook for mutating the "stored" value of 
a fieldtype as well?  ... it could be an Analyzer (ie: 
tokenizer+filterchain) so that we get reuse of existing concepts, with 
each resulting token being treated as a separate multivalue (for the 
common case of rejoining all the tokens into a single string, we can add a 
StringBufferConcatTokenFilter or something) 


