lucene-solr-user mailing list archives

From Ronald Wood <rw...@smarsh.com>
Subject Re: Is it safe to upgrade an existing field to docvalues?
Date Thu, 25 Aug 2016 13:21:22 GMT
Alessandro, yes I can see how this could be conceived of as a more general problem; and yes
useDocValues also strikes me as being unlike the other properties since it would only be used
temporarily.

We’ve actually had to migrate fields from one to another when changing types, along with
awkward naming like ‘fieldName’ (int) to ‘fieldNameLong’. But I’m not sure how a
change like that could actually be done in place.

The point is stronger when it comes to term vectors etc. where data exists in separate files
and switches in code control whether they are used or not.

Where I would argue that docValues might be different is that so much new functionality
depends on them that they might be worth treating differently. Given that docValues are now
on by default, I wonder if they will at some point be mandatory, in which case everyone would
have to migrate to keep up with Solr versions. (Of course, I don’t know what the general
thinking is on this amongst the implementers.)

Regardless, this change may be so important to us that we’d choose to branch the code on
GitHub and apply the patch ourselves, use it while we transition, and then deploy an official
build once we’re done. The difference in the level of effort between this approach and the
alternatives would be too great. The risks of using a custom build for production would have
to be weighed carefully, naturally.
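
For concreteness, here is a minimal Java sketch of the guard under discussion. It is
only an illustration: FieldFlags and both method names are hypothetical stand-ins, not
Solr's actual classes. The only things taken from this thread are the existing
hasDocValues check and the proposed useDocValues flag.

```java
// Illustrative sketch only: FieldFlags and its methods are hypothetical
// stand-ins for this discussion, not Solr's SchemaField API.
final class FieldFlags {
    // Schema attribute: docValues are written for this field at index time.
    final boolean hasDocValues;
    // Proposed attribute: readers are allowed to rely on those docValues.
    final boolean useDocValues;

    FieldFlags(boolean hasDocValues, boolean useDocValues) {
        this.hasDocValues = hasDocValues;
        this.useDocValues = useDocValues;
    }

    // Behaviour as described in the thread: the schema attribute alone decides.
    boolean readsDocValuesToday() {
        return hasDocValues;
    }

    // Proposed behaviour: both attributes must be set before docValues are read.
    boolean readsDocValuesWithPatch() {
        return hasDocValues && useDocValues;
    }
}
```

During a migration the field would be declared with hasDocValues on and useDocValues
off, so new segments get docValues while reads keep using the old path. Once every
document has been reindexed, useDocValues would be switched on.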

- Ronald S. Wood 


On 8/25/16, 06:49, "Alessandro Benedetti" <abenedetti@apache.org> wrote:

    > switching is done in Solr on field.hasDocValues. The code would be amended
    > to (field.hasDocValues && field.useDocValues) throughout.
    >
    
    This is correct. Currently we use DocValues if they are available, and to
    check the availability we check the schema attribute.
    This can be problematic in the scenarios you described (for example, half
    the index has docValues for a field and the other half does not yet).
    
    Your proposal is interesting.
    Technically it should work and should allow transparent migration from
    non-docValues to docValues.
    But it is a risky one, because we are decreasing the readability a bit
    (although a user will specify the attribute only in special cases like
    yours).
    
    The only problem I see is that the same discussion we had for docValues
    actually applies to all other invasive schema changes :
    1) you change the field type
    2) you enable or disable term vectors
    3) you enable/disable term positions, offsets, etc.
    
    So basically this is actually a general problem that would probably
    require a general rethink.
    So although it can be a quick fix that will work, I fear it can open the road to
    messy configuration attributes.
    
    Cheers
    -- 
    --------------------------
    
    Benedetti Alessandro
    Visiting card : http://about.me/alessandro_benedetti
    
    "Tyger, tyger burning bright
    In the forests of the night,
    What immortal hand or eye
    Could frame thy fearful symmetry?"
    
    William Blake - Songs of Experience -1794 England
    

