lucene-solr-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: loading many documents by ID
Date Fri, 02 Feb 2007 06:10:22 GMT

: 1.  Set the "updateable" fields explicitly in the schema.
: <field name="name" type="text" updateable="true" indexed="true" stored="true"/>
:
: * throw an exception at startup if an updateable field is not stored.
: If somewhere down the road we figure out how to efficiently handle
: unstored fields, we can remove this error.
: * when 'updating', only copy the fields marked 'updateable'
: * If someone sends an 'update' request and there are no fields marked
: updateable, return an error

i have two concerns:

1) regardless of the verb (updatable/modifiable) i'm not sure that it
makes sense to annotate in the schema the fields that should be copied on
update, and not label the fields that must be "set" on update (ie: the
fields that cannot be copied)

2) Solr makes it very easy to support different "classes" of documents
that use different subsets of the fields in the schema -- some of which
may overlap.  if we assume that it's okay to allow an "update" of a
document because there's at least one field in the schema that is stored,
we won't catch cases where that one field isn't used for that "type" of
document.

a simple way to go that wouldn't catch all user mistakes, but could be
relied on to never raise an error incorrectly, would be to assume that any
doc can be "updated" as long as it has at least one stored field -- that's
the simplest possible use case after all: i want to modify a doc in place,
replacing all of the indexed but unstored values with new values, and i only
want the stored fields to be copied over again unchanged.
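
A minimal Java sketch of that simplest rule, assuming a document can be modelled as a plain map of field name to value. The class `StoredFieldUpdate` and its `merge` method are hypothetical names used for illustration only; they are not Solr APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Solr code: documents are plain maps here.
final class StoredFieldUpdate {

    /**
     * Builds the replacement document for an in-place "update".
     * Every stored value of the old document is copied over unchanged,
     * unless the update supplies a new value for that field.
     */
    static Map<String, String> merge(Map<String, String> oldStoredValues,
                                     Map<String, String> newValues) {
        // The only check: the old document needs at least one stored field.
        if (oldStoredValues.isEmpty()) {
            throw new IllegalArgumentException(
                "document has no stored fields, nothing can be copied");
        }
        Map<String, String> merged = new LinkedHashMap<>(oldStoredValues);
        merged.putAll(newValues); // new values win over copied ones
        return merged;
    }
}
```

Unstored values never appear in `oldStoredValues`, so they survive only if the caller sends them again in `newValues`, which is exactly the kind of user mistake this rule does not catch.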

another simple approach would be to make "updatability" a property of the
schema, that can contain a few different values...
 "strict" - indexed and stored are no longer valid field(type)
            attributes -- all fields are indexed and stored. all fields
            are copied on "update" unless the update command includes
            instructions to replace, append or increment the field value
  "loose" - indexed/stored still exist, any attempt to "update" an
            existing document is legal, all stored fields are copied
            on update unless the update command includes instructions
            to replace, append or increment the field value.
   "none" - any attempt to update will fail.

...novice users who want updatability should use strict, more experienced
users who want updatability but smaller index sizes and understand the
issues with fields that are indexed but unstored can use loose.
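
The three values could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the `Mode` enum, the `Field` class and the `fieldsToCopy` method are all hypothetical names, and the sketch only decides which fields get copied, not how replace/append/increment are applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a schema-level "updatability" setting.
final class UpdatabilityModes {

    enum Mode { STRICT, LOOSE, NONE }

    // Minimal stand-in for a schema field; not Solr's own field class.
    static final class Field {
        final String name;
        final boolean stored;

        Field(String name, boolean stored) {
            this.name = name;
            this.stored = stored;
        }
    }

    /**
     * Returns the names of the fields whose old values are copied on update.
     * "modified" holds the fields the update command replaces, appends to
     * or increments; those are never copied.
     */
    static List<String> fieldsToCopy(Mode mode, List<Field> fields,
                                     Set<String> modified) {
        if (mode == Mode.NONE) {
            throw new UnsupportedOperationException(
                "updates are disabled for this schema");
        }
        List<String> copied = new ArrayList<>();
        for (Field f : fields) {
            // strict: the stored attribute is ignored, every field is stored
            boolean stored = (mode == Mode.STRICT) || f.stored;
            if (stored && !modified.contains(f.name)) {
                copied.add(f.name);
            }
        }
        return copied;
    }
}
```

In this model the only difference between strict and loose is whether an unstored field can exist at all, which is where the index size trade-off comes from.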

another approach i don't really have fully fleshed out in my head would be
to introduce a concept of "fieldsets" ... an update that
sets/appends/increments a field in a fieldset, but does not provide a
value for every unstored field in that fieldset, could trigger an error ...
that would help with the different 'classes' of documents, but i'm not
sure if it could really work with dynamicFields.
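
Since the fieldset idea is not fully worked out, the following Java sketch is only one possible reading of it. It assumes a fieldset is a named set of field names and that an update "touches" a fieldset when it names any of its fields. `FieldsetCheck` and `missingUnstoredFields` are hypothetical names, and dynamicFields are not handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a fieldset check; not an existing Solr feature.
final class FieldsetCheck {

    /**
     * For every fieldset the update touches, collects the unstored fields
     * of that fieldset for which the update provides no value. A non-empty
     * result would be the condition that triggers an error.
     */
    static List<String> missingUnstoredFields(Map<String, Set<String>> fieldsets,
                                              Set<String> unstoredFields,
                                              Set<String> updateFields) {
        Set<String> missing = new TreeSet<>(); // sorted for stable messages
        for (Set<String> members : fieldsets.values()) {
            if (Collections.disjoint(members, updateFields)) {
                continue; // update does not touch this fieldset
            }
            for (String field : members) {
                if (unstoredFields.contains(field)
                        && !updateFields.contains(field)) {
                    missing.add(field);
                }
            }
        }
        return new ArrayList<>(missing);
    }
}
```

A field shared by several fieldsets (the overlapping case) pulls in the unstored fields of all of them, which may be stricter than intended for some document classes.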



-Hoss

