Date: Fri, 2 Nov 2012 10:01:13 -0700
Subject: Re: how do u handle "schema" changes ?
From: Erik Pearson <erik@defunweb.com>
To: user@couchdb.apache.org

> i know it may sound self-contradictory for couchdb being schemaless ..
>
> but documents that go into it do have structure/schema.
> And once that changes - simplest example being renaming of some field -
> what's the recipe?
> i know it all from the sql land, but this is different.
>
> update the documents ?
> or make all code - both outside and inside couchdb (views etc) -
> accept/handle both old and new?
>
> i guess the answer is "it depends", still, any suggestions?

I think "it depends" is the right answer: there are so many use cases
that it is very difficult to generalize. For example, one area I've
struggled with is that large collections (millions of documents) take
a long time to rebuild view indexes, so changing the "schema" of
these documents is not really healthy for a live system. There are
several strategies for dealing with just this one use case, and they
depend on things like data usage patterns. I've also found that there
are significant dependencies between the document structure and other
server or client software. These dependencies argue for modifying the
documents off-line and launching the updated database, server, and
client code simultaneously.
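
To make the two options from the question concrete, here is a minimal
sketch in Python of an upgrade step that can serve either one. It
assumes a hypothetical rename of a field "name" to "full_name" and a
"schema_version" field kept in each document; both are conventions
invented for this example, not anything CouchDB provides.

```python
# Hypothetical example: the field "name" was renamed to "full_name".
# "schema_version" is an assumed per-document convention, not a
# CouchDB feature; documents without it are treated as version 1.
CURRENT_VERSION = 2


def needs_upgrade(doc):
    """True if the document is still in an older shape."""
    return doc.get("schema_version", 1) < CURRENT_VERSION


def upgrade(doc):
    """Return a copy of doc in the current shape; the input is not modified."""
    doc = dict(doc)
    version = doc.get("schema_version", 1)
    if version < 2:
        # v1 -> v2: rename "name" to "full_name"
        if "name" in doc:
            doc["full_name"] = doc.pop("name")
        version = 2
    doc["schema_version"] = version
    return doc
```

Code that reads documents can call upgrade() on whatever it fetches
and so accept both old and new shapes. An off-line batch job can
instead run the same function over every document for which
needs_upgrade() is true and write the results back.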

Still, I find that the flexibility of free-form documents, the
run-time availability of the system even while reindexing, and the
ease of replication and synchronization make CouchDB changes much
more tolerable and malleable than changes to SQL relational databases.

I don't have experience with JSON Schema per se, but I have a feeling
that something like it will be important to formalizing the process of
document database integrity and evolution. It does seem a bit contrary
in these early days of document databases to think of tying documents
to schema definitions (yuck, starts to be all xml-ish), but it does
seem like a natural progression, both for ensuring database longevity
and for other areas like data sharing and archiving.
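
As a rough illustration of what tying documents to a definition could
look like, here is a small hand-rolled check in Python. It is not JSON
Schema, just a sketch of the idea; the PERSON_SHAPE description and
its field names are hypothetical.

```python
# Hypothetical description of the expected document shape:
# field name -> required Python type.
PERSON_SHAPE = {"full_name": str, "schema_version": int}


def shape_errors(doc, shape):
    """Return a list of problems; an empty list means the document fits."""
    errors = []
    for field, expected in shape.items():
        if field not in doc:
            errors.append("missing field: " + field)
        elif not isinstance(doc[field], expected):
            errors.append("wrong type for field: " + field)
    return errors
```

A check like this could run before a document is saved, or over a
whole database after a migration to find documents that were missed.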

Cheers,
Erik.

>
> ciao
> svilen