incubator-couchdb-user mailing list archives

From Guby <guby.m...@gmail.com>
Subject My CouchDB feature wish number 1: partial updating
Date Mon, 05 May 2008 17:58:48 GMT
Hi guys!

Having schema-less documents, as in CouchDB, opens up a lot of cool
things, as we all know. You can, for example, store all sorts of
related data in one document, and different documents can store
different amounts and types of data.

In theory this is all great, but in practice I have run into two
problems:

1. To make a small change to a document, I have to load ALL its data
(which is a huge overhead for big documents) so I can store back the
complete document with the change.
2. When several processes perform small updates on the same document,
I get a lot of conflict errors.

In practice this has led me to store my data in numerous smaller
documents and to store their relationships as parameters holding the
ID of the parent object.
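
As a rough illustration of that workaround, here is a minimal Python sketch of small documents that point at a parent. The field names `parent_id` and `type`, the child IDs, and the helper `children_by_parent` are assumptions made up for this example; CouchDB does not require any of them.

```python
from collections import defaultdict

# Hypothetical layout: each horse is its own small document and refers
# to the parent document through an assumed "parent_id" field.
docs = [
    {"_id": "foo", "type": "stable", "pizzas_eaten": 15},
    {"_id": "horse-kaspar", "type": "horse", "parent_id": "foo",
     "name": "kaspar", "races_won": 10},
    {"_id": "horse-greg", "type": "horse", "parent_id": "foo",
     "name": "greg", "races_won": 0},
]


def children_by_parent(documents):
    """Group child documents by the ID of the parent they reference."""
    grouped = defaultdict(list)
    for doc in documents:
        parent = doc.get("parent_id")
        if parent is not None:
            grouped[parent].append(doc)
    return dict(grouped)


horses_of_foo = children_by_parent(docs)["foo"]
```

Each small document can then be updated on its own, at the cost of reassembling the whole object on the client.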

If partial updating could be implemented it would solve all this! I  
have no idea how hard this would be to implement for you guys, but  
from my side I would like it to work something like this:

We have the following document stored on the server:

{
	_id: "foo",
	revision: "123",
	data: {
		days: [1,2,3,4,5],
		horses: [{
			name:"kaspar",
			races_won: 10
		},
		{
			name:"greg",
			races_won: 0
		}]
	},
	pizzas_eaten: 15
};

We could have two processes working on the document:

Process 1 changes the number of pizzas eaten by sending back the id of  
the document it wants to change and the current revision it is at  
along with the changed data like this:

PUT {
	_id: "foo",
	revision: "123",
	_update: {
		pizzas_eaten: 20	
	}
}

and gets back the new revision number 234

Process 2, which is still at revision 123, can change the value of
data.days without getting any conflicts by PUTing the following data:

PUT {
	_id: "foo",
	revision: "123",
	_update: {
		data.days: [1,2,3,4,5,6]
	}
}

and gets back the new revision number 345

Now if Process 1 tries to update the data.days parameter like this:

PUT {
	_id: "foo",
	revision: "234",
	_update: {
		data.days: [1,2,3,4,5,6,7,8,9,0]
	}
}

it will get a conflict error because the data.days value has been
changed since revision 234 by the other process (the value of
data.days is at the newer revision 345).

You could add new parameters as well:

PUT {
	_id: "foo",
	revision: "234",
	_update: {
		pizzas_eaten_on_avarage_a_day: 0.01
	}
}

Updating a value that doesn't exist would add it.

You could also remove/delete values and rearrange documents:

PUT {
	_id: "foo",
	revision: "456",
	_update: {
		pizzas: {
			eaten: 20,
			daily_avarage: 0.01
		}
	},
	_remove: [
		"pizzas_eaten_on_avarage_a_day",
		"pizzas_eaten"
	]
}

The document would now look like this:

{
	_id: "foo",
	revision: "567",
	data: {
		days: [1,2,3,4,5,6],
		horses: [{
			name:"kaspar",
			races_won: 10
		},
		{
			name:"greg",
			races_won: 0
		}]
	},
	pizzas: {
		eaten: 20,
		daily_avarage: 0.01
	}
};
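
The effect of `_update` and `_remove` on the document body can be sketched in a few lines of Python. This is only an illustration of the proposed semantics, not anything CouchDB implements: the function name `apply_partial_update` is hypothetical, dotted keys such as "data.days" are assumed to address nested values, and revision handling is left out entirely.

```python
import copy


def apply_partial_update(doc, update=None, remove=None):
    """Return a copy of doc with `update` merged in and `remove` keys dropped.

    Keys may be dotted paths such as "data.days" (an assumed convention).
    Missing intermediate objects are created on update.
    """
    result = copy.deepcopy(doc)
    for path, value in (update or {}).items():
        *parents, leaf = path.split(".")
        target = result
        for key in parents:
            target = target.setdefault(key, {})
        target[leaf] = copy.deepcopy(value)
    for path in (remove or []):
        *parents, leaf = path.split(".")
        target = result
        for key in parents:
            target = target.get(key)
            if not isinstance(target, dict):
                break  # path does not exist, nothing to remove
        else:
            target.pop(leaf, None)
    return result


before = {
    "_id": "foo",
    "data": {"days": [1, 2, 3, 4, 5, 6]},
    "pizzas_eaten": 20,
    "pizzas_eaten_on_avarage_a_day": 0.01,
}
after = apply_partial_update(
    before,
    update={"pizzas": {"eaten": 20, "daily_avarage": 0.01}},
    remove=["pizzas_eaten_on_avarage_a_day", "pizzas_eaten"],
)
```

Here `after` keeps `_id` and `data` untouched, gains the nested `pizzas` object, and loses the two old counters, while `before` is left unmodified.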


The database server would have to keep track of the revision at which
each value was last changed, though... that might be cumbersome...
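
A minimal in-memory sketch of that bookkeeping, replaying the two-process scenario from earlier in this mail. Everything here is an assumption for illustration: the class `FieldVersionedDoc`, the `ConflictError` exception, integer revisions counting up from 1 (instead of "123", "234", ...), and tracking only flat keys. It says nothing about how CouchDB stores revisions internally.

```python
class ConflictError(Exception):
    """Raised when a field changed after the revision the client is based on."""


class FieldVersionedDoc:
    def __init__(self, fields):
        self.rev = 1
        self.fields = dict(fields)
        # Revision at which each field was last written.
        self.field_rev = {name: self.rev for name in self.fields}

    def update(self, base_rev, changes):
        """Apply changes made against base_rev and return the new revision."""
        # Reject the whole update if any touched field is newer than base_rev.
        for name in changes:
            if self.field_rev.get(name, 0) > base_rev:
                raise ConflictError(f"{name!r} changed after revision {base_rev}")
        self.rev += 1
        for name, value in changes.items():
            self.fields[name] = value
            self.field_rev[name] = self.rev
        return self.rev


doc = FieldVersionedDoc({"data.days": [1, 2, 3, 4, 5], "pizzas_eaten": 15})

rev_p1 = doc.update(1, {"pizzas_eaten": 20})                # process 1
rev_p2 = doc.update(1, {"data.days": [1, 2, 3, 4, 5, 6]})   # process 2, still at rev 1

try:
    # Process 1 is based on rev_p1, but data.days was rewritten at rev_p2.
    doc.update(rev_p1, {"data.days": [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]})
    conflicted = False
except ConflictError:
    conflicted = True
```

The cost is one extra revision number per tracked value, which is the overhead the paragraph above worries about; removed fields would need their own bookkeeping that this sketch skips.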

It would greatly improve CouchDB's usability in my case though!

Let me know what you think!

Best regards
Sebastian

