incubator-couchdb-user mailing list archives

From Jens Alfke <j...@couchbase.com>
Subject Re: Bigcouch vs couchbase
Date Thu, 27 Mar 2014 14:24:06 GMT

On Mar 26, 2014, at 8:25 PM, Stanley Iriele <siriele2x3@gmail.com> wrote:

> You said couchbase doesn't have MVCC ? All docs say That it
> uses couchDB MVCC append only under the good on a single node…

I’ve read that second line four times and I can’t figure out what the heck it means. “Under
the good”?
Oh wait, you mean “under the hood”? Yes, the nodes use CouchDB databases for persistence.
But the MVCC semantics aren’t exposed. See below.

> Could you elaborate a tad on what you mean by doesn't have MVCC?

MVCC stands for “Multi-Version Concurrency Control”. Wikipedia:

“…each user connected to the database sees a snapshot of the database at a particular
instant in time. Any changes made by a writer will not be seen by other users of the database
until the changes have been completed (or, in database terms: until the transaction has been
committed.)
When an MVCC database needs to update an item of data, it will not overwrite the old data
with new data, but instead mark the old data as obsolete and add the newer version elsewhere.”

Couchbase Server doesn’t store multiple versions of a document*. Nor does it use snapshots
of the database state.
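
To make the distinction concrete, here is a minimal Python sketch, not taken from CouchDB or Couchbase code. The class names `MVCCStore` and `OverwriteStore` and all of their methods are hypothetical, invented for illustration. The first keeps every version and lets a reader pin a snapshot; the second just overwrites in place, which is closer to what the key-value side of Couchbase Server exposes.

```python
class MVCCStore:
    """Toy multi-version store (hypothetical): writes append, readers pin a snapshot."""

    def __init__(self):
        self._versions = {}  # key -> append-only list of (commit_seq, value)
        self._seq = 0        # sequence number of the last committed write

    def put(self, key, value):
        # Never overwrite: add a newer version alongside the old ones.
        self._seq += 1
        self._versions.setdefault(key, []).append((self._seq, value))
        return self._seq

    def snapshot(self):
        # A snapshot is just "everything committed up to this sequence number".
        return self._seq

    def get(self, key, snapshot=None):
        if snapshot is None:
            snapshot = self._seq
        # Walk versions newest-first and return the latest one visible to the snapshot.
        for seq, value in reversed(self._versions.get(key, [])):
            if seq <= snapshot:
                return value
        return None


class OverwriteStore:
    """Toy single-version store (hypothetical): one value per key, no snapshots."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value  # the old value is gone as far as readers can tell

    def get(self, key):
        return self._data.get(key)
```

With `MVCCStore`, a reader that took a snapshot before a later `put` keeps seeing the older value; with `OverwriteStore`, every reader sees only the latest write.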

The key-value storage part of Couchbase Server is *NOTHING* like CouchDB. At all. Stop thinking
of them as being related. It inherits from memcached, which is a distributed in-memory cache
engine.

The CouchDB heritage of Couchbase Server is used for two things: (a) _asynchronously_ writing
the documents to a CouchDB b-tree for persistence, and (b) indexing that b-tree with map-reduce
views that can be queried almost exactly like CouchDB's.

Each node has an in-memory key->value map that’s read and updated by clients. Note: writes
are made directly in RAM. (This is part of the “insane speed” thing.) A parallel task
collects all the updated values and writes them to a CouchDB-compatible b-tree file. Another
parallel task sends the changes to the neighboring nodes so they can keep backups in case
this node goes down.

So when client A does a Put and then client B does a Get, client B gets the value from RAM
that client A wrote there a few microseconds earlier. The database file isn’t involved at all.
No MVCC.
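
The flow described above can be sketched roughly as follows. This is an illustrative Python toy under stated assumptions, not Couchbase's implementation: `WriteBehindNode`, `disk_log`, and `wait_until_persisted` are hypothetical names, a plain list stands in for the append-only b-tree file, and replication to neighboring nodes is left out entirely.

```python
import queue
import threading


class WriteBehindNode:
    """Toy node (hypothetical): clients read and write RAM; a background task persists."""

    def __init__(self):
        self.ram = {}                # in-memory key -> value map served to clients
        self.disk_log = []           # stand-in for the append-only persistence file
        self._dirty = queue.Queue()  # updates waiting to be written out
        worker = threading.Thread(target=self._persist_loop, daemon=True)
        worker.start()

    def put(self, key, value):
        self.ram[key] = value          # visible to every reader immediately
        self._dirty.put((key, value))  # persistence happens later, asynchronously

    def get(self, key):
        return self.ram.get(key)       # served from RAM; the log is never consulted

    def _persist_loop(self):
        # Parallel task: drain queued updates and append them to the log.
        while True:
            key, value = self._dirty.get()
            self.disk_log.append((key, value))
            self._dirty.task_done()

    def wait_until_persisted(self):
        # Helper for demos and tests only: block until the queue is drained.
        self._dirty.join()
```

In this sketch a `get` right after a `put` returns the new value whether or not the persister has caught up. The old value still ends up in `disk_log`, but nothing in the read path can reach it.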

This is getting long, but I want to add that in my experience if you try to use Couchbase
Server as though it were CouchDB, you’ll get very frustrated; viewed that way it feels
primitive and unreliable. You have to treat it as its own thing and accept that you’re making
trade-offs for performance/scalability, and must use different hammers to solve your problems.
(And of course, if you want CouchDB-style semantics, you can add the Couchbase Sync Gateway.
But you won’t get the same performance as raw Couchbase.)

—Jens

* OK, technically the database file used for persistence contains multiple versions of the
document since it’s append-only. But the old versions are never accessed, and there’s
no way to get to them through the API, so they may as well not exist.