couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jens Alfke <j...@mooseyard.com>
Subject Re: Frugal Erlang vs Resources Hungry CouchDB
Date Thu, 30 Jun 2011 16:29:33 GMT

On Jun 29, 2011, at 7:00 PM, sleepnova wrote:

> I think what many people really concerned is the growing pattern of size as
> number of docs increase. (space complexity)
> (If it grows exponentially then that's not a good sign.)

It’s basically linear, assuming the database gets compacted periodically. The file format
is a B-tree, like most other databases, so the extra space for interior nodes is going to
be O(log n). Views, like traditional indexes, also occupy B-tree nodes, so depending on how
many of those you have, they’ll occupy some extra space, but also probably a lot less than
the documents themselves.

It sounds like append-only writing and compaction are confusing to some people. They’re
not really very complicated. If you have some familiarity with garbage collection, CouchDB
works almost exactly like a copying collector[1]: new objects are allocated simply by bumping
a pointer, and collection works by copying the live objects into a new space, then discarding
the old one. By contrast, most other databases work like a regular memory allocator: freeing
obsolete objects in place, keeping a map of free space, and reallocating that space to new
objects later.

—Jens

[1] http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Copying_vs._mark-and-sweep_vs._mark-and-don.27t-sweep
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message