couchdb-user mailing list archives

From Brian Candler <B.Cand...@pobox.com>
Subject Re: Current CouchDB state?
Date Wed, 28 Jan 2009 09:05:20 GMT
On Tue, Jan 27, 2009 at 01:30:37PM -0800, Chris Anderson wrote:
> > - To get such high performance, is it necessary to use _bulk_docs, or was
> >  it achieved with regular PUT operations?
> 
> This was done with a pure Erlang interface, and bulk size of 1k docs.
> The HTTP interface may add minimal overhead if your json is not
> complex and you use bulk_docs

Aha, using a not-yet-published API is cheating :-) But I expect HTTP
bulk_docs will come close.

The trouble with bulk_docs here is that it's an all-or-nothing interface.
For example: suppose I collect RADIUS packets for 100ms before writing them,
and I create the new docs in the recommended way, fetching ids from the
_uuids call first and then submitting the batch (sketched below). If any of
the uuids conflicts, the whole batch will fail, and I'll have to patch it up
and resubmit the whole lot.
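
Concretely, the flow I mean looks something like this (a sketch using the
inets httpc client; the JSON handling is hand-waved and "radius" is a
made-up database name):

    %% Fetch server-generated ids, then submit the whole batch at once.
    ok = inets:start(),
    {ok, {{_, 200, _}, _, UuidJson}} =
        httpc:request("http://localhost:5984/_uuids?count=2"),
    %% UuidJson is {"uuids":[...]} -- splice the ids into the batch with
    %% your JSON library; hardcoded here for brevity:
    Batch = "{\"docs\":[{\"_id\":\"UUID-1\",\"packet\":\"...\"},"
            "{\"_id\":\"UUID-2\",\"packet\":\"...\"}]}",
    {ok, {{_, _Code, _}, _, _Resp}} =
        httpc:request(post,
                      {"http://localhost:5984/radius/_bulk_docs",
                       [], "application/json", Batch},
                      [], []),
    %% If any _id already exists, the whole POST fails and the batch has
    %% to be patched up and resubmitted.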

It's not a big problem in this case, but it means it's not practical to
write a generic "HTTP batching proxy" which sits in front of couchdb and
batches arbitrary requests from a mixture of clients.

(Actually, authorization would be a problem anyway, once validate_doc_update
is widely used)

> > - Does Couchdb commit its data to stable storage *before* returning a HTTP
> >  response? That is, once you receive a HTTP success response, you can be
> >  sure that the data has already hit the disk?
> 
> There is a header you can send which forces a full fsync before it
> returns. In the default case, it only returns after writing the file

Looking through the source, I guess you are talking about
X-Couch-Full-Commit: true
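
Something like this, presumably (assuming the header is honoured on a
plain document PUT):

    %% Ask CouchDB for a full fsync before it answers this one request.
    {ok, {{_, 201, _}, _, _}} =
        httpc:request(put,
                      {"http://localhost:5984/radius/some-doc",
                       [{"X-Couch-Full-Commit", "true"}],
                       "application/json", "{\"type\":\"test\"}"},
                      [], []).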

However, I don't understand this in the light of what's written at
http://couchdb.apache.org/docs/overview.html

"When CouchDB documents are updated, all data and associated indexes are
flushed to disk and the transactional commit always leaves the database in a
completely consistent state. Commits occur in two steps:

   1. All document data and associated index updates are synchronously
      flushed to disk.
   2. The updated database header is written in two consecutive, identical
      chunks to make up the first 4k of the file, and then synchronously
      flushed to disk."

So it seems we have write - sync - write - sync, and a quick look through
the source shows this in couch_file:write_header/3.
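
If I've read it right, the shape of the thing is roughly this (my
paraphrase, not the actual couch_file code):

    %% Two-step commit, paraphrased: data first, then the duplicated header.
    commit(Fd, DocData, Header) ->
        ok = file:write(Fd, DocData),      % append docs + index updates
        ok = file:sync(Fd),                % step 1: data on stable storage
        %% two identical header copies make up the first 4k of the file
        ok = file:pwrite(Fd, 0, [Header, Header]),
        ok = file:sync(Fd).                % step 2: header on stable storage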

What I don't yet understand is:
- at what point is the HTTP response sent back to the client?
- at what point is this commit cycle done?
- if multiple clients are PUTting individual documents, does a
  write-sync-write-sync take place for each one?

Again looking at the source, I see two very different paths in couch_db.erl
depending on the presence of that magic header:

update_docs(Db, Docs, Options, false) -> ...
update_docs(Db, Docs, Options, true) -> ...

Both send an update_docs message to the UpdatePid process, but I then got a
bit lost in update_docs_int.

> (but trusts the OS -- which usually lies in the interest of speed.)

Do you mean it carries on after write() returns? The OS doesn't lie; it
just says, truthfully, that it has stuck your data into dirty buffers in
the VFS :-)
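
In Erlang terms, durability only arrives with the explicit sync:

    {ok, Fd} = file:open("data.bin", [write, raw, binary]),
    ok = file:write(Fd, <<"payload">>),  % in dirty VFS buffers from here...
    ok = file:sync(Fd),                  % ...on stable storage only after this
    ok = file:close(Fd).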

> Couch uses Erlang's term_to_binary for saving, which I believe uses
> gzip by default. This is worth verifying, it's been a while since I
> toured that part of the source.

I found it: couch_file:append_term/2 calls

    term_to_binary(Term, [compressed])

The term_to_binary/2 docs say that [compressed] means {compressed,6} by
default, i.e. the same compression level as gzip -6.
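
Easy to check in the shell (sizes will vary with the term, of course):

    Doc = lists:duplicate(100, {field, <<"some repetitive value">>}),
    Plain  = byte_size(term_to_binary(Doc)),
    Packed = byte_size(term_to_binary(Doc, [compressed])),  % = {compressed,6}
    true = Packed < Plain.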

It also mentions that you might want to add {minor_version,1}, which stores
floats as 64-bit IEEE doubles rather than as text. That gives both more
accuracy and smaller storage; otherwise, if json_decode parses JSON floats
into IEEE doubles, they just get converted back to text when they hit the
disk.
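
For example (minor version 0 writes FLOAT_EXT, a 31-byte text encoding;
minor version 1 writes an 8-byte IEEE double):

    Text = term_to_binary(3.14159, [{minor_version, 0}]),  % text float
    Ieee = term_to_binary(3.14159, [{minor_version, 1}]),  % 64-bit double
    true = byte_size(Ieee) < byte_size(Text).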

Cheers,

Brian.
