incubator-couchdb-user mailing list archives

From Brian Candler <B.Cand...@pobox.com>
Subject Re: Couch as a mail store?
Date Thu, 12 Feb 2009 16:19:18 GMT
On Thu, Feb 12, 2009 at 04:13:45PM +0200, Kenneth Kalmer wrote:
> How would couch fare as a backend for a mail delivery system (in concept)?
> Considering you need high availability and very fast IO. Documents (email
> messages) will be created and deleted very often, some almost
> instantaneously.

I was thinking about this too, and I think it would perform well. You could,
for example, store the headers and IMAP flags in the JSON document and the
body as an attachment (or as multiple attachments, one per MIME part).
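
As a rough sketch of that layout (not a tested design), the snippet below
turns a raw message into a single document: headers and flags as plain JSON
fields, and each non-multipart MIME part as an inline attachment using
Couch's `_attachments` convention of base64-encoded `data`. The function name
`message_to_doc`, the `type` field, and the `part-N` attachment names are
assumptions made up for illustration.

```python
import base64
from email import message_from_bytes


def message_to_doc(raw, flags=()):
    """Build a CouchDB-style document dict from one raw RFC 822 message."""
    msg = message_from_bytes(raw)

    # Headers can repeat (e.g. Received:), so keep a list per header name.
    headers = {}
    for name, value in msg.items():
        headers.setdefault(name.lower(), []).append(value)

    doc = {
        "type": "message",          # hypothetical field, handy for views
        "headers": headers,
        "flags": sorted(flags),     # IMAP flags such as \Seen
        "_attachments": {},
    }

    # One inline attachment per leaf MIME part.
    for i, part in enumerate(msg.walk()):
        if part.is_multipart():
            continue
        payload = part.get_payload(decode=True) or b""
        doc["_attachments"][f"part-{i}"] = {
            "content_type": part.get_content_type(),
            "data": base64.b64encode(payload).decode("ascii"),
        }
    return doc
```

Keeping the flags in the JSON body means a flag change rewrites only the
small document, though whether the attachment data is copied on each update
is something to check against your Couch version.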

I can see a few things which need working through.

(1) One of the key performance issues with real-world SMTP servers is the
need to fsync() the file to disk before sending back an acknowledgement, in
order to guarantee delivery.

I asked about this recently, and it seems that Couch takes an optimistic
view: it hands the data to the OS but doesn't fsync unless you explicitly
ask it to. There is a special HTTP header you can provide for this. If you
don't, then your database won't be corrupted if the plug is pulled, but it
may be missing the most recent writes.

Of course, SMTP clients don't mind a small delay before they get their 250
OK at the end of the message; you can therefore write messages in batches
and do a single fsync() every second or so, as long as you remember not to
send the acknowledgement back to each client until *after* the fsync
covering its message has completed.
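
The batching idea can be sketched roughly as below. This is a single-threaded
illustration, not a real SMTP server: `GroupCommitter`, `write`, `sync` and
`ack` are all hypothetical names. `write` stands for storing one document
without forcing it to disk, and `sync` for whatever forces a commit (with
Couch I believe this would be the `X-Couch-Full-Commit: true` header or a
POST to `_ensure_full_commit`, but check that against your version).

```python
class GroupCommitter:
    """Collect writes, commit them with one sync, then release the acks."""

    def __init__(self, write, sync):
        self._write = write        # hypothetical: store one doc, no fsync
        self._sync = sync          # hypothetical: force all writes to disk
        self._pending_acks = []

    def submit(self, doc, ack):
        """Write the document now; hold its '250 OK' until the next flush."""
        self._write(doc)
        self._pending_acks.append(ack)

    def flush(self):
        """Sync once, then acknowledge every message the sync covered."""
        acks = list(self._pending_acks)
        if not acks:
            return 0
        self._sync()               # if this raises, nobody is acknowledged
        del self._pending_acks[:len(acks)]
        for ack in acks:
            ack()
        return len(acks)
```

Calling `flush()` from a timer every second or so bounds the extra latency
each client sees, while the number of fsyncs no longer grows with the number
of messages.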

(2) Couch won't let you write a document to disk in chunks; if it doesn't
get an up-front Content-Length: header then it will buffer the whole thing
in RAM.

So if you receive very large E-mail messages, you may wish to buffer them
locally (e.g. in a tempfile on another disk) before sending them as a single
document to Couch.

Note that you sometimes get an indication of the message size in an SMTP
transaction, but it's not guaranteed to be accurate; so you won't know the
true size until you've read it in.
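
A minimal sketch of the local buffering, assuming the message arrives as an
iterable of byte chunks: small messages stay in RAM, large ones spill to a
temporary file, and the true size is counted as the data is read so that it
can be sent as Content-Length. `spool_message` and `max_in_memory` are names
invented for this example.

```python
import tempfile


def spool_message(chunks, max_in_memory=1024 * 1024):
    """Buffer an incoming message and return (file_object, exact_size)."""
    # SpooledTemporaryFile keeps data in memory until it grows past
    # max_size, then transparently moves it to a file on disk.
    spool = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    size = 0
    for chunk in chunks:
        spool.write(chunk)
        size += len(chunk)
    spool.seek(0)                  # rewind so the caller can stream it out
    return spool, size
```

To put the spill file on another disk, as suggested above, the `dir`
argument of `SpooledTemporaryFile` can point at a directory there.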

(3) As you say, messages are stored and deleted frequently. You may end up
having to compact your message store often, which basically means reading
the whole store from start to end and rewriting the live data to a new file.
This has to be done when the write load isn't too high; otherwise compaction
may never catch up with the incoming writes and so never complete.
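
Compaction is triggered per database with a POST to its `_compact` resource.
The sketch below only builds that request, so that a scheduler of your own
can send it during a quiet period; the server address, the database name
`mail` and the helper name `build_compact_request` are assumptions, and the
Content-Type header is included because some Couch versions appear to
require it.

```python
import urllib.request


def build_compact_request(base_url, db):
    """Return a POST request for <base_url>/<db>/_compact (not yet sent)."""
    url = f"{base_url.rstrip('/')}/{db}/_compact"
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Example (assumed local server and database name):
#   req = build_compact_request("http://localhost:5984", "mail")
#   urllib.request.urlopen(req)   # send it when write load is low
```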

Regards,

Brian.
