incubator-couchdb-user mailing list archives

From Kenneth Kalmer <kenneth.kal...@gmail.com>
Subject Re: Couch as a mail store?
Date Fri, 13 Feb 2009 06:20:57 GMT
On Fri, Feb 13, 2009 at 8:01 AM, Brian Candler <B.Candler@pobox.com> wrote:

> > > Of course, SMTP clients don't mind a small delay before they get their
> > > 250 OK at the end of the message; you can therefore write a number of
> > > batches and do an fsync() every second or so, as long as you remember
> > > not to send the acknowledgement back to each client until *after* the
> > > fsync has completed.
> >
> > CouchDB does the fsync()-a-second currently.
>
> But I don't think this is currently exposed to the client. You'd need a
> "wait until next fsync has completed" request, to let you know that it's
> safe to send back a 250 OK to the client.
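
The delayed acknowledgement described in the quoted text can be sketched roughly as follows. This is only an illustration against a plain append-only file, not CouchDB itself; `GroupCommitLog`, `append` and `sync` are hypothetical names and do not correspond to any CouchDB API.

```python
import os
import threading


class GroupCommitLog:
    """Append messages to one file; release waiters only after an fsync."""

    def __init__(self, path):
        self._file = open(path, "ab")
        self._lock = threading.Lock()
        self._pending = []  # events for writes not yet covered by an fsync

    def append(self, data: bytes) -> threading.Event:
        """Buffer a message and return an event that is set once it is durable."""
        done = threading.Event()
        with self._lock:
            self._file.write(data)
            self._pending.append(done)
        return done

    def sync(self) -> int:
        """Flush and fsync once, then wake every writer covered by that fsync."""
        with self._lock:
            batch, self._pending = self._pending, []
            self._file.flush()
            os.fsync(self._file.fileno())
        for done in batch:
            done.set()
        return len(batch)

    def close(self):
        self.sync()
        self._file.close()
```

In this sketch an SMTP handler would call `append(message).wait()` and only then send its 250 OK, while a separate timer thread calls `sync()` once a second or so.
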
>
> A couple of other thoughts:
>
> (4) E-mail systems create and delete millions of documents per day. Is it
> true that couchdb keeps a small amount of data around indefinitely for every
> deleted document, for replication purposes? If so this would keep growing.
> (And whilst I've seen discussion about pruning old _revs, I've not seen
> discussion of pruning deleted documents)


Very good point. Compaction will remove the document data, but not the revs
(unless I'm mistaken). I'm putting my neck on the block here, just for
comparison, but MySQL's log-replay-style replication avoids this issue. I
also want to state that I know the two can't reasonably be compared, because
couch's replication addresses a lot of other issues. Never mind, I didn't
write the last sentence...


> (5) I believe the IMAP protocol makes some RDBMS-type demands on a
> mailstore, in particular the allocation of contiguous sequence numbers to
> each message. Some careful thought would be needed on how to do this, as it
> may make full replication difficult - or at least increase the likelihood of
> replication conflicts, so would benefit from the planned feature of being
> able to specify custom conflict-resolution logic.


Good point, but how do IMAP servers currently handle this? I don't think
it's the MTA's job to 'tag' the mails specially. The only thing I can think
of is that the IMAP servers load the mails ordered by date and manage their
own metadata for things like 'read'. I have to admit I haven't really
explored the inner workings of the IMAP protocol... I guess one could have
extra attributes in the docs for the IMAP requirements, but those should be
manipulated by the IMAP server directly. From my little work with Ruby's
Net::IMAP library I've seen that the mails have unique IDs, which could
probably be the _id attribute?

As for the sequence, don't POP clients also request the number of messages
in the store and then pop them one by one (RETR N)? This would effectively
be the same problem for both, and having the results ordered by date and
then letting POP/IMAP handle the rest of the logic seems quite couch-ish.
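
A minimal sketch of that idea, assuming each mail document carries an ISO 8601 `date` string next to its `_id` (the `date` field and both function names are made up for illustration). Sequence numbers are derived from the date-ordered result at read time rather than stored in the documents, so nothing here needs to be replicated. It does not attempt to cover IMAP's persistent UIDs.

```python
from operator import itemgetter


def number_messages(docs):
    """Return (sequence_number, doc) pairs, ordered by date and then _id."""
    # ISO 8601 timestamps in one time zone sort correctly as plain strings;
    # _id breaks ties so the order is stable across calls.
    ordered = sorted(docs, key=itemgetter("date", "_id"))
    return list(enumerate(ordered, start=1))


def retr(docs, n):
    """Look up message n (1-based), the way a POP3 'RETR n' would."""
    numbered = number_messages(docs)
    if not 1 <= n <= len(numbered):
        raise IndexError(f"no such message: {n}")
    return numbered[n - 1][1]
```

With three documents in a mailbox, `len(number_messages(docs))` gives the message count a POP client asks for first, and `retr(docs, 1)` returns the oldest mail.
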


> (6) Another planned/not-yet-complete feature which would be very useful for
> a mail store is the document-level access control logic, which could for
> example be used for IMAP ACLs and shared mailboxes.
>
>
ATM I'm mostly fine with this logic staying in the application. Even in my
current MySQL projects we handle authorization in the application itself.
But I understand that if you want to expose your couch instance directly
to the world these things become an issue.

Ciao


-- 
Kenneth Kalmer
kenneth.kalmer@gmail.com
http://opensourcery.co.za
