couchdb-user mailing list archives

From Mike Fedyk <mfe...@mikefedyk.com>
Subject How to keep from sending more than one email from multiple replicated couchdb instances
Date Mon, 15 Nov 2010 01:01:20 GMT
node.js + CouchDB == Crazy Delicious by Mikeal Rogers
http://jsconf.eu/2010/speaker/nodejs_couchdb_crazy_delicious.html

I was watching this a couple of days ago and I've been thinking about how
to deal with instance and service (think of sending emails as a
"service") failures.  It's easy to make sure that only one email is
sent if you only have one server sending emails, but if that machine
fails, then no emails get sent out at all.

You compose an email while offline and save it to your local couch
instance.  Then later it gets replicated to one of the couchdb
instances in your cloud.  And then:

1. You have the date when it was saved on the phone, etc.  If you also
had a timestamp for when that replication happened, you could have a
chain of couchdb instances try to send the email, each one only after
a set delay since the document was replicated to your cloud of couchdb
instances.  instance_a would try immediately, instance_b would try if
the email hasn't been taken after X minutes, and so on for instance_c.
See [A].

2. When instance_a wants to send the email, it updates the state to
"taking" and then waits for instance_b and instance_c to ack the
taking by adding fields to the current document.  But instance_b and
instance_c will race more often than not and you'll get a conflict, so
the acks need to go in separate temporary state-tracking documents.
You still need [A]; otherwise, if there are no other instances, you'll
wait forever for acks that won't happen.

3. You have one instance that sends emails and you deal with the
downtime if that instance fails or some other failure happens that
prevents email from being sent.

4. You send periodic test emails to make sure they are being sent, and
if they are not, then take over the function on instance_$self.  See
[B].
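
As a rough sketch of the staggered takeover in option 1 (Python here,
not anything from the talk): each instance checks whether enough time
has passed since replication before it tries to send.  The field names
"state" and "replicated_at", the instance ordering, and the 5 minute
value for X are all assumptions made up for illustration; CouchDB does
not record a replication timestamp on its own.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed ordering; an instance's position decides how long it waits.
INSTANCE_ORDER = ["instance_a", "instance_b", "instance_c"]
# The "X" from option 1; 5 minutes is an arbitrary assumed value.
TAKEOVER_DELAY = timedelta(minutes=5)


def may_send(doc, instance, now):
    """Return True if `instance` may try to send the email in `doc` at `now`.

    `doc` is a plain dict standing in for the email document, with the
    assumed fields "state" and "replicated_at" (an ISO 8601 string).
    """
    if doc.get("state") != "unsent":
        return False  # already taken or sent by someone else
    rank = INSTANCE_ORDER.index(instance)
    replicated_at = datetime.fromisoformat(doc["replicated_at"])
    # instance_a waits 0 * X, instance_b waits 1 * X, instance_c waits 2 * X.
    return now >= replicated_at + rank * TAKEOVER_DELAY


doc = {"state": "unsent", "replicated_at": "2010-11-15T01:00:00+00:00"}
t0 = datetime(2010, 11, 15, 1, 0, tzinfo=timezone.utc)
first_try = may_send(doc, "instance_a", t0)
second_try = may_send(doc, "instance_b", t0 + TAKEOVER_DELAY)
```

This only decides who is allowed to try.  It does nothing about two
instances that cannot see each other's updates, which is the problem
raised in [A].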

A) This only works assuming that all of your cloud couchdb
instances are replicating to each other correctly at the moment.  If
they aren't, no instance sees that another has already taken the email,
and now you have N > 1 emails sent out.  (And imagine the action is
something more important than sending an email, where doing it more
than once really matters.)  To keep this from happening you need a
couchdb instance heartbeat (maybe have an app update a document that
describes that instance's "registration" in the system with the current
timestamp every 60 seconds) and a STONITH system to kill any
instances of couchdb that stop updating their document.

B) Do you still need [A]?  Maybe it's good enough to know that the test
email didn't get back to you, but the original instance may still be
sending emails to other places, so it seems [A] is still needed.  Now
you also need a service registration system (to make sure this and
other services like it are only running on one instance).
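
The heartbeat from [A] could look something like the sketch below
(Python, with plain dicts standing in for the registration documents).
The document shape, the "registration:" id prefix, and the rule that an
instance is stale after three missed 60 second beats are assumptions
for illustration.  The actual STONITH step (killing the instance) is
left out; this only finds the candidates.

```python
import time

HEARTBEAT_INTERVAL = 60               # seconds, as suggested in [A]
STALE_AFTER = 3 * HEARTBEAT_INTERVAL  # assumed threshold: three missed beats


def heartbeat_doc(instance, now=None):
    """Build the registration document an instance would rewrite every interval."""
    return {
        "_id": "registration:" + instance,  # hypothetical id scheme
        "type": "registration",
        "instance": instance,
        "last_seen": time.time() if now is None else now,
    }


def stale_instances(registrations, now):
    """Return the names of instances whose heartbeat is older than STALE_AFTER."""
    return sorted(
        d["instance"] for d in registrations if now - d["last_seen"] > STALE_AFTER
    )


now = 1_000_000.0
registrations = [
    heartbeat_doc("instance_a", now - 30),   # beat 30 seconds ago
    heartbeat_doc("instance_b", now - 400),  # silent for over 6 minutes
    heartbeat_doc("instance_c", now - 180),  # exactly at the threshold
]
to_fence = stale_instances(registrations, now)
```

Note that the check itself depends on replication working: an instance
that is cut off from the others will see everyone else as stale, which
is why the kill step has to actually stop the other side rather than
just mark it dead.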

So these are some of the ideas that I'm coming up with on this issue.
I'm looking for more input.  What would you do?
