couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: why erlang?
Date Tue, 17 Aug 2010 04:43:23 GMT
On Mon, Aug 16, 2010 at 1:54 PM, Miles Fidelman
<mfidelman@meetinghouse.net> wrote:
> Hi Folks,
>
> I wonder if someone might share some insight into why Erlang was chosen for
> CouchDB.
>
> Don't get me wrong, I think Erlang is a really cool language/environment;
> I'm a big fan of designs that spawn lots of independent processes, and
> communicating via messages.  But... it doesn't seem like CouchDB takes
> advantage of all that much of Erlang's unique capabilities.
>
> Hence, I'm sort of wondering why Erlang for CouchDB, and if there are any
> visions of taking more advantage of Erlang down the road.
>
> Thanks,
>
> Miles Fidelman
>
> --
> In theory, there is no difference between theory and practice.
> In<fnord>  practice, there is.   .... Yogi Berra
>
>
>

Miles,

Firstly I'd like to reemphasize that CouchDB does use Erlang in very
Erlangy ways. There's quite a bit more to the language than just
message passing, though in the end this thread has seemed to focus on
the use of message passing (or rather, lack thereof) with regard to
the replication protocol.

I can't speak for Damien on why exactly he decided to use HTTP for the
replicator, but I can say that if I were going to design it from
scratch I would probably make very similar choices. Partly for the
reasons others have given, that it's ubiquitous and does very well
with firewall traversal, but those aren't the main points by a long
shot.

The biggest thing that an HTTP replicator has going for it is its
simplicity. The entire protocol can be summed up in as little as "open
an HTTP connection, stream documents edited after the last
replication." Even with that simple idea there's a very large amount
of engineering that has gone into it. We have to take into account
Erlang's memory model, exponential back off when links go wonky,
resumption when they come back, tracking replication histories,
filtered replication, continuous replication, authentication, etc. And
those are just the points I know from listening to the discussion. I
bet Adam Kocoloski and Filipe Manana could go on for hours on the
details I just glossed over.
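
To make the "stream documents edited after the last replication" idea
and the exponential back off concrete, here is a minimal sketch in
Python. It is not how the CouchDB replicator is implemented: the
source and target are plain in-memory lists standing in for the HTTP
stream, and every name in it (backoff_delays, changes_since,
replicate) is hypothetical and made up for illustration.

```python
import itertools


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield retry delays that grow exponentially, clamped at `cap` seconds.

    The default values are illustrative assumptions, not CouchDB's.
    """
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def changes_since(source, last_seq):
    """Return the (seq, doc_id) edits recorded after `last_seq`.

    `source` is an assumed stand-in for a database's edit history:
    a list of (sequence number, document id) pairs in ascending order.
    """
    return [(seq, doc_id) for seq, doc_id in source if seq > last_seq]


def replicate(source, target, checkpoint):
    """Copy edits newer than `checkpoint` into `target`.

    Returns the new checkpoint so a later run can resume from there.
    """
    for seq, doc_id in changes_since(source, checkpoint):
        target.append(doc_id)
        checkpoint = seq
    return checkpoint


if __name__ == "__main__":
    source = [(1, "a"), (2, "b"), (3, "c")]
    target = []
    print(replicate(source, target, checkpoint=1), target)
    print(list(itertools.islice(backoff_delays(), 8)))
```

A real replicator would wrap the copy step in a retry loop that sleeps
for the next delay whenever the link drops, and would persist the
checkpoint so that replication can resume where it left off.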

Switching the replicator to a more advanced protocol I think isn't
really in the cards for the problem that the current replication
scheme is meant to solve. I think that implementing a solution that
uses P2P/UUCP/multicast discovery would be an awesome feature, but not
something I would see going into the 'core' CouchDB distribution until
someone steps up with a long term commitment to supporting it.

Also of interest is that once you get to the scale of hundreds or
thousands of nodes you're probably not going to want to use Erlang's
native message passing. Either you're going to be in a datacenter,
which means you'll want fine-grained control over network
utilization, or you're going to be distributed, in which case
epmd/messages will have the usual firewall/NAT issues. Some other
interesting points are mentioned in a recent thread [1] on
erlang-questions.

Whether the replicator breaks HTTP is rather more of a philosophical
debate best left for when I've had a few beers. I don't discount your
points that SOAP/XML-RPC suck hard, but I don't think they have any
bearing on the replication protocol given how it's implemented.

HTH,
Paul Davis

[1] http://www.erlang.org/cgi-bin/ezmlm-cgi?4:msp:52886:ecobpklllbhjdniiklhn
