couchdb-dev mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: CouchDB and Kubernetes
Date Thu, 07 Jul 2016 16:01:48 GMT
Kubernetes 1.3 adds a new concept called “PetSets” (yes, as in “Cattle vs. Pets”) geared
towards our use case. Documentation is here:

https://github.com/kubernetes/kubernetes.github.io/blob/master/docs/user-guide/petset.md
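
The appeal of PetSets for this thread is the stable, ordinal-based network identity each Pod gets. A minimal sketch of how those stable hostnames could back Erlang node names instead of Pod IPs (the PetSet/Service name "couchdb" and the DNS pattern here are assumptions based on the PetSet docs, not something CouchDB does today):

```python
# Sketch only: assumes a PetSet named "couchdb" fronted by a headless
# Service of the same name, which gives pod N the stable DNS name
# couchdb-N.couchdb.<namespace>.svc.cluster.local.

def erlang_node_name(ordinal, petset="couchdb", namespace="default"):
    """Build a stable Erlang -name value for pod number `ordinal`."""
    host = "%s-%d.%s.%s.svc.cluster.local" % (petset, ordinal, petset, namespace)
    return "couchdb@" + host

print(erlang_node_name(0))
# couchdb@couchdb-0.couchdb.default.svc.cluster.local
```

Because the name survives Pod rescheduling, a replacement Pod can rejoin the cluster under the same Erlang node name, which sidesteps the remapping problem discussed below.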

Adam

> On May 3, 2016, at 6:09 PM, Adam Kocoloski <kocolosk@apache.org> wrote:
> 
> :)
> 
> 2.0 will maintain a list of which database shards are hosted on which cluster
> nodes in the _dbs database. The trouble is that there’s a 1:1 fixed
> correspondence between an Erlang node and a CouchDB cluster node; i.e. there’s
> no way to remap a CouchDB cluster node to a new Erlang node. In a world where
> an Erlang node is identified by an IP address controlled by the cloud service
> provider or container framework, this results in a fairly brittle setup. If we
> allow for an abstract notion of a CouchDB cluster node that can be remapped to
> different Erlang nodes, we can go far here.
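
A sketch of the coupling being described (document shape loosely modeled on the 2.0 _dbs shard-map layout; the node names and ranges are illustrative, not taken from a real cluster):

```python
# Illustrative only: in the 2.0 shard map, ownership is keyed directly
# by Erlang node name, so pointing a shard at a new Erlang node means
# rewriting every entry of the shard map that mentions the old one.

shard_map = {
    "_id": "mydb",
    "by_node": {
        "couchdb@10.0.0.5": ["00000000-7fffffff", "80000000-ffffffff"],
    },
    "by_range": {
        "00000000-7fffffff": ["couchdb@10.0.0.5"],
        "80000000-ffffffff": ["couchdb@10.0.0.5"],
    },
}

old, new = "couchdb@10.0.0.5", "couchdb@10.0.0.9"

# Rename the by_node key...
shard_map["by_node"][new] = shard_map["by_node"].pop(old)
# ...and rewrite every by_range entry that references the old node.
for rng, nodes in shard_map["by_range"].items():
    shard_map["by_range"][rng] = [new if n == old else n for n in nodes]
```

With an abstract CouchDB node id in between, only a single id-to-Erlang-node mapping would change, and the shard map itself would stay untouched.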
> 
> As an aside, there are a ton of subtleties that we’ve uncovered over the years
> when it comes to relocating shard files around a cluster. These days CouchDB
> is smart enough to know when a file has been moved, what node it was moved
> from, and what fraction of the anti-entropy internal replication can be reused
> to sync up the file in its new location. Pretty interesting stuff, and it’ll
> certainly be something we need to keep in mind if we pursue the aforementioned
> work. Cheers,
> 
> Adam
> 
>> On May 3, 2016, at 5:17 PM, Michael Fair <michael@daclubhouse.net> wrote:
>> 
>> I think separating the database id, be it a shard id or the entire db,
>> apart from the execution node/context where that database lives, so that
>> the databases themselves can be mobile (or even duplicated) across multiple
>> execution nodes makes perfect architectural sense to me.  Keeping a _peers
>> database which lists which databases are at which nodes makes sense to me.
>> 
>> It seems like each "database" being its own thing separate and apart from
>> the node it executes on is a cleaner model all around.
>> 
>> Great idea!
>> 
>> Mike
>> On Apr 29, 2016 7:55 PM, "Adam Kocoloski" <kocolosk@apache.org> wrote:
>> 
>>> Hi all,
>>> 
>>> I’ve been doing a bit of poking around the container orchestration space lately
>>> and looking at how we might best deploy a CouchDB 2.0 cluster in a
>>> container environment. In general I’ve been pretty impressed with the
>>> design point of the Kubernetes project, and I wanted to see how hard it
>>> would be to put together a proof of concept.
>>> 
>>> As a preamble, I needed to put together a container image for 2.0 that
>>> just runs a single Erlang VM instead of the container-local “dev cluster”.
>>> You can find that work here:
>>> 
>>> https://github.com/klaemo/docker-couchdb/pull/52
>>> 
>>> So far, so good - now for Kubernetes itself. My goal was to figure out how
>>> to deploy a collection of “Pods” that could discover one another and
>>> self-assemble into a cluster. Kubernetes differs from the traditional
>>> Docker network model in that every Pod gets an IP address that is routable
>>> from all other Pods in the cluster. As a result there’s no need for some of
>>> the port gymnastics that one might encounter with other Docker environments
>>> - each CouchDB pod can listen on 5984, 4369 and whatever distribution port
>>> you like on its own IP.
>>> 
>>> What you don’t get with Pods is a hostname that’s discoverable from other
>>> Pods in the cluster. A “Service” (a replicated, load-balanced collection of
>>> Pods) can optionally have a DNS name, but the Pods themselves do not. This
>>> throws a wrench in the most common distributed Erlang setup, where each
>>> node gets a name like “couchdb@FQDN” and the FQDNs are resolvable to IP
>>> addresses via DNS.
>>> 
>>> It is certainly possible to specify an Erlang node name like
>>> “couchdb@12.34.56.78”, but we need to be a bit careful here. CouchDB is
>>> currently forcing the Erlang node name to do “double-duty”: it’s both the
>>> way that the nodes in a cluster figure out how to route traffic to one
>>> another and the identifier nodes use to claim ownership over individual
>>> replicas of database shards in the shard map.
>>> Speaking from experience it’s often quite useful operationally to remap a
>>> given Erlang node name to a new server and have the new server be
>>> automatically populated with the replicas it’s supposed to own. If we use
>>> the Pod IP in Kubernetes for the node name we won’t have that luxury.
>>> 
>>> I think the best path forward here would be to extend the “Node” concept
>>> in a CouchDB cluster so that it has an identifier which is allowed to be
>>> distinct from the Erlang node name. The “CouchDB Node” is the one that owns
>>> database shard replicas, and it can be remapped to different distributed
>>> Erlang nodes over time via modification of an attribute in the _nodes DB.
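
A hypothetical sketch of what such an indirection could look like (the `erlang_node` attribute and the stable node id are invented for illustration; today the _nodes document _id *is* the Erlang node name):

```python
# Hypothetical: the shard map is keyed by a stable CouchDB node id, and
# a separate mutable attribute records the Erlang node currently hosting
# it. Remapping is then a single attribute update; ownership never moves.

shard_map = {"_id": "mydb",
             "by_node": {"couchdb-node-1": ["00000000-ffffffff"]}}

node_doc = {"_id": "couchdb-node-1",
            "erlang_node": "couchdb@10.244.1.7"}   # e.g. the old pod IP

def remap(node_doc, new_erlang_node):
    """Point the abstract CouchDB node at a replacement Erlang node."""
    updated = dict(node_doc)
    updated["erlang_node"] = new_erlang_node
    return updated

node_doc = remap(node_doc, "couchdb@10.244.2.3")   # new pod IP
print(node_doc["erlang_node"])   # couchdb@10.244.2.3
print(shard_map["by_node"])      # shard ownership is untouched
```

Under this scheme a rescheduled Pod with a fresh IP only needs its _nodes entry updated, and the anti-entropy machinery can repopulate it with the replicas its CouchDB node id owns.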
>>> 
>>> Hope you all found this useful — I’m quite interested in finding ways to
>>> make it easier for users to acquire a highly-available cluster configured
>>> in the “right way”, and I think projects like Kubernetes have a lot of
>>> promise in this regard. Cheers,
>>> 
>>> Adam
> 

