couchdb-dev mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: overview of code organization?
Date Fri, 23 Apr 2010 14:46:58 GMT
On Apr 23, 2010, at 10:12 AM, Miles Fidelman wrote:

> Adam,
> 
> Adam Kocoloski wrote:
>> On Apr 23, 2010, at 8:52 AM, Miles Fidelman wrote:
>>   
>>> - notes on the replication process (step-by-step, what happens when replication
is invoked - what code modules are involved and so forth), and/or,
>>>     
>> couch_rep_* modules handle replication.  How familiar are you with Erlang/OTP?  couch_rep_sup
is a supervisor for all replications, each of which has a couch_rep gen_server and changes_feed,
missing_revs, reader, and writer processes.  Each of those processes handles one part of the
"conversation" on the slide I pointed out to you two days ago.  Data flows from changes_feed
->  missing_revs ->  reader ->  writer.
>>   
> 
> Pretty familiar with Erlang at a conceptual/system level; starting to take the time to
get fluent in programming.  Haven't done functional languages in a long time.
> 
>>> - an overview of the code for someone new to the project - what lives in what
modules, how they string together - anything that might shortcut having to read through every
module and make sense of things from scratch
>>> 
>>> Anything - handwritten notes, slides from a code walkthrough, that kind of thing.
>>>     
>> Hi Miles, not to sound critical, but I don't think such a broad request will get
you very far.  If you have specific questions I'll be happy to answer them.
>>   
> With all due respect... lots of projects maintain documentation of internals, particularly
efforts focused on platform technologies intended for long-term and broad-based application.
 Certainly in the world of commercial software development it's the rare project that doesn't
have documentation providing a high level view of a large software system -- it's pretty hard
to either bring new team members on board, or to perform long-term maintenance of code.  Granted
that it's a bit harder to maintain this level of documentation on open-source projects without
steady funding, but I will point at some examples:
> - Linux Kernel Internals: somewhat old (2.4), but http://tldp.org/LDP/lki/index.html
(I know there are updates)
> - Apache HTTPD: http://httpd.apache.org/docs/2.2/developer/
> - MongoDB, documentation of replication internals: http://www.mongodb.org/display/DOCS/Replication+Internals
> - or even http://wiki.github.com/erlang/otp/routemap-source-tree - providing a basic
overview of Erlang's internals
> 
>> Please, take a shot at reading the code for the part you're interested in.  If you
come across something you don't understand, send an email or join #couchdb on IRC.  Many of
the devs hang out there regularly and can walk you through the code.  Best,
>>   
> 
> It doesn't seem that unreasonable to at least ask whether Couch has some similar documentation
floating around - if only at the level of notes put together by an individual developer, or
for discussion among developers.
> 
> Couch is certainly aiming at long-term viability as a platform for broad-based use, and
seems to be aiming at being a broad-based open-source effort.  To succeed over the long term,
it will NEED to have a good set of developer-level documentation.  "Read the code" is not
a long-term solution.
> 
> Re. replication, in specific, the couch_rep_* modules do not contain much in the
way of comments.
> 
> Personally, I've been involved in a LOT of network protocol-related work (BBN, back to
the ARPANET days).  I've yet to see any kind of protocol work where someone hasn't jotted
down at least a sequence diagram and some kind of dataflow diagram showing how all the pieces
fit together.  More common is a full-blown ASN.1 description, and eventually an RFC in full
gory detail.
> 
> It does not seem unreasonable to ask if someone has jotted down notes about the full
set of steps executed, and code modules involved, when Couch receives a "POST /_replicate"
transaction.
> 
> At the very least, it sure would be helpful to have something like:
> http://httpd.apache.org/docs/2.2/developer/request.html, or
> http://www.apachetutor.org/dev/request
> to detail the sequence of events and code involved in request processing.
> 
> If, in fact, that kind of information has never been put on "paper," and lives only in
the source code and a few people's heads, that scares me a lot vis-a-vis committing to Couch
as a platform for any kind of serious project.
> 
> Miles Fidelman

Hi Miles, I wasn't calling your request unreasonable, and I wasn't vouching for reading the
code as the optimal source of developer documentation.  But it is what we have right now when
you want to learn about things at module-level granularity.

In terms of broader architectural overviews, you may find Ricky Ho's set of articles useful:

http://horicky.blogspot.com/2008/10/couchdb-implementation.html

Regards, Adam
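
For readers following the thread, the data flow quoted above (changes_feed -> missing_revs -> reader -> writer) can be sketched roughly as a chain of stages. This is only an illustration in Python, not the couch_rep_* code: the real stages are separate Erlang processes under couch_rep_sup, whereas here they are plain generators run in sequence. The in-memory `Db` dictionaries, the function signatures, and the `replicate` helper are all assumptions made for the example.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical stand-in for a database: doc id -> {revision id -> body}.
Db = Dict[str, Dict[str, dict]]


def changes_feed(source: Db) -> Iterator[Tuple[str, List[str]]]:
    """Stage 1: emit (doc_id, revision ids) for each document in the source."""
    for doc_id, revs in source.items():
        yield doc_id, list(revs)


def missing_revs(changes, target: Db) -> Iterator[Tuple[str, List[str]]]:
    """Stage 2: keep only the revisions the target does not already hold."""
    for doc_id, revs in changes:
        known = target.get(doc_id, {})
        missing = [rev for rev in revs if rev not in known]
        if missing:
            yield doc_id, missing


def reader(missing, source: Db) -> Iterator[Tuple[str, str, dict]]:
    """Stage 3: fetch the body of every missing revision from the source."""
    for doc_id, revs in missing:
        for rev in revs:
            yield doc_id, rev, source[doc_id][rev]


def writer(docs, target: Db) -> int:
    """Stage 4: store the fetched revisions in the target; return the count."""
    written = 0
    for doc_id, rev, body in docs:
        target.setdefault(doc_id, {})[rev] = body
        written += 1
    return written


def replicate(source: Db, target: Db) -> int:
    """Chain the four stages in the order described in the thread."""
    return writer(reader(missing_revs(changes_feed(source), target), source), target)


source = {"a": {"1-x": {"v": 1}, "2-y": {"v": 2}}, "b": {"1-z": {"v": 3}}}
target = {"a": {"1-x": {"v": 1}}}
first_pass = replicate(source, target)   # copies only what the target lacks
second_pass = replicate(source, target)  # nothing left to copy
```

Each stage consumes the output of the previous one, so a second run against an up-to-date target filters everything out at the missing_revs stage and the later stages do no work.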

