couchdb-dev mailing list archives

From Damien Katz <>
Subject The 1.0 Thread
Date Thu, 18 Jun 2009 21:34:31 GMT
Okay, time to ask the question: what features do we need to get to 1.0?

I'm going to list my must haves, and my nice to haves.

Must have:
- Document integrity checking: Using some sort of hashing scheme for  
end to end integrity checking of documents and attachments. Reusing  
the revision ID as the hash of the document might work, and has the  
benefit of allowing writing the same changes to 2 different servers  
and not causing a conflict. Also multiple clients can write the same  
change to a document and not get unnecessary conflicts.
- Reader/Writer access databases and servers: Allow/disallow  
anonymous, users, groups.
- Continuous replication: Keeping a constant connection and being able  
to replicate changes as soon as they happen.
- Better testing: We really need some performance and stress testing
as part of the source. And we need much better code coverage in
general with the testing.
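
To illustrate the revision-ID-as-hash idea from the first item, here is a minimal Python sketch. It is not CouchDB's implementation: the function name `content_rev`, the `generation-digest` layout, the choice of SHA-256, and the canonical JSON encoding are all assumptions made for the example. The point is only that when the ID is derived from the parent revision plus the new content, two servers (or two clients) that receive the same change compute the same ID, so there is nothing to conflict on.

```python
import hashlib
import json


def content_rev(generation, parent_rev, doc_body):
    """Hypothetical helper: derive a revision ID from the document content.

    generation -- position of this revision in the document's history
    parent_rev -- revision ID this edit is based on ("" for a new document)
    doc_body   -- the new document content as a JSON-serializable dict
    """
    # Serialize canonically (sorted keys, fixed separators) so that the
    # same logical content always produces the same bytes.
    canonical = json.dumps(doc_body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((parent_rev + canonical).encode("utf-8")).hexdigest()
    return f"{generation}-{digest}"


def verify(rev, parent_rev, doc_body):
    """End-to-end check: does the stored body still match its revision ID?"""
    generation = int(rev.split("-", 1)[0])
    return rev == content_rev(generation, parent_rev, doc_body)


# The same change written to two different servers yields the same ID.
rev_on_server_a = content_rev(2, "1-abc", {"title": "hello", "views": 1})
rev_on_server_b = content_rev(2, "1-abc", {"views": 1, "title": "hello"})
```

Because the ID doubles as a checksum, `verify` can be run by any reader of the document, which is what makes the check end to end.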

Nice to have:
- Hashing/CRC everything written to disk, data, metadata, index  
structures, etc. But optional, since many filesystems actively  
integrity-check disk data.
- Better full text integration: Out of the box integration and the
ability to intersect results with views, for easier result formatting.
Lucene would be the primary FT engine, but we make it pluggable, much
like the view engines are.
- Attachment level replication: By tracking the revision when an  
attachment was modified, the replicator can avoid copying unchanged  
attachments to the target. The same can apply to json fields, but it's  
much less of a win there.
- Partitioning/sharding support: Ideally it would be nice to have
something that "just works" without a lot of setup.
- Built-in authentication: A plug-in that authenticates HTTP users and
assigns them roles. It would use a couch database as a directory that
contains user documents, etc.
- Selective replication: The ability to replicate a subset of  
documents, using a javascript function as a selector.
- Server side doc processing: The ability to POST data and have
arbitrary server-side processing. The simplest case is posting a
document to a Js handler that can do some data cleanup and add default
values to the document before saving it. But ideally it would be able
to interact with the full database.
- Scheduled replication: The ability to schedule replication every so  
often, like a cron job. But this can be done with an actual cron job  
and CURL, so it's not critical to have it built-in.
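
As a rough illustration of the selective replication item above, the selector can be thought of as a predicate applied to each candidate document. The sketch below is in Python rather than JavaScript, and every name in it (`select_for_replication`, `is_published_post`, the `type` and `published` fields) is hypothetical; it shows the shape of the idea, not a proposed API.

```python
def select_for_replication(docs, selector):
    """Return only the documents that the selector function accepts."""
    return [doc for doc in docs if selector(doc)]


def is_published_post(doc):
    # Hypothetical selector: replicate only published blog posts.
    return doc.get("type") == "post" and doc.get("published", False)


source_docs = [
    {"_id": "a", "type": "post", "published": True},
    {"_id": "b", "type": "post", "published": False},
    {"_id": "c", "type": "comment"},
]

# Only document "a" passes the selector and would be sent to the target.
to_replicate = select_for_replication(source_docs, is_published_post)
```

Keeping the selector a pure function of the document means the replicator can call it once per changed document without needing any other state.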

There are probably a bunch of things I forgot about.

Respond to this with your must haves and nice to haves. No promises
you'll get your way (no guarantee for me for that matter), but let's
start talking about it.

And anyone who wants to take on any of these issues: mine, yours or  
anyone else's, just do it. Read code, mail dev@ with questions and  
advice, write some code, repeat.

