couchdb-dev mailing list archives

From Barry Wark <>
Subject Re: The 1.0 Thread
Date Fri, 19 Jun 2009 20:42:21 GMT
On Thu, Jun 18, 2009 at 2:34 PM, Damien Katz<> wrote:
> Okay, time to ask the question, what features do we need to get to 1.0?
> I'm going to list my must haves, and my nice to haves.
> Must have:
> - Document integrity checking: Using some sort of hashing scheme for end to
> end integrity checking of documents and attachments. Reusing the revision ID
> as the hash of the document might work, and has the benefit of allowing
> writing the same changes to 2 different servers and not causing a conflict.
> Also multiple clients can write the same change to a document and not get
> unnecessary conflicts.
> - Reader/Writer access databases and servers: Allow/disallow anonymous,
> users, groups.
> - Continuous replication: Keeping a constant connection and being able to
> replicate changes as soon as they happen.
> - Better testing: We really need some performance and stress testing as part
> of the source. And we need much better code coverage in general with the
> testing.
> Nice to have:
> - Hashing/CRC everything written to disk, data, metadata, index structures,
> etc. But optional, since many filesystems actively integrity-check disk
> data.
> - Better full text integration: Out of the box integration and the ability
> to intersect results with views, for easier result formatting. Lucene would
> be the primary FT engine, but we make it pluggable, much like the view
> engines are.

If I may, I would like to put a +1 on the ability to intersect (or
union) multiple view results. This is the feature that's preventing
whole-hearted adoption of CouchDB for several applications at my
company. Lucene is close to a solution, but we really need proper
numeric comparisons, not just text comparisons. I don't think we can
commit any resources to making a patch happen on this front until
January or February 2010. At that time, we would be willing to help
make this happen, as we're quite excited about CouchDB but are being
held up by the lack of boolean view combinations (the data sets in
question are too large to handle the logic client-side).
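
To make the request concrete, here is a minimal sketch of the semantics I
mean, written in Python against in-memory rows. It assumes view rows shaped
like CouchDB's {"id", "key", "value"} objects; the function names, the sample
documents and the key values are all hypothetical, and at our scale this
logic would have to run server-side rather than in the client as it does
here.

```python
from typing import Iterable, Mapping, Set


def doc_ids(rows: Iterable[Mapping]) -> Set[str]:
    """Collect document ids from view rows shaped like {"id", "key", "value"}."""
    return {row["id"] for row in rows}


def intersect_views(*views: Iterable[Mapping]) -> Set[str]:
    """Ids of documents that appear in every view result (boolean AND)."""
    id_sets = [doc_ids(view) for view in views]
    return set.intersection(*id_sets) if id_sets else set()


def union_views(*views: Iterable[Mapping]) -> Set[str]:
    """Ids of documents that appear in at least one view result (boolean OR)."""
    return set().union(*(doc_ids(view) for view in views))


# Hypothetical rows from two views over the same database.
by_temperature = [
    {"id": "run-1", "key": 21.5, "value": None},
    {"id": "run-2", "key": 9.0, "value": None},
    {"id": "run-3", "key": 100.0, "value": None},
]
by_operator = [
    {"id": "run-1", "key": "alice", "value": None},
    {"id": "run-2", "key": "alice", "value": None},
    {"id": "run-4", "key": "bob", "value": None},
]

# Numeric range on the keys: 9.0 < 21.5 < 100.0 as numbers, whereas the
# strings "100.0" < "21.5" < "9.0" sort the other way round as text.
warm = [row for row in by_temperature if 10.0 <= row["key"] <= 150.0]
by_alice = [row for row in by_operator if row["key"] == "alice"]

warm_and_alice = intersect_views(warm, by_alice)
warm_or_alice = union_views(warm, by_alice)
```

The numeric range filter is the part that text-only comparison gets wrong,
which is why Lucene alone does not cover our case.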


> - Attachment level replication: By tracking the revision when an attachment
> was modified, the replicator can avoid copying unchanged attachments to the
> target. The same can apply to json fields, but it's much less of a win
> there.
> - Partitioning/sharding support: Ideally would be nice to have something
> that "just works" without a lot of setup.
> - Built-in authentication: A plug-in that authenticates HTTP users and
> assigns them roles. It would use a couch database as a directory that
> contains user documents, etc.
> - Selective replication: The ability to replicate a subset of documents,
> using a javascript function as a selector.
> - Server side doc processing: The ability to POST data and have arbitrary
> server-side processing. The simplest case is posting a document to a Js
> handler that can do some data cleanup and add default values to the
> document before saving it. But ideally would be able to interact with the
> full database.
> - Scheduled replication: The ability to schedule replication every so often,
> like a cron job. But this can be done with an actual cron job and CURL, so
> it's not critical to have it built-in.
> There are probably a bunch of things I forgot about.
> Respond to this with your must haves and nice to haves. No promises you'll
> get your way (no guarantee for me for that matter), but let's start talking
> about it.
> And anyone who wants to take on any of these issues: mine, yours or anyone
> else's, just do it. Read code, mail dev@ with questions and advice, write
> some code, repeat.
> -Damien
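
On the document integrity item in the quoted list, the idea of reusing the
revision ID as a hash of the document can be illustrated with a small Python
sketch. Everything here is an assumption made for illustration: the
"generation-digest" layout, the choice of SHA-256, hashing a key-sorted JSON
encoding, and the function names. It is not a description of how CouchDB
computes revisions.

```python
import hashlib
import json


def content_rev(doc: dict, generation: int = 1) -> str:
    """Derive a revision id from the document body (hypothetical scheme)."""
    # Leave out the existing revision so the hash depends only on content.
    body = {k: v for k, v in doc.items() if k != "_rev"}
    # Sort keys and fix separators so equal documents encode identically.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{generation}-{digest}"


def rev_matches(doc: dict) -> bool:
    """End-to-end check: does the stored _rev agree with the content?"""
    generation = int(doc["_rev"].split("-", 1)[0])
    return content_rev(doc, generation) == doc["_rev"]


# Two clients (or two servers) writing the same change independently.
write_a = {"_id": "invoice-7", "total": 120, "paid": True}
write_b = {"paid": True, "total": 120, "_id": "invoice-7"}

same_rev = content_rev(write_a) == content_rev(write_b)

stored = dict(write_a, _rev=content_rev(write_a))
intact = rev_matches(stored)

corrupted = dict(stored, total=121)
detected = not rev_matches(corrupted)
```

Because both writes produce the same revision ID, they would not need to be
treated as a conflict, and a mismatch between the stored ID and the
recomputed one signals that the document changed in transit or on disk.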
