Date: Thu, 16 Apr 2009 09:21:59 -0700
Subject: Tracking view updates with CouchDB
From: Adam Wolff
To: user@couchdb.apache.org

Hi list,

In our app, we are building a notification system. In the end, we need a view sorted by a key and then by update time. The update time is a function of a number of different potential app operations -- including updates or commits of documents that refer to the id of the doc that we are keying.

So we've got docs that look like this:

    { type: "key_doc", _id : "xyzzy", key : "doc_key", updated : 1239896906303 }

and docs that look like this:

    { type: "ref_doc", key_doc_id : "xyzzy", updated : 1239897055080 }

and we want a view that we query like this:

    startkey : ["doc_key",null], endkey : ["doc_key",{}]

which reduces to the key_doc ids that were updated since the startkey date, in order, e.g. ["xyzzy"] in this case.

One thing we could obviously do is to update the key_doc every time we write a ref_doc, but we're concerned about contention for the key_doc. We've settled on the idea of writing an extra doc every time we write a ref_doc, which copies the key information from the key_doc, like this:

    { type: "notify_doc", key_doc_id : "xyzzy", updated : 1239897055080 }

Right now, we're POSTing this doc and relying on reduce to sort out the most recent one. One problem with this approach is that we get invalid keys in our view output, because we are copying the keys from the key_doc into the notify_doc.
In our app, we end up needing to check that the key in the key_doc still matches the key reported by our view. I'm also a little worried about write/delete churn with this approach.

After writing this whole thing, I wonder if the notify business is a premature optimization. We could try just PUTting the key_doc every time; if the operation fails, *by definition* there's a more recent update.

I'd love any thoughts from the list on this. Another option we have is to punt and handle this all in our app -- the db just reduces a list of key_docs by updated time and the app groups the docs by key.

Thanks in advance,
Adam
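
The "just PUT the key_doc every time" idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested CouchDB client: FakeDb is a hypothetical in-memory stand-in that mimics a single real behaviour (a PUT whose _rev does not match the stored revision is rejected with a 409 conflict), revisions are simplified to integers, and touchKeyDoc is a made-up helper name.

```javascript
// Hypothetical in-memory stand-in for a CouchDB database (illustration only).
// Revisions are plain integers here; real CouchDB revisions are strings.
class FakeDb {
  constructor() {
    this.docs = new Map();
  }

  // Return a copy of the stored doc, or null if it does not exist.
  get(id) {
    const doc = this.docs.get(id);
    return doc ? { ...doc } : null;
  }

  // Store the doc, rejecting the write with a 409 if its _rev is out of date.
  put(doc) {
    const current = this.docs.get(doc._id);
    if (current && current._rev !== doc._rev) {
      const err = new Error("conflict");
      err.status = 409;
      throw err;
    }
    const nextRev = (current ? current._rev : 0) + 1;
    this.docs.set(doc._id, { ...doc, _rev: nextRev });
    return nextRev;
  }
}

// Try to bump key_doc.updated using a previously read copy of the doc.
// Returns "written", "stale" (our timestamp is not newer than the copy's),
// or "superseded" (another write landed first, so we give up).
function touchKeyDoc(db, readCopy, updated) {
  if (readCopy.updated >= updated) {
    return "stale";
  }
  try {
    db.put({ ...readCopy, updated });
    return "written";
  } catch (err) {
    if (err.status === 409) {
      return "superseded";
    }
    throw err;
  }
}
```

One caveat with this sketch: a conflict only says that another write landed first. It does not guarantee that the winning write carries a larger updated value, so a caller that cares about that case would presumably re-read the key_doc and retry.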