From: Antony Blakey
To: couchdb-dev@incubator.apache.org
Subject: History of deletion, and the interaction with compactions
Date: Mon, 22 Dec 2008 14:44:53 +1030

I've just done some testing and noticed that if I delete documents and then run a compaction, the record of each document's deletion remains in the _all_docs_by_seq view. That record includes the document id.

This implies that the database will forever contain some details about deleted documents (the document id and sequence number). Hence the database size has a monotonically increasing lower bound, determined by the number of documents ever created and the size of their document ids.

Retaining these records removes an issue with _externals missing deletions, and it makes ad-hoc id-based replication possible, but has any thought been given to this issue for high-throughput use cases?

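To make the size argument concrete, here is a toy model of the behaviour described above. It is only a sketch under stated assumptions: ToyDatabase and all of its methods are hypothetical names invented for illustration, and nothing here reflects CouchDB's actual storage format or API. It assumes only what the test showed, namely that compaction discards superseded revisions but keeps the latest by-sequence entry for every id, including deleted ones.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDatabase:
    """Hypothetical toy model; not CouchDB's storage engine or API."""

    update_seq: int = 0
    # doc id -> (sequence number, deleted flag, body) for the latest revision
    by_id: dict = field(default_factory=dict)
    # superseded revisions, which compaction is allowed to discard
    old_revisions: list = field(default_factory=list)

    def _write(self, doc_id, deleted, body):
        # Every write gets a new sequence number and supersedes the old entry.
        if doc_id in self.by_id:
            self.old_revisions.append((doc_id, self.by_id[doc_id]))
        self.update_seq += 1
        self.by_id[doc_id] = (self.update_seq, deleted, body)

    def put(self, doc_id, body):
        self._write(doc_id, False, body)

    def delete(self, doc_id):
        # Assumption: a deletion is stored as a tombstone, not a removal.
        self._write(doc_id, True, None)

    def compact(self):
        # Assumption: compaction drops old revisions but keeps tombstones.
        self.old_revisions.clear()

    def all_docs_by_seq(self):
        # Latest entry per id, ordered by sequence number.
        return sorted(
            (seq, doc_id, deleted)
            for doc_id, (seq, deleted, _body) in self.by_id.items()
        )

    def retained_id_bytes(self):
        # Bytes of document ids that survive compaction: the lower bound.
        return sum(len(doc_id.encode("utf-8")) for doc_id in self.by_id)


def demo():
    db = ToyDatabase()
    for i in range(3):
        db.put(f"doc-{i}", {"n": i})
    for i in range(3):
        db.delete(f"doc-{i}")
    db.compact()
    return db.all_docs_by_seq(), db.retained_id_bytes()
```

In this model, calling demo() leaves three tombstone entries after compaction even though no live documents remain, and the retained id bytes grow with every document ever created, never shrinking.
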
Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A Buddhist walks up to a hot-dog stand and says, "Make me one with everything". He then pays the vendor and asks for change. The vendor says, "Change comes from within".