From: Paul Davis
Date: Thu, 23 Sep 2010 09:44:28 -0400
Subject: Re: question about how write_header works
To: dev@couchdb.apache.org

It's not necessarily appended each time data is written. There are
optimizations to batch as many writes to the database together as
possible, as well as delayed commits, which write the header out every
N seconds. Remember that *any* write to the database is going to look
like wasted space. Even document deletes make the database file grow
larger.

When a header is written, it contains checksums of its contents, and
when reading we check that nothing has changed. There's an fsync before
and after writing the header, which also helps to ensure that writes
succeed.

As to the header2 or header1 problem, if header2 appears to be
corrupted or is otherwise discarded, the header search just continues
through the file looking for the next valid header. In this case that
would mean that newData2 would not be considered valid data and would
be ignored.

HTH,
Paul Davis

On Wed, Sep 22, 2010 at 11:51 PM, chongqing xiao wrote:
> Hi, Adam:
>
> Thanks for the answer.
>
> If that is how it works, that seems create a lot of wasted space
> assuming a new header has to be appended each time new data is saved.
>
> Also, assuming here is the data layout
>
> newData1   ->start
> header1
> newData2
> header2      -> end
>
> If header 2 is partially written, I am assuming newData will also be
> discarded.
> If that is the case, I am assuming there is a special flag
> in header 1 so the code can skip newData2 and find header1?
>
> I am very interested in couchdb and I think it might be a very good
> choice for archiving relational data with some minor changes.
>
> Thanks
> Chong
>
> On Wed, Sep 22, 2010 at 10:36 PM, Adam Kocoloski wrote:
>> Hi Chong, that's exactly right.  Regards,
>>
>> Adam
>>
>> On Sep 22, 2010, at 10:18 PM, chongqing xiao wrote:
>>
>>> Hi,
>>>
>>> Could anyone explain how write_header (or header) in works in couchdb?
>>>
>>> When appending new header, I am assuming the new header will be
>>> appended to the end of the DB file and the old header will be kept
>>> around?
>>>
>>> If that is the case, what will happen if the header is partially
>>> written? I am assuming the code will loop back and find the previous
>>> old header and recover from there?
>>>
>>> Thanks
>>>
>>> Chong
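
The recovery behaviour discussed in this thread (checksummed headers, and a search that skips a damaged header to reach an older valid one) can be illustrated with a rough sketch. This is not CouchDB's on-disk format or code: the `MAGIC` marker, the CRC32 checksum, the length-prefixed framing, and both function names are assumptions made up for the example, and it assumes the search starts at the end of the file and works toward the beginning.

```python
import struct
import zlib

MAGIC = b"HDR1"  # hypothetical marker, not CouchDB's real format


def encode_header(payload: bytes) -> bytes:
    """Frame a header as MAGIC + payload length + CRC32 + payload."""
    return MAGIC + struct.pack(">II", len(payload), zlib.crc32(payload)) + payload


def find_last_valid_header(buf: bytes):
    """Scan from the end of buf toward the start.

    Returns (offset, payload) for the newest header whose checksum
    matches, or None if no valid header exists.
    """
    pos = buf.rfind(MAGIC)
    while pos != -1:
        start = pos + len(MAGIC)
        if start + 8 <= len(buf):
            length, crc = struct.unpack(">II", buf[start:start + 8])
            payload = buf[start + 8:start + 8 + length]
            if len(payload) == length and zlib.crc32(payload) == crc:
                return pos, payload
        # Truncated or corrupt header: keep looking for an older one.
        pos = buf.rfind(MAGIC, 0, pos)
    return None


# Layout from the thread: newData1, header1, newData2, header2.
log = b"newData1" + encode_header(b"root=1") + b"newData2"
torn = log + encode_header(b"root=2")[:-3]  # header2 only partially written

offset, payload = find_last_valid_header(torn)
print(payload)  # b'root=1'
```

With header2 torn, the search falls back to header1, so newData2 is never referenced by any valid header and is effectively ignored, which matches the outcome described in the reply. No special flag in header1 is needed in this sketch, because validity is decided by each header's own checksum.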