From: Dominic Tarr
To: user@couchdb.apache.org
Date: Thu, 17 Mar 2011 14:19:04 +1300
Subject: Re: Some CouchDB internals questions?

Do you know about the include_docs option on view queries?

If you emit(key, doc) in the view, the whole doc is saved in the view
index. If you emit only a small value instead, e.g. emit(key, null), and
query the view with include_docs=true, the view results still include the
whole doc, but it is fetched from the db at query time and not stored in
the view.

This will slow reads, but use less space. Which option is best depends on
your application.

It's mentioned here: http://wiki.apache.org/couchdb/HTTP_view_API

On Thu, Mar 17, 2011 at 5:53 AM, Zdravko Gligic wrote:

> WOW !
>
> So, how long might it take for this not only to become part of CouchDB
> core but then also to get implemented by all of the other CouchDB
> dialects such as CouchBase and BigCouch, etc.?
>
> And as dumb as it might sound ;) why was this not done (: the right
> way :) from the very beginning ;?)
>
> On Wed, Mar 16, 2011 at 10:02 AM, Filipe David Manana wrote:
> > Zdravko,
> >
> > Yesterday a performance related ticket was created:
> >
> > https://issues.apache.org/jira/browse/COUCHDB-1092
> >
> > Apart from the performance improvements, it also reduces very
> > significantly the database sizes (from 2 times less to about 10 times
> > less). So you might be interested to follow/read.
> >
> > On Tue, Mar 15, 2011 at 7:32 PM, Paul Davis wrote:
> >> On Tue, Mar 15, 2011 at 2:53 PM, Zdravko Gligic wrote:
> >>>> Have you compacted your db and views?
> >>>
> >>> Yes
> >>>
> >>>> There's unfortunately no direct way to calculate an upper threshold, it
> >>>> really depends on your method for inserting as well as how often you
> >>>> compact.
> >>>
> >>> Once both (docs and view) are compacted, is the resulting size at all
> >>> dependent on how the docs and/or views were created in the first place
> >>> (one at a time or in bulk or whatever)?
> >>>
> >>
> >> I think to get the absolute minimum post-compaction size you need to
> >> compact twice. I haven't done lots of extensive testing on this, but
> >> last I recall the basic logic was the first time can end up writing
> >> docs in a somewhat randomish ordering depending on how they were
> >> inserted.
> >>
> >>>> This is due to the tail append storage which will orphan data
> >>>> in the file as it writes new records to the various internal data
> >>>> structures.
> >>>
> >>> My 1,500 docs are taking up almost 15 meg (roughly 1/4-1k docs with 2
> >>> views + 1 view with doc re-emit) and I believe were around 50meg
> >>> before compactions.
> >>>
> >>
> >> More importantly, what was the datasize post-compaction though? If
> >> your main db is 15Meg, and you have a view that re-emits the doc, I'd
> >> expect you to have a total size of at least 30Meg. Depending on what
> >> you're emitting in the other two views, getting closer to that 50 isn't
> >> hugely out of the question.
> >>
> >
> > --
> > Filipe David Manana,
> > fdmanana@gmail.com, fdmanana@apache.org
> >
> > "Reasonable men adapt themselves to the world.
> > Unreasonable men adapt the world to themselves.
> > That's why all progress depends on unreasonable men."
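
To make the space/read trade-off from the reply at the top of this message concrete, here is a minimal sketch in plain JavaScript. It is a toy in-memory model, not CouchDB's actual storage or indexing code. The names buildView, queryView, mapWithDoc, mapKeyOnly and indexSize are hypothetical, emit is passed in as a parameter rather than being the global that CouchDB provides to map functions, and the JSON string length is only a rough stand-in for on-disk index size.

```javascript
// Toy model of a view index; an illustration only, NOT CouchDB internals.
// All function names below are hypothetical.

const db = new Map([
  ["a", { _id: "a", title: "first", body: "x".repeat(200) }],
  ["b", { _id: "b", title: "second", body: "y".repeat(200) }],
]);

// Option 1: emit the whole doc as the value, so a copy lands in the index.
function mapWithDoc(doc, emit) {
  emit(doc.title, doc);
}

// Option 2: emit only the key; the doc is looked up later at query time.
function mapKeyOnly(doc, emit) {
  emit(doc.title, null);
}

// Run a map function over every doc and collect rows sorted by key.
function buildView(database, mapFn) {
  const rows = [];
  for (const doc of database.values()) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  rows.sort((x, y) => (x.key < y.key ? -1 : x.key > y.key ? 1 : 0));
  return rows;
}

// Return the rows; with includeDocs, attach each doc by fetching it from
// the database (the extra read that makes this option slower).
function queryView(database, rows, { includeDocs = false } = {}) {
  return rows.map((row) =>
    includeDocs ? { ...row, doc: database.get(row.id) } : { ...row }
  );
}

// Rough stand-in for how much space an index takes.
const indexSize = (rows) => JSON.stringify(rows).length;

const fatIndex = buildView(db, mapWithDoc);
const thinIndex = buildView(db, mapKeyOnly);

console.log("index size with emit(key, doc): ", indexSize(fatIndex));
console.log("index size with emit(key, null):", indexSize(thinIndex));

// Both routes end up returning the full doc to the caller.
console.log(fatIndex[0].value.title);
console.log(queryView(db, thinIndex, { includeDocs: true })[0].doc.title);
```

Against a real server, the second variant corresponds to adding include_docs=true to the view query string, as described on the wiki page linked in the reply.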