From: Anthony Mills
To: couchdb-user@incubator.apache.org
Subject: Re: Views
Date: Sat, 26 Apr 2008 20:12:59 -0500
Message-Id: <869B592C-9298-4C1C-88F9-3A29ECC7FA0C@gascard.net>
In-Reply-To: <50BF4DF7-9BEE-4914-B357-DB88D0C320D4@apache.org>

Thank you everyone for answering my questions. Here is the way I
understand it.

The first time a view is run, it creates a key-value list from all
documents. Future calls to the view update the key-value list with
changed documents [added, deleted, updated]. If a startkey, endkey or
key is used, only those keys that match in the list are returned. If I
use a different startkey, endkey or key, the key-value list is not
rebuilt; it uses the keys from the first view.

Did I get it right?

Sorry about being obtuse. I have a project that can have over 10
million documents and I need to understand how they can be indexed.

Thank you,

Anthony

On Apr 26, 2008, at 2:11 PM, Jan Lehnardt wrote:

> Heya Anthony,
> On Apr 26, 2008, at 20:50, Anthony Mills wrote:
>> Maybe I'm missing something. When you create a view, does it create
>> indexes for attributes in the database? When you add new
>> documents, do they automatically create the index for the
>> attributes for the view?
>
> A view has only a single index, keyed on what you send in as
> the first argument to the map() function. Nothing else is going on
> automatically.
>
>
>> Also, can I call my view with something like ?
>> startkey=['20080403t000000', 1234]&endkey=['20080405t235959', 1234]
>> to
>>
>> function(doc){
>>   if(doc.type == "hello"){
>>     map([doc.date, doc.number], doc);
>>   }
>> }
>>
>> Then, through the magic of couchdb, I'll only get back those
>> documents between April 3rd and 5th whose attribute number=1234?
>
> Nope, you'd need a [doc.number, doc.date] index for that. It is
> straightforward rather than magical. The map() function just creates
> a key-value list that is sorted by key and you can query only ranges
> within the key-space.
>
>
>> Will couchdb only search through records that match the key? or
>> will it need to go through all documents every time I call the view?
>
> To build the view index CouchDB will go through all documents. But
> only once. For documents that change, get deleted or added, CouchDB
> incrementally updates the index. Also, view indexes are built when
> you query the view, not when you add documents.
>
>
>> To get nerdy, I want my views to find records in O(log n) not O(n).
>
> You get your results in O(1) ;-) (after the first query to each view).
>
> In relational terms, think of a view as an index on a column without
> the write penalty. So have as many as you might need.
>
> I hope that helps, feel free to send more questions :)
>
> Cheers
> Jan
> --
>
>>
>> Thanks,
>>
>> Anthony
>>
>> On Apr 26, 2008, at 1:02 AM, Chris Anderson wrote:
>>
>>> Anthony,
>>>
>>> http://wiki.apache.org/couchdb/ViewCollation is the way to
>>> accomplish tasks like that.
>>>
>>> Christopher Lenz has a write-up of how to use view collation to sort
>>> views, achieving comments grouped by parent blog post.
>>>
>>> http://www.cmlenz.net/archives/2007/10/couchdb-joins
>>>
>>> In your case you could index a view with date and type, like this
>>>
>>> [type, date]
>>>
>>> and then if you had say 5 types you'd do 5 GET queries against the
>>> database, each one fetching only the documents for that day.
>>>
>>> View collation is one of my favorite things about CouchDB. I'm
>>> excited about reduce, because from what I understand, you could
>>> use it to lower this to 1 GET, if that's important to you.
>>>
>>> enjoy,
>>> Chris
>>>
>>> On Fri, Apr 25, 2008 at 9:34 PM, Anthony Mills wrote:
>>>> I read most of the documentation, wiki and blogs, but I still do
>>>> not see how to accomplish a certain scenario. Hopefully I can
>>>> describe it adequately.
>>>>
>>>> Let's say I have 1,000,000 documents [all of the same "type"]
>>>> with a date attribute. Let's say I want to pick a subset of those
>>>> documents. How can I pick those documents of one type that fall
>>>> on one day? Will I need to get all 1,000,000 documents? What if I
>>>> want all documents of one type on one day that match another
>>>> attribute?
>>>>
>>>> I'm pretty sure this is what map/reduce will help with, but is
>>>> there a way to do this now? Can you use more documents to build
>>>> date relations?
>>>>
>>>> Also, can you pass more variables than just key to
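
Editor's note: the point made in the thread about key order (a
[doc.number, doc.date] key rather than [doc.date, doc.number]) may be
easier to see with a toy model. The sketch below is not CouchDB code.
It assumes a view index behaves like a list of rows sorted by key, with
array keys compared element by element. The names compareKeys,
buildIndex, queryRange and emitRow are hypothetical, and emitRow only
plays the role of the map() call in the thread. The sample documents
are invented for illustration.

```javascript
// Toy model of a view index: rows of {key, value} kept sorted by key.
// All names here are hypothetical; this is not how CouchDB is implemented.

// Compare two keys. Arrays are compared element by element, left to right.
function compareKeys(a, b) {
  if (Array.isArray(a) && Array.isArray(b)) {
    const n = Math.min(a.length, b.length);
    for (let i = 0; i < n; i++) {
      const c = compareKeys(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length;
  }
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

// Run a map function over every document once and sort the emitted rows.
function buildIndex(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ key, value }));
  }
  return rows.sort((r1, r2) => compareKeys(r1.key, r2.key));
}

// Return rows whose key lies in [startkey, endkey], inclusive.
// A linear filter keeps the sketch short; a real sorted index would seek
// to startkey instead of scanning every row.
function queryRange(index, startkey, endkey) {
  return index.filter(
    (row) =>
      compareKeys(row.key, startkey) >= 0 && compareKeys(row.key, endkey) <= 0
  );
}

// Invented sample documents.
const docs = [
  { _id: "a", type: "hello", date: "20080403t120000", number: 1234 },
  { _id: "b", type: "hello", date: "20080404t090000", number: 9999 },
  { _id: "c", type: "hello", date: "20080405t180000", number: 1234 },
  { _id: "d", type: "hello", date: "20080410t000000", number: 1234 },
];

// Key order from the question: [date, number].
const byDateNumber = buildIndex(docs, (doc, emitRow) => {
  if (doc.type == "hello") emitRow([doc.date, doc.number], doc._id);
});

// Key order suggested in the reply: [number, date].
const byNumberDate = buildIndex(docs, (doc, emitRow) => {
  if (doc.type == "hello") emitRow([doc.number, doc.date], doc._id);
});

const dateFirst = queryRange(
  byDateNumber,
  ["20080403t000000", 1234],
  ["20080405t235959", 1234]
).map((row) => row.value);

const numberFirst = queryRange(
  byNumberDate,
  [1234, "20080403t000000"],
  [1234, "20080405t235959"]
).map((row) => row.value);

console.log(dateFirst);   // [ 'a', 'b', 'c' ]
console.log(numberFirst); // [ 'a', 'c' ]
```

With the date first, the range covers every document dated April 3rd
to 5th regardless of its number, so document "b" (number 9999) slips
in. With the number first, all rows for 1234 sit next to each other in
date order, so the same range returns only the documents asked for.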