incubator-couchdb-user mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Views
Date Sun, 27 Apr 2008 08:07:20 GMT

On Apr 27, 2008, at 03:12, Anthony Mills wrote:
> Thank you everyone for answering my questions.
>
> Here is the way I understand it.  The first time a view is run it  
> creates a key-value list from all documents.  Future calls to the  
> view update the key-value list with changed documents [added,  
> deleted, updated].
> If a startkey, endkey or key is used, only those keys that match in  
> the list are returned.

Where "match" means either single entries from the view index or  
consecutive ranges, but nothing with gaps.
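
To illustrate what "consecutive ranges" means here, a minimal Python sketch follows. It is not CouchDB code: the sample documents and the `query_range` helper are made up for illustration, and plain tuple ordering stands in for CouchDB's view collation. It uses the two key orders discussed in the quoted messages below.

```python
from bisect import bisect_left, bisect_right

def query_range(rows, startkey, endkey):
    """Return the rows whose key lies in [startkey, endkey].

    `rows` must be sorted by key, so the result is always one
    contiguous slice of the list, never a selection with gaps.
    """
    keys = [key for key, _ in rows]
    lo = bisect_left(keys, startkey)
    hi = bisect_right(keys, endkey)
    return rows[lo:hi]

# Hypothetical documents: (date, number, id)
docs = [
    ("20080403t120000", 1234, "a"),
    ("20080404t090000", 9999, "b"),
    ("20080405t180000", 1234, "c"),
    ("20080410t000000", 1234, "d"),
]

# Key order [date, number]: the date range is contiguous, but rows with
# other numbers sit inside it and are returned as well.
by_date = sorted(((date, number), doc_id) for date, number, doc_id in docs)
date_first = query_range(
    by_date, ("20080403t000000", 1234), ("20080405t235959", 1234)
)

# Key order [number, date]: all rows for number 1234 are adjacent and
# sorted by date, so the same kind of range returns only the wanted rows.
by_number = sorted(((number, date), doc_id) for date, number, doc_id in docs)
number_first = query_range(
    by_number, (1234, "20080403t000000"), (1234, "20080405t235959")
)
```

With the [date, number] order the slice also contains document "b", whose number is 9999. With [number, date] it contains only "a" and "c".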


> If I use a different startkey, endkey or key, the key-value list is  
> not rebuilt, it uses the keys from the first view.
> Did I get it right?

Yes.
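
The behaviour confirmed above can be modelled roughly as follows. This is a toy sketch under stated assumptions, not CouchDB's implementation: the `ViewIndex` class, the `(seq, doc_id, doc)` change list, and the `docs_mapped` counter are all hypothetical names used for illustration, and the rows are re-sorted on every query for brevity where a real index would keep them sorted on disk.

```python
class ViewIndex:
    """Toy model of a lazily built, incrementally updated view index."""

    def __init__(self, map_fn):
        self.map_fn = map_fn      # doc -> list of (key, value) pairs
        self.rows_by_doc = {}     # doc id -> rows produced for that doc
        self.indexed_seq = 0      # sequence number of the last change applied
        self.docs_mapped = 0      # counts map_fn calls, for illustration only

    def refresh(self, changes):
        """Apply only the changes that are newer than the last one indexed.

        `changes` is a list of (seq, doc_id, doc); doc is None for a deletion.
        """
        for seq, doc_id, doc in changes:
            if seq <= self.indexed_seq:
                continue  # already reflected in the index
            self.rows_by_doc.pop(doc_id, None)  # drop stale rows, if any
            if doc is not None:
                self.rows_by_doc[doc_id] = self.map_fn(doc)
                self.docs_mapped += 1
            self.indexed_seq = seq

    def query(self, changes, startkey, endkey):
        """Bring the index up to date, then return rows in [startkey, endkey]."""
        self.refresh(changes)
        rows = [row for rows in self.rows_by_doc.values() for row in rows]
        rows.sort(key=lambda row: row[0])
        return [(k, v) for k, v in rows if startkey <= k <= endkey]


def by_number(doc):
    # Hypothetical map function: index "hello" documents by their number.
    return [(doc["number"], doc["id"])] if doc.get("type") == "hello" else []


changes = [
    (1, "a", {"id": "a", "type": "hello", "number": 1}),
    (2, "b", {"id": "b", "type": "hello", "number": 5}),
]
index = ViewIndex(by_number)
first = index.query(changes, 0, 10)   # first query maps both documents
second = index.query(changes, 5, 5)   # different range, nothing is re-mapped
mapped_after_second = index.docs_mapped
changes.append((3, "a", None))        # document "a" is deleted
third = index.query(changes, 0, 10)   # only change 3 is processed
```

The point of the sketch is that `startkey` and `endkey` only select rows from the existing index. The map function runs again only for documents that changed since the previous query.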


> Sorry about being obtuse. I have a project that can have over 10  
> million documents and I need to understand how they can be indexed.

As Chris mentioned, it might be best to play around with sample data  
to get a feel for views. Check out Futon, our built-in administration  
client. It lets you define ad-hoc queries that you can modify at  
will and later save permanently: http://localhost:5984/_utils/

Cheers
Jan
--


>
>
> Thank you,
> Anthony
>
> On Apr 26, 2008, at 2:11 PM, Jan Lehnardt wrote:
>
>> Heya Anthony,
>> On Apr 26, 2008, at 20:50, Anthony Mills wrote:
>>> Maybe I'm missing something.  When you create a view, does it create  
>>> indexes for attributes in the database?  When you add new  
>>> documents, do they automatically create the index for the  
>>> attributes for the view?
>>
>> A view index only has a single index which is what you send in as  
>> the first argument in the map() function. Nothing else is going on  
>> automatically.
>>
>>
>>> Also, can I call my view with something like ? 
>>> startkey=['20080403t000000', 1234]&endkey=['20080405t235959',  
>>> 1234] to
>>>
>>> function(doc){
>>> 	if(doc.type == "hello"){
>>> 		map([doc.date, doc.number], doc);
>>> 	}
>>> }
>>>
>>> Then, through the magic of couchdb, I'll only get back those  
>>> documents between the April 3rd and 5th whose attribute number=1234?
>>
>> Nope, you'd need a [doc.number, doc.date] index for that. It is  
>> straightforward rather than magical. The map() function just  
>> creates a key-value list that is sorted by key and you can query  
>> only ranges within the key-space.
>>
>>
>>> Will couchdb only search through records that match the key? or  
>>> will it need to go through all documents every time I call the view?
>>
>> To build the view index CouchDB will go through all documents. But  
>> only once. For documents that change, get deleted or added, CouchDB  
>> incrementally updates the index. Also, view indexes are built when  
>> you query the view, not when you add documents.
>>
>>
>>> To get nerdy, I want my views to find records in O(log n) not O(n).
>>
>> You get your results in O(1) ;-) (after the first query to each  
>> view).
>>
>> In relational terms, think of a view as an index on a column  
>> without the write penalty. So have as many as you need.
>>
>> I hope that helps, feel free to send more questions :)
>>
>> Cheers
>> Jan
>> --
>>
>>
>>
>>>
>>>
>>> Thanks,
>>>
>>> Anthony
>>>
>>> On Apr 26, 2008, at 1:02 AM, Chris Anderson wrote:
>>>
>>>> Anthony,
>>>>
>>>> http://wiki.apache.org/couchdb/ViewCollation is the way to  
>>>> accomplish
>>>> tasks like that.
>>>>
>>>> Christopher Lenz has a write-up of how to use view collation to  
>>>> sort
>>>> views, achieving comments grouped by parent blog post.
>>>>
>>>> http://www.cmlenz.net/archives/2007/10/couchdb-joins
>>>>
>>>> In your case you could index a view with date and type, like this
>>>>
>>>> [type, date]
>>>>
>>>> and then if you had say 5 types you'd do 5 GET queries against the
>>>> database, each one fetching only the documents for that day.
>>>>
>>>> View collation is one of my favorite things about CouchDB. I'm  
>>>> excited
>>>> about reduce, because from what I understand, you could use it to
>>>> lower this to 1 GET, if that's important to you.
>>>>
>>>> enjoy,
>>>> Chris
>>>>
>>>> On Fri, Apr 25, 2008 at 9:34 PM, Anthony Mills <amills1037@gascard.net> wrote:
>>>>> I read most of the documentation, wiki and blogs, but I still do  
>>>>> not see how
>>>>> to accomplish a certain scenario.  Hopefully I can describe it  
>>>>> adequately.
>>>>>
>>>>> Let's say I have 1,000,000 documents [all of the same "type"]  
>>>>> with a date
>>>>> attribute.  Let's say I want to pick a subset of those  
>>>>> documents.  How can I
>>>>> pick those documents of one type that fall on one day?  Will I  
>>>>> need to get
>>>>> all 1,000,000 documents?  What if I want all documents of one  
>>>>> type on one
>>>>> day that match another attribute?
>>>>>
>>>>> I'm pretty sure this is what map/reduce will help with, but is  
>>>>> there a way to
>>>>> do this now?  Can you use more documents to build date relations?
>>>>>
>>>>> Also, can you pass more variables than just key to
>>>
>>>
>>
>
>

