incubator-couchdb-user mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: General-understanding questions about views
Date Sat, 28 Jun 2008 09:49:07 GMT

On Jun 28, 2008, at 08:03, David King wrote:

> I'm trying to gain a fundamental understanding of views and indexed  
> data. If this is documented in a FAQ, please direct me there  
> instead :)
>
> In trying to map my understanding from SQL,

Here we have to tackle the first issue: do not try to map what you know
from SQL to CouchDB. Try to understand how CouchDB works on its own
terms, and then apply your problems to it. A translation will not work
and may leave you thinking CouchDB is crap because it is not an RDBMS,
which is surely not the case. On the other hand, it is perfectly
possible that CouchDB is not the right tool for your job, but it is
certainly cool that you are checking it out :)


> it appears that the answer to quickly querying data is by pre- 
> calculating query result-sets and storing them in tables, called  
> views. A view is table populated by a function that runs against  
> every object that is written or modified in the database.
>
> 1. How would you implement a query against a value that changes  
> after the view is populated, like the current time? That is, if I  
> wanted things younger than a week, a permanent view like this:
>
> function(doc) {
> 	if(doc.date > now() - timeinterval('1 week')) {
> 		emit(null,doc);
> 	}
> }
> (date-syntax liberally made up) the results of that query, if  
> populated when the data is changed, would quickly be invalid,  
> because now() has changed. Is this accurate? How would you  
> performantly run a query like this?

Your map functions must return the same result for the same input, so
things like now() cannot be used, and you usually don't need them. The
most interesting feature of the result set (or table, as you call it) of
the map function is that the 'first column', the 'key', can be used for
fast lookups. So what you would do here instead is:

function(doc) {
   emit(doc.date, null);
}

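and query with

/db/_view/date/name?startkey=timestamp_from_interval('1 week')&endkey=now()

where the client computes both values at query time (function names
made up, like yours).

Because the view is sorted by key, this range lookup is fast: finding
the start of the range is a B-tree lookup, so it is logarithmic in the
number of rows rather than a scan over all documents.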
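
To make the query above concrete, here is a minimal sketch of how a client might build it. It assumes doc.date is stored as a numeric timestamp in milliseconds, which the original question does not say; lastWeekQuery is a made-up helper name, and only the view path comes from the example above.

```javascript
// Sketch only: builds the date-range query for the view above.
// Assumption: doc.date is a numeric timestamp in milliseconds.
// lastWeekQuery is a hypothetical helper, not part of CouchDB.
function lastWeekQuery(nowMs) {
  var weekMs = 7 * 24 * 60 * 60 * 1000;

  // View keys are JSON values, so encode them as JSON first,
  // then escape them for use in a URL.
  var startkey = JSON.stringify(nowMs - weekMs);
  var endkey = JSON.stringify(nowMs);

  return "/db/_view/date/name" +
    "?startkey=" + encodeURIComponent(startkey) +
    "&endkey=" + encodeURIComponent(endkey);
}
```

Calling lastWeekQuery(Date.now()) gives a fresh range on every request, while the stored view itself never depends on the current time.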

> 2. Same question for a permanent view containing the youngest 10  
> items (this one might be easier)?

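Same thing: the date view above is already sorted by date, so you query
it in descending order and limit the result to 10 rows. I note that you
explicitly mention permanent views. Do not use temporary views in
production, only during development; they are computed from scratch on
every request.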
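
As a rough illustration of why one date-keyed view answers the "youngest 10" question, here is an in-memory stand-in for a view. It is not how CouchDB is implemented; buildView and youngest are hypothetical helpers, and numeric doc.date values are an assumption.

```javascript
// The map function from the example above, with emit passed in
// explicitly so the sketch runs outside CouchDB.
function map(doc, emit) {
  emit(doc.date, null);
}

// Hypothetical stand-in for view building: run map over every
// document and keep the emitted rows sorted by key.
function buildView(docs) {
  var rows = [];
  docs.forEach(function (doc) {
    map(doc, function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  rows.sort(function (a, b) { return a.key - b.key; });
  return rows;
}

// "Descending order, limited to n rows": take the last n rows of
// the sorted view and reverse them so the newest comes first.
function youngest(rows, n) {
  return rows.slice(Math.max(rows.length - n, 0)).reverse();
}
```

The real query would ask the view for descending order plus a row limit; the exact parameter names depend on the CouchDB version, so none are spelled out here.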


> 3. The wiki doesn't mention parameterised views. So if I have a  
> document with an 'author' field, and I want a view such that I can  
> see everything that a given author wrote, do I need a view per  
> author? Given thousands of authors, what is the performance cost for  
> running a document through a few thousand author-functions?

Same as above:

function(doc) {
   emit(doc.author, null);
}

GET /db/_view/authors/name?key="authorname"

One view, extremely fast lookups.
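
One detail worth a sketch: the key in the query is a JSON value, so a string key needs its double quotes and should be URL-escaped. The helper name authorQuery below is made up for illustration; the view path is the one from the example above.

```javascript
// Sketch only: builds the per-author query for the view above.
// authorQuery is a hypothetical helper, not part of CouchDB.
function authorQuery(author) {
  // JSON.stringify adds the quotes a string key needs;
  // encodeURIComponent makes the result safe inside a URL.
  var key = encodeURIComponent(JSON.stringify(author));
  return "/db/_view/authors/name?key=" + key;
}
```

For example, authorQuery("David King") yields /db/_view/authors/name?key=%22David%20King%22. The author is a query parameter, so thousands of authors still need only the one view.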


> 4. I know that the distribution bits are still being fleshed out,  
> but is it the intention that eventually views can be stored or  
> calculated on a separate server from the data (since they are  
> implemented as tables)?

Not sure what you mean by 'since they are implemented as tables', but
maybe that is just the SQL lingo that is confusing me. We don't have
tables (things might look like them, though). But yes, eventually you
will be able to distribute view creation. We haven't gotten around to
that yet.

Feel free to send in more questions as they come :-)

Cheers
Jan
--
