couchdb-user mailing list archives

From Dean Landolt <d...@deanlandolt.com>
Subject Re: view on view
Date Thu, 12 Feb 2009 17:08:31 GMT
On Thu, Feb 12, 2009 at 9:28 AM, Jan Lehnardt <jan@apache.org> wrote:

> On 12 Feb 2009, at 15:14, Евгений Найденышев wrote:
>
>> (I'm sorry for my english)
>
> No worries, I guess the majority reading here are no native
> speakers (like me) :)
>
>> Can I run some view on documents that was resulted by another view?
>
> Not yet no. There have been discussions about this and different
> ways to do it, but there are no definite plans to support this out of
> the box. I believe there are client implementations that let you do
> this by saving a view result in a new database and run the second
> view there.


It seems to me that once partitioning lands, temp views could take on a
whole new utility. There was a conversation a little while back on IRC about
mapping over huge mbox files (or logs, or anything *massive* for that
matter) and how views wouldn't make sense, or at least may require more
storage than they're worth. In the original m/r paradigm, though, *every* query
is a "temp view" -- but when you have a metric shitton of compute nodes,
that doesn't matter. Huge blocks of data are shuffled off to these compute
nodes to hang out (a la Google FS or HDFS) to keep jobs from being bound by
network I/O, and once you can distribute a db across a btree of couch
instances, couch would do that for us too...
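
To make the idea concrete, here is a rough sketch in plain Python of a "temp view" run across partitions: each partition maps and reduces its own documents, and the partial results are then re-reduced into one answer. Nothing here is CouchDB API; `temp_view`, `reduce_partition`, `by_level`, and the sample log documents are hypothetical names invented for illustration, and the sketch assumes a reduce function (like `sum`) that can safely be applied again to its own outputs.

```python
from collections import defaultdict


def reduce_partition(docs, map_fn, reduce_fn):
    """Map every doc in one partition, then reduce the values per key."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


def temp_view(partitions, map_fn, reduce_fn):
    """Run the same map/reduce on each partition and merge the partials.

    In a real cluster each partition would be processed on the node that
    holds it; here the partitions are just lists in one process.
    """
    merged = defaultdict(list)
    for docs in partitions:
        partial = reduce_partition(docs, map_fn, reduce_fn)
        for key, value in partial.items():
            merged[key].append(value)
    # Re-reduce: combine the per-partition results for each key.
    return {key: reduce_fn(values) for key, values in merged.items()}


def by_level(doc):
    """Hypothetical map function: emit one row per log document."""
    yield doc["level"], 1


# Two made-up partitions of log documents.
partitions = [
    [{"level": "error"}, {"level": "info"}],
    [{"level": "error"}],
]

result = temp_view(partitions, by_level, sum)
print(result)
```

Only the small per-partition results cross the "network" here, which is the point of keeping the data next to the compute.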

Temp view queries start to look a lot sexier than persisting views in
certain cases. Sure, you wouldn't run a front-end app off of them, but this
one use case is all that Hadoop, and tools built on top of it like Pig and
Hive, do. Yet they're performant enough to be useful in the general case and
*clutch* for massive data sets.

Without a high enough ratio of compute power to data, temp views would still
be shite, but partitioning could fix this -- or at least make the ratio more
easily tunable. In this light, Damien's name for them, slow_views, makes a lot more
sense. They'll never have millisecond response times but they'll be more
than a simple hack -- they'll give us a kickass way to have chainable m/r
without any of the bookkeeping hassles.
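
As a rough illustration of what chainable m/r could look like, here is a plain Python sketch in which the rows produced by one view become the input documents of the next, with no intermediate database to manage. This is not CouchDB code; `run_view`, `chain`, and the sample mbox-style messages are hypothetical and only model the data flow in memory.

```python
from collections import defaultdict


def run_view(docs, map_fn, reduce_fn):
    """Run one map/reduce pass and return rows shaped like view results."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return [
        {"key": key, "value": reduce_fn(values)}
        for key, values in sorted(grouped.items())
    ]


def chain(docs, *stages):
    """Feed the rows of each (map_fn, reduce_fn) stage into the next one."""
    rows = docs
    for map_fn, reduce_fn in stages:
        rows = run_view(rows, map_fn, reduce_fn)
    return rows


def by_sender(doc):
    """Stage 1 map: one row per message, keyed by sender."""
    yield doc["from"], 1


def by_message_count(row):
    """Stage 2 map: key each stage 1 row by how many messages it counted."""
    yield row["value"], 1


# Made-up messages, standing in for a parsed mbox file.
messages = [{"from": "a"}, {"from": "b"}, {"from": "a"}, {"from": "c"}]

per_sender = run_view(messages, by_sender, sum)
histogram = chain(messages, (by_sender, sum), (by_message_count, sum))
print(histogram)
```

The second stage never sees the original messages, only the first stage's rows, which is the "view on view" being asked about.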
