incubator-couchdb-user mailing list archives

From Mister Donut <lady.do...@gmail.com>
Subject Re: The Blog
Date Tue, 10 Feb 2009 06:34:22 GMT
> On the opposite end of the spectrum, we have extremely large RDBMS
> installs on huge iron. IIRC, I read an article that the 37signals crew
> just bought a 32 GiB machine to scale up Basecamp.
> the whole system would require many man hours of systems engineering
> or a huge rewrite of the base application logic.

Yeah but CouchDB doesn't magically solve that problem, does it?
RDBMS + Memcached goes a very long way.

Also, Basecamp seems to be "easy" to partition, like Flickr (mind
you, "easy"!), because most accounts are self-contained: there is a
project and a few users, and they don't overlap. Of course, once you
detach users from their projects, or allow users to comment on
everything, that's where it gets hard. The problems start when
everything relates to everything. (see:)

> Another thought that just occurred to me. Another way of describing
> the difference is that in CouchDB the data is important. In an RDBMS,
> it's the relations that are important (or the focus at least).

That is a very interesting point. I tend to agree after these emails.
Also, most example applications of CouchDB that users have presented
in this very thread seem to be about data, and not about relations:
the sync to S3, the message queue with built-in aggregate
reporting... Whereas in a typical web application (wiki, blog),
everything is about relations. Isn't that so? Users in user groups
with permissions write posts that belong to categories and have
comments by other users. I don't really see how you can just throw
that out of the window. I mean, exactly why would you use the
key/value pairs user_x, permission_x, group_x, entry_x, comment_x
instead of just five tables? I don't understand where CouchDB
shines for exactly that kind of thing. Mind you, this is
RDBMS thinking (again), and I totally see the reason to use CouchDB
for the projects outlined here (S3, queue).

I think there is a bit of a problem with the approach here. Most
"convinced" CouchDB users seem to want to tell you "Well RDBMS suck,
CouchDB is much better", "Why?", "Well, because... [Concepts,
Key/Value, Map/Reduce]", instead of trying to show you a few hands-on
approaches to solving a problem *that has not been solved* by RDBMS
yet.

I think an example like the message reporting queue with
aggregation, instead of just the "The Blog in 5 Minutes" you see
everywhere, would go a long way toward showing what CouchDB is all
about.

It can show how useful Map/Reduce can be (to create aggregate
reports), and how you can possibly have two message queues that can
stay in sync.
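
To make the aggregate-report idea a bit more concrete, here is a rough
sketch in plain Python. It only imitates the map and reduce steps of a
view; it is not CouchDB's API, and the document fields ("queue",
"status") and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical message documents; the field names are assumptions.
messages = [
    {"_id": "m1", "queue": "mail", "status": "sent"},
    {"_id": "m2", "queue": "mail", "status": "failed"},
    {"_id": "m3", "queue": "sms", "status": "sent"},
    {"_id": "m4", "queue": "mail", "status": "sent"},
]


def map_message(doc):
    """Map step: emit one ((queue, status), 1) pair per message."""
    yield (doc["queue"], doc["status"]), 1


def reduce_counts(values):
    """Reduce step: sum the values emitted for a single key."""
    return sum(values)


def build_report(docs):
    """Group the emitted pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_message(doc):
            grouped[key].append(value)
    return {key: reduce_counts(vals) for key, vals in sorted(grouped.items())}


report = build_report(messages)
for (queue, status), count in report.items():
    print(queue, status, count)
```

The report is derived entirely from the stored messages, so the
documents themselves never have to carry a running total.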

> Yes, you can. Just not in the way you're used to thinking. A
> Map/Reduce view is a fixed *mapping* from documents to a sorted
> key/value space.

Yes, you just said it. *Fixed*. If you have 200 documents, 100 from
Jan to Nov, and 100 from Nov to Dec, there is no way you can fill them
into two buckets ("Jan-Nov" and "Nov-Dec"). It would require variable
conditions.
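
Here is how I picture that "fixed mapping", as a small Python sketch.
It is a stand-in for a view, not CouchDB code, and the "created" field
and the function name are made up. The point is only that the map
function receives the document and nothing else.

```python
from datetime import date

# Hypothetical documents; the "created" field is an assumption.
docs = [
    {"_id": "a", "created": date(2008, 3, 14)},
    {"_id": "b", "created": date(2008, 11, 20)},
    {"_id": "c", "created": date(2008, 7, 2)},
]


def map_by_date(doc):
    """Fixed mapping: the only input is the document itself.

    There is no query-time argument, so a condition such as "Jan-Nov"
    cannot be passed in here; it would have to be written into the
    function or expressed in terms of the emitted keys.
    """
    yield doc["created"].isoformat(), doc["_id"]


# The result is a key/value space sorted by key.
index = sorted(pair for doc in docs for pair in map_by_date(doc))
for key, value in index:
    print(key, value)
```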

> Also, you may count things.

I never said you couldn't. I said you cannot count like += and you
cannot aggregate counts to get rid of all the documents. Let's say you
want to count pageviews. Easy: insert a document for every pageview
and create a "sum-view". But this will lead to way too many documents,
which doesn't seem feasible. Of course, CouchDB isn't the tool for that
job, but I would still like to see some really hands-on examples of
what CouchDB can do. I think we have covered the concepts now.
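
The pageview case, sketched in plain Python so it is clear what I mean.
This imitates a "sum-view" over one document per pageview; it is not
CouchDB's API, and the "type" and "path" fields are invented.

```python
from collections import defaultdict

# One hypothetical document per pageview; field names are assumptions.
pageview_docs = [
    {"type": "pageview", "path": "/"},
    {"type": "pageview", "path": "/about"},
    {"type": "pageview", "path": "/"},
]


def map_pageview(doc):
    """Map step: emit (path, 1) for every pageview document."""
    if doc.get("type") == "pageview":
        yield doc["path"], 1


def sum_view(docs):
    """Group emitted values by path and sum each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_pageview(doc):
            grouped[key].append(value)
    return {path: sum(values) for path, values in grouped.items()}


totals = sum_view(pageview_docs)
print(totals)              # per-path counts computed from the documents
print(len(pageview_docs))  # the stored documents: one per pageview
```

The totals come out fine, but the number of stored documents still
grows by one with every single pageview, which is the part that
worries me.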

> Patrick was trying to help and was correct.

No, he is not.
