incubator-couchdb-user mailing list archives

From Sho Fukamachi <sho.fukama...@gmail.com>
Subject Re: incremental map/reduce scope
Date Wed, 26 Mar 2008 03:05:05 GMT
Thanks for the responses Chris and Jan.

That's pretty much exactly what I thought - I just wanted to clear it  
up!

The reason I asked is that I've seen what I consider to be some
not-very-good practice in some interface libraries for Couch. For
example, in ActiveCouch, databases are treated basically like tables -
it seems to imply one should create a database called "people", for
example, and then another one called "blogs". This seemed to me to be
a pretty bad idea - trying to force CouchDB into a very RDBMS-shaped
hole - so I was wondering if they knew something I didn't about the
present or future scope of "map". Evidently not. I am not sure how
prevalent this kind of thinking is, but it clearly needs to be
discouraged.

>> I'm not a CouchDB internals expert, but I imagine you could use
>> replication to merge two databases at any point in the future.
>
> You are correct, this works today and will in the future.

OK, well, at least this means bad design today will probably be  
correctable tomorrow : ) .

>> I'm also pretty sure that map functions can't reach into the contents
>> of more than one document (although reduce will be able to merge all
>> data with the same key, I think, so you could perhaps "join" together
>> related documents using a clever map function, although you may be
>> better off doing joins in your application.)
>
> With the possibility of using other languages for the M/R functions,
> you could, for example, use Python to run the functions and from
> within the functions use an HTTP request to get data from another
> DB. If you really wanted that. I'd guess though that you'd be better
> off with using a database merge with replication or implementing
> things on the application level. I might be wrong, though :)

Ha, that sounds like a nightmare! I don't think anyone should be  
thinking about that kind of thing except as a last resort.
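
For reference, the simpler "join with a clever map function" idea quoted above might look roughly like the sketch below. It is only an illustration under assumptions: the sample documents, the "author" field and the in-memory emit/grouping harness are hypothetical stand-ins, not CouchDB's actual view engine. The point is that both record types emit the same key, so related rows end up together.

```javascript
// Hypothetical documents living in ONE database; "author" is an assumed
// field that points at the _id of a person document.
const docs = [
  { _id: "sho", type: "person", name: "Sho" },
  { _id: "post1", type: "blog", author: "sho", title: "Couch notes" },
  { _id: "post2", type: "blog", author: "sho", title: "More notes" },
];

// In-memory stand-in for the emit() that a view server provides.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function: each call sees exactly one document, but both record
// types emit the person's id as the key.
function byPerson(doc) {
  if (doc.type === "person") emit(doc._id, { name: doc.name });
  if (doc.type === "blog") emit(doc.author, { title: doc.title });
}

for (const doc of docs) byPerson(doc);

// Group rows that share a key; this stands in for the merging that a
// reduce step or the application would do.
const joined = {};
for (const { key, value } of rows) {
  joined[key] = joined[key] || { name: null, titles: [] };
  if (value.name) joined[key].name = value.name;
  if (value.title) joined[key].titles.push(value.title);
}

console.log(joined.sho);
```

The map function never reads a second document; the "join" comes entirely from the shared key.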

I guess the scenario I thought could be possible was that design docs  
could be definable at the root level of the server, above the  
databases, and then look into all docs in all databases. It's no  
surprise, though, that this isn't done - there would be no advantage  
beyond a superficially more pleasant namespace.

Has the community arrived at a kind of "best practice" for
differentiating between record types? The wiki suggests using the key
"type", and I am wondering how official this is. Not that it needs
to be official, of course, but establishing some sort of convention is
probably a good idea to get everyone on the same page.
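
A minimal sketch of the "type" convention, assuming a single database that holds both people and blogs. The sample documents and the runView helper are hypothetical, a tiny in-memory stand-in for a view index; only the shape of the map function is meant to resemble a real view.

```javascript
// Hypothetical sample documents sharing one database. "type" is just an
// ordinary field following the wiki convention, not a reserved one.
const docs = [
  { _id: "a1", type: "person", name: "Sho" },
  { _id: "b1", type: "blog", title: "Couch notes" },
  { _id: "a2", type: "person", name: "Jan" },
  { _id: "x1", note: "no type field" },
];

// In-memory stand-in for the emit() that a view server provides.
let rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function: key every typed document by its "type".
function byType(doc) {
  if (doc.type) emit(doc.type, doc._id);
}

// Hypothetical helper: run the map over all docs, return rows sorted by key.
function runView(mapFn, allDocs) {
  rows = [];
  for (const doc of allDocs) mapFn(doc);
  return rows.slice().sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// "All people" becomes a lookup on key "person".
const people = runView(byType, docs)
  .filter((row) => row.key === "person")
  .map((row) => row.value);

console.log(people); // [ 'a1', 'a2' ]
```

With this layout the "people" and "blogs" tables become two keys in one view rather than two databases.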

If so, it might be nice to promote "type" in a doc to a
reserved magic word like _id or _rev. A built-in function to get "all
documents by type" and/or present a list of filterable types in the
web interface would perhaps help make the database more approachable
to RDBMS refugees and help encourage people towards understanding
document-based data structures, and away from the table-based
mentality. Just a thought.
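
Until something like that is built in, a "list of types" could probably be approximated with an ordinary view. The sketch below assumes a map that emits (type, 1) and a summing reduce applied per key; the documents, the grouping loop and the reduce signature used here are illustrative assumptions, not a description of CouchDB's internals.

```javascript
// Hypothetical sample documents using the "type" convention.
const docs = [
  { _id: "a1", type: "person" },
  { _id: "a2", type: "person" },
  { _id: "b1", type: "blog" },
];

// In-memory stand-in for the emit() that a view server provides.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map: one row per typed document, keyed by type.
function typeMap(doc) {
  if (doc.type) emit(doc.type, 1);
}

// Reduce: add up the values emitted for a key.
function countReduce(keys, values) {
  return values.reduce((total, n) => total + n, 0);
}

for (const doc of docs) typeMap(doc);

// Group values by key, then reduce each group (a stand-in for a grouped query).
const grouped = new Map();
for (const { key, value } of rows) {
  if (!grouped.has(key)) grouped.set(key, []);
  grouped.get(key).push(value);
}

const typeCounts = {};
for (const [key, values] of grouped) {
  typeCounts[key] = countReduce(null, values);
}

console.log(typeCounts); // { person: 2, blog: 1 }
```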

thanks a lot!

Sho