couchdb-dev mailing list archives

From "Filippo Fadda (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1868) Using multiple keys, the _all_docs built-in view acts differently then a user defined view
Date Tue, 13 Aug 2013 00:39:49 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737632#comment-13737632 ]

Filippo Fadda commented on COUCHDB-1868:
----------------------------------------

> Who, or what, is actually impacted by this?

Everyone who is developing an application that needs to join data from different views. That is,
everyone who is developing a real application that is more than a simple todo list.
I don't think this modification will have a great impact on existing code anyway. There is
no implication for user-defined views, because to get rows for keys that are not found you have
to explicitly add a query parameter to the request. And I can't think of a common case for querying
_all_docs with multiple keys, since the keys, in this case, are the document ids. Let me explain why.
If you are querying _all_docs using multiple keys to get multiple documents, you are doing that
because you can't apply include_docs=true to a user-defined view. This may happen, for example,
if you emit a date or something like that as the key. That's why I'm the first person to notice
this inconsistency.

Design bugs are bugs, often more important than the others, and they should be fixed.

Returning null rows is really important because it lets you make joins easily; otherwise, to achieve
the same result you have to make a ton of queries, and in a real-world application that isn't
acceptable. I have experienced this first-hand. Without null rows, to get every detail related to
a post, you have to make 5 other queries (or more) per post. This means that for 30 posts you make
150 additional queries, when you should make just five and easily aggregate the data. This would
greatly boost application performance. And everyone knows the Achilles' heel of CouchDB is the
'slow' HTTP protocol and JSON parsing/conversion.
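
To make the point concrete, here is a minimal Python sketch of the alignment an application has to do by hand today when a user-defined view silently skips keys that are not found. It is only an illustration: the function name, the sample ids, and the row contents are hypothetical, the keys are assumed to be plain strings (post ids), and only the first row per key is kept.

```python
from typing import Any, Optional


def align_rows_to_keys(
    keys: list[str], rows: list[dict[str, Any]]
) -> list[Optional[dict[str, Any]]]:
    """Return one entry per requested key, in key order.

    Entries are None where the view returned no row for that key.
    Assumption: keys are strings and only the first row per key matters.
    """
    first_row_by_key: dict[str, dict[str, Any]] = {}
    for row in rows:
        # setdefault keeps the first row seen for a key and ignores later ones.
        first_row_by_key.setdefault(row["key"], row)
    return [first_row_by_key.get(key) for key in keys]


# Hypothetical data: three post ids used as keys, and the rows a
# user-defined "starred" view might return when only two of them match.
post_ids = ["post-1", "post-2", "post-3"]
starred_rows = [
    {"id": "star-9", "key": "post-1", "value": True},
    {"id": "star-4", "key": "post-3", "value": True},
]

aligned = align_rows_to_keys(post_ids, starred_rows)
for post_id, row in zip(post_ids, aligned):
    print(post_id, "starred" if row is not None else "not starred")
```

With a server-side option to return a row for every key, this bookkeeping would not be needed: the application could index the results of each view by position.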

CouchDB is still something we can call 'new', and it's far from perfect. There is a lot of
room for improvement and I think this is a big one.

Another huge problem I found is related to counting hits. I solved it using Redis, because saving
a document for every hit is not feasible in practice. When you have an existing database,
you need days just to generate and emit the 'hits'. Importing pre-existing data becomes impossible,
unless you have days to import it. And finally, tracking hits requires many gigabytes of space.
I think CouchDB should provide some mechanism to deal with things like this. With a relational
database you can do all of this; with CouchDB you currently can't, but I think CouchDB should move
in a "problem solving" direction. As a user I can accept compromises in favor of scalability
and reliability, but I can't rely on another DBMS for everything else.
                
> Using multiple keys, the _all_docs built-in view acts differently then a user defined view
> ------------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-1868
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1868
>             Project: CouchDB
>          Issue Type: Bug
>          Components: View Server Support
>            Reporter: Filippo Fadda
>
> When you query a view using multiple keys, the _all_docs built-in view acts differently
> than a user-defined view:
> 1) in the first case CouchDB returns "not_found" for every key that is not found;
> 2) querying a user-defined view produces, instead, an empty array.
> In the first case you obtain error="not_found" for every key; when you query a user-defined
> view you simply don't get any rows, just the total rows for the view.
> See: http://pastebin.com/D7NExJrd
> Now, regarding 'keys' the documentation says something like: "Used to retrieve just the
> view rows matching that set of keys. Rows are returned in the order of the specified keys."
> In the normal case, CouchDB should return just a row for each matched key, but it would
> really help to have an option to return a row for every key, even if it is not found, because
> that makes it easier to cycle through the results.
> Let's suppose the application I'm building gets the last 30 blog posts and displays, for each
> one, information that is stored in related documents. The application will query other views,
> using the posts' identifiers as keys, to find out, for example, whether a post has been starred
> by the currently logged-in user, etc.
> If a view always returns a number of rows equal to the number of keys, the application
> can cycle from 0 to 29 and display all the related information for a post.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
