Return-Path: X-Original-To: apmail-couchdb-dev-archive@www.apache.org Delivered-To: apmail-couchdb-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 726EA105F4 for ; Mon, 12 Aug 2013 17:10:54 +0000 (UTC) Received: (qmail 25053 invoked by uid 500); 12 Aug 2013 17:10:54 -0000 Delivered-To: apmail-couchdb-dev-archive@couchdb.apache.org Received: (qmail 24417 invoked by uid 500); 12 Aug 2013 17:10:50 -0000 Mailing-List: contact dev-help@couchdb.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@couchdb.apache.org Delivered-To: mailing list dev@couchdb.apache.org Received: (qmail 23953 invoked by uid 99); 12 Aug 2013 17:10:48 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 12 Aug 2013 17:10:48 +0000 Date: Mon, 12 Aug 2013 17:10:48 +0000 (UTC) From: "Filippo Fadda (JIRA)" To: dev@couchdb.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (COUCHDB-1868) Using multiple keys, the _all_docs built-in view acts differently then a user defined view MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/COUCHDB-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737049#comment-13737049 ] Filippo Fadda commented on COUCHDB-1868: ---------------------------------------- OK, I got your point. Now, in a relational database you have joins, in CouchDB you can "simulate" a join, using multiple views, and aggregate the obtained results. But it's pretty hard to do actually, because CouchDB returns just the rows have been found. Remember the fields returned by a relational DB are already joined together, in CouchDB you have to join them instead. Actually is pretty hard to do. As you know each row is an array. Let's suppose your are creating an application to show the Marvel heroes. The application shows the last 10 Marvel heroes to the user, and must display a star on each one if the hero has been starred. So the application query a view called allLatest store in the doc 'heroes', asking the last 10 heroes added (they are order by a UNIX timestamp, an integer value). The query return something like: { "total_rows":23370, "offset":0, "rows":[ {"id":"6f4927a6-b891-4e12-9238-1b965d9bb0f4","key":1361281522,"value":"Spiderman"}, {"id":"bd0678ee-61c8-4933-a09b-729bc10a4af5","key":1360580897,"value":"Superman"}, {"id":"a6ce26b5-77e1-4a03-a96b-e63e030691a9","key":1360580766,"value":"Batman"}, {"id":"86a0e140-cf45-4198-f571-164cdc6bf9a3","key":1360079876,"value":"Hulk"}, {"id":"6b0dde6a-fba0-4b42-f1b6-9885e3b80e12","key":1359548425,"value":"Tor"}, {"id":"adf0fc73-add7-41bb-80ef-f3211b310019","key":1359002601,"value":"Captain America"}, {"id":"5555bf45-025e-454a-94bf-d14f936854ba","key":1356289394,"value":"Wolverine"}, {"id":"6f5e4c73-4456-425a-c99e-aa8fe2c73637","key":1355942176,"value":"Magneto"}, {"id":"f92cce76-db3b-4a1e-ecdc-ae0a751e560a","key":1355751477,"value":"Rockman"}, {"id":"a451bda7-fb9b-4df7-c584-88d5a7ecc1f7","key":1354887987,"value":"Robin"}, ] } Then the application queries another view to obtain the ones have been starred by the current logged-in user: { "total_rows":3456, "offset":0, "rows":[ {"id":"cd0a4cad-2ff4-4fcf-9a9c-d0d2c6bc6a20","key":6f4927a6-b891-4e12-9238-1b965d9bb0f4,"value":null}, {"id":"3756f01d-cde1-41cb-81a2-33d63d2f1b19","key":bd0678ee-61c8-4933-a09b-729bc10a4af5,"value":null}, {"id":"5e9d6533-2859-48c9-de50-bbc5d6da279a","key":a6ce26b5-77e1-4a03-a96b-e63e030691a9,"value":null} ] } Unfortunately CouchDB returns just 3 rows. To know if a post has been starred you must write something really awful and pretty slow, especially when you have many records. Since there is no match between your first query and the second, you can't simply access the array using an index, you have to cycle every single array and checking if the item array 'key' exists. This really sucks. It's awful, slow and ugly. And this is a real case application, this happens every day. Let me show what CouchDB should return instead, maybe using an option: { "total_rows":3456, "offset":0, "rows":[ {"id":"cd0a4cad-2ff4-4fcf-9a9c-d0d2c6bc6a20","key":"6f4927a6-b891-4e12-9238-1b965d9bb0f4","value":null}, {"id":"3756f01d-cde1-41cb-81a2-33d63d2f1b19","key":"bd0678ee-61c8-4933-a09b-729bc10a4af5","value":null}, {"id":"5e9d6533-2859-48c9-de50-bbc5d6da279a","key":"a6ce26b5-77e1-4a03-a96b-e63e030691a9","value":null}, {"id":null,"key":"86a0e140-cf45-4198-f571-164cdc6bf9a3","value":null}, {"id":null,"key":"6b0dde6a-fba0-4b42-f1b6-9885e3b80e12","value":null}, {"id":null,"key":"adf0fc73-add7-41bb-80ef-f3211b310019","value":null}, {"id":null,"key":"5555bf45-025e-454a-94bf-d14f936854ba","value":null}, {"id":null,"key":"6f5e4c73-4456-425a-c99e-aa8fe2c73637","value":null}, {"id":null,"key":"f92cce76-db3b-4a1e-ecdc-ae0a751e560a","value":null}, {"id":null,"key":"a451bda7-fb9b-4df7-c584-88d5a7ecc1f7","value":null} ] } As you can see above, when a key is not matched the id is null. But the results are in the same order and it's easy to know there is a match just checking if the id is NULL. This is fuckin awesome, because you can write something like this: // Posts. $opts = new ViewQueryOpts(); $opts->doNotReduce()->reverseOrderOfResults(); $opts->setLimit(30); $posts = $this->couch->queryView("posts", "allLatest", NULL, $opts)->getBodyAsArray(); // Extracts the post ids. $keys = []; foreach ($posts["rows"] as $row) $keys[] = $row["id"]; // Stars. $opts->reset(); $opts->doNotReduce(); $opts->returnsNotMatchedKeys; // This is the special option 'returns_not_matched=true'. $stars = $this->couch->queryView("stars", "perItem", $keys, $opts)->getBodyAsArray(); // And wow, you can finally make a for cycle to print your posts. $postsCount = count($posts["rows"]); for ($i = 0; $i < $postCount - 1; $i++) { echo $posts["rows"][$i]["value"]; // The hero name. if (is_null($stars["rows"][$i]["id"])) echo "not starred"; // This means the hero is not starred, there is no 'join'. else echo "starred"; // A document id exist for the key, so yes there is a 'join', the hero has been starred. } > Using multiple keys, the _all_docs built-in view acts differently then a user defined view > ------------------------------------------------------------------------------------------ > > Key: COUCHDB-1868 > URL: https://issues.apache.org/jira/browse/COUCHDB-1868 > Project: CouchDB > Issue Type: Bug > Components: View Server Support > Reporter: Filippo Fadda > > When you query a view using multiple keys, the _all_docs built-in view acts differently then a user defined view: > 1) in the first case CouchDB returns "not_found" for every not found key; > 2) querying a user defined view produces, instead, an empty array. > In the first case you obtain error="not_found" for every key, when you query a user defined view you simply don't get any rows, just the total rows for the view. > See: http://pastebin.com/D7NExJrd > Now, regarding 'keys' the documentation says something like: "Used to retrieve just the view rows matching that set of keys. Rows are returned in the order of the specified keys." > In a normal case, CouchDB should return just a row for each matched key, but it will really help, having an option to return a row for every key, even there if not found, because it's more easy, cycle through results. > Let's suppose the application I'm doing gets the last 30 blog posts, displaying for each one, information that are stored into related documents. The application will query, using as keys the posts' identifiers, other views to get, for example, if a post has been starred from the current logged-in user, etc. > If a view always returns a number of rows equals to the number of keys, the application can cycle from 0 to 29 and display all the related information for a post. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira