From: James Hayton
To: dev@couchdb.apache.org
Date: Sun, 12 Feb 2012 17:00:57 -0800
Subject: Feature Requests/Discussion

Hi Everyone-

I have been using CouchDB for several years and I absolutely love working with it most of the time. Thanks to everyone who has made it such a joy to work with.
There are, however, a few situations where I consistently run into trouble and that I would like to fix if possible. I have a few ideas for features that would help me design my data model the way I want and would require far fewer trade-offs at the application level. While I think I grok the public-facing API fairly well, I don't know the internals at all, so I don't know whether these features are possible. What I would like is some feedback on the feasibility of each, as well as on the difficulty, why the feature hasn't been implemented yet, etc. I know some have been discussed before but haven't been implemented, so I just want to figure out why and what I can do about it. Anyway, on to my features:

*1) Multiple Start & End Keys For CouchDB Views With Group Level Option For Reduce Views*

This one has been discussed multiple times before. The JIRA issue is 523, I think. I don't think there is really much debate that this is a must-have for CouchDB. There has even been a patch or two. What is stopping this from happening? There hasn't been much discussion on the topic lately. A status update would be great from anyone who has the power to make this happen or get the ball rolling again. This single feature alone would make so many more things possible.

*2) Return "No Key" And Empty Row When POSTing Keys To Views Instead Of Nothing If No Key Matches View*

The common scenario I have is one where I can't get everything in one query from CouchDB and I am viewing my data in a list format. Say, for example, the category page of a store. I want to display a list of products, and each product has the following related documents that need to be fetched per view row: the product's brand doc, the product's pricing doc, the product's current availability (a reduce view row), and any customer-specific product documents such as the customer part number, customer-specific pricing, etc.
So my product doc looks like this:

    type: Product
    brand: id_for_brand_doc_here
    prices: id_for_prices_doc_here
    attributes: { hash of attributes here }
    categories: [array_of_category_ids_here]

So, at most I can get the product and one other doc per view row using the linked document feature. This means that if I want to display all the information I want in my application, I have to do multiple lookups per product in the list view. This could easily generate hundreds of queries to couch for one page view. Multiply this by several requests coming in at the same time and it starts to become a problem.

Alternatively, I could issue 4 requests to couch for the entire list by POSTing the keys to each view and then zipping the resulting arrays together. (I use Ruby at the application level...) Then I just have to iterate over the new array one time and make no more requests to couch. The reason this fails is that if you issue a POST request to couch with a key that is not in the view you are posting to, CouchDB doesn't respond with anything for that key. That makes the array sizes different and therefore hard to handle in the client without iterating over the array multiple times: once to join the data to its proper row, and once more when displaying the information. If CouchDB gave me back the same number of rows as keys I requested, I could easily join the arrays together in my application and significantly limit the number of queries I am sending to couch.

For example:

Request 1:
    URL: database/_design/Product/_view/product_with_price?include_docs=true
    Keys: ["product_id_1", "product_id_2", "product_id_3"]

Now, if my view had the following:

    function (doc) {
      if (doc.type == "Product" && doc.status == "enabled") {
        // The _id in the value makes include_docs fetch the linked prices doc.
        emit(doc._id, { name: doc.name, _id: doc.prices });
      }
    }

I would get back all 3 products as long as all three were enabled.
But if I set a product to disabled, it won't show up in the view, and couch would therefore return an array of only 2 results, which makes it hard to join the arrays in my application.

Request 2:
    URL: database/_design/Product/_view/by_stock_levels?reduce=true&group=true
    Keys: ["product_id_1", "product_id_2", "product_id_3"]

Side note: I can't combine requests 1 and 2 into one reduce view, even though the key is the same, because I get a reduce overflow error.

Request 3:
    URL: database/_design/Product/_view/by_customer_part_number
    Keys: [["product_id_1", "customer_id"], ["product_id_2", "customer_id"], ["product_id_3", "customer_id"]]

If the customer doesn't have a doc that matches this view, couch won't return an empty row; it just won't return the row at all. Therefore, if the customer had a matching row for products 1 and 3 only, and I just zipped the arrays of returned results together, I would get product 3's doc paired with product 2 in my application. However, if couch returned "no key found" with an empty row, the joining of arrays in my application would still work.

Request 4:
    URL: database/_design/Brand/_view/all
    Keys: ["brand_id_for_product_id_1", "brand_id_for_product_id_2", "brand_id_for_product_id_3"]

Now, if for some reason a brand gets deleted and its ID is still on the product, couch will return an array of rows that does not match the size of the array in my application, and it's conceivable that I could get the wrong brand on the wrong product.

I could of course check that the array sizes match, merge only if they do, and otherwise fall back to making requests on a per-product basis when displaying results. But it seems to me that it would be better if couch gave me feedback that no results match a key I requested. This would save me a ton of work and certainly make working with couch more relaxing.
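
As a stopgap, the results can be joined by key rather than by position, since every row in a view response carries the key it matched. What follows is a minimal sketch in JavaScript (the language of the view functions; the same idea carries over to Ruby), not a definitive implementation. The helper name `joinByKey` and the sample rows are made up for illustration, and it assumes at most one row of interest per requested key.

```javascript
// Hypothetical helper: align the rows of a multi-key view response with the
// keys that were requested, putting null where the view returned no row.
// Assumes each row has a `key` field, as CouchDB view rows do.
function joinByKey(requestedKeys, rows) {
  const byKey = new Map();
  for (const row of rows) {
    // JSON.stringify lets array keys such as ["product_id_1", "customer_id"]
    // be compared by value instead of by object identity.
    const k = JSON.stringify(row.key);
    if (!byKey.has(k)) byKey.set(k, row); // keep the first row seen per key
  }
  return requestedKeys.map((key) => byKey.get(JSON.stringify(key)) || null);
}

// Made-up sample data: the view returned nothing for product_id_2.
const keys = ["product_id_1", "product_id_2", "product_id_3"];
const rows = [
  { id: "product_id_1", key: "product_id_1", value: { name: "Widget" } },
  { id: "product_id_3", key: "product_id_3", value: { name: "Gadget" } },
];

// `aligned` has one entry per requested key, in request order,
// with null standing in for the missing product_id_2 row.
const aligned = joinByKey(keys, rows);
console.log(aligned.map((row) => (row ? row.value.name : null)));
```

This still costs one extra pass over each result set to build the lookup, but it cannot pair a row with the wrong product, and the four aligned arrays can then be zipped safely.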
I could really, really use this feature, and I don't think it would be very much trouble at all to send a row when no key matches, with just something like "key_not_found": null

*3) Return Multiple Linked Documents Per View Row*

I use the linked documents feature all the time. It really helps me cut down on the number of requests I make. But I could cut down even further if I were able to get multiple docs back per row by passing couch an array of the ids I wanted with the row instead of just a single id. So, using the example above in #2, let's say I had this view:

    function (doc) {
      if (doc.type == "Product" && doc.status == "enabled") {
        emit(doc._id, { name: doc.name, _id: doc.prices });
      }
    }

But I also had a brand id stored that I wanted to get in the same row. Let's say I just went like this:

    function (doc) {
      if (doc.type == "Product" && doc.status == "enabled") {
        emit(doc._id, { name: doc.name, docs: [{ _id: doc.prices }, { _id: doc.brand }] });
      }
    }

Couch would respond with docs: [prices_doc, brands_doc] plus my name field from the product doc. I could get almost everything I want in one query. I know I can call emit multiple times, but again this just makes everything so much harder in my application because I don't know whether there is a brand for every product or not. It essentially forces me to loop through the array multiple times. I could also do collation with reduce, but then I consistently run into reduce overflow errors, as this is not what reduce is really designed to handle. There may be some reason why this isn't possible, but I don't know it, and I KNOW it would be useful from a user's perspective. Combine this with the feature in #2 and I could get everything I want from couch in 2 requests per page view, where it currently takes me 100 requests for 25 products.

*Final Thoughts*

Like I said in the beginning, I don't know if some of these are possible or not, but I know that they would make my life as a user of CouchDB much more relaxing.
I would sincerely appreciate it if anyone could give feedback on the feasibility of each and on what we have to do to get moving on these. I appreciate everyone who took the time to read this long-ass email. I wanted to be clear. If anyone has any other suggestions, please feel free to contact me.

Thanks,

James Hayton