couchdb-user mailing list archives

From Sho Fukamachi <sho.fukama...@gmail.com>
Subject Re: drill into a doc with a GET?
Date Thu, 08 Jan 2009 03:13:25 GMT

On 08/01/2009, at 1:24 PM, Paul Davis wrote:

> [..]
> function(doc)
> {
>   for (var field in doc)
>   {
>     emit([field, doc._id], doc[field]);
>   }
> }

I call that an "exploded index" and worry somewhat about its storage  
usage.

Two concerns:

- you'd be needlessly re-storing the large data that the OP wanted to  
avoid transferring. Presumably it's big. You might be able to manually  
exclude it if it always has the same name, of course.
- if there are a lot of records with a lot of small fields, index  
overhead might double or even triple the database size
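
As a minimal sketch of the manual exclusion mentioned in the first point, the map function below skips one named field. The field name is taken from the example document in this thread, and the `rows` array and `emit` stub are stand-ins so the sketch can run outside CouchDB; they are assumptions, not part of Paul's view.

```javascript
// Stand-in for the emit() that CouchDB's view server provides; only needed
// to run this sketch outside CouchDB.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Assumed name of the large field, borrowed from the example doc in this thread.
var LARGE_FIELD = "big-ass-prop";

// Same "exploded index" as Paul's function, minus the one large field.
var explodedMap = function (doc) {
  for (var field in doc) {
    if (field === LARGE_FIELD) {
      continue; // do not copy the large value into the index
    }
    emit([field, doc._id], doc[field]);
  }
};

explodedMap({
  _id: "a",
  _rev: "123",
  foo: { bar: 1 },
  "big-ass-prop": "huge amount of stuff"
});
```

This only helps if the large field always has the same name; every other field still gets its own index row.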

An alternative might be to reverse that approach and write a view  
which returns all the fields except the large entry (if known) you're  
trying to avoid transferring. That way, you'd avoid having to re-store  
those large fields in the index as well.
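
A rough sketch of that reverse approach, assuming the large field is known by name (again borrowed from the example document in this thread). It emits one row per document, keyed by `_id`, whose value is the document without the large field. The `rows` array and `emit` stub are stand-ins for running it outside CouchDB.

```javascript
// Stand-in for the emit() that CouchDB's view server provides; only needed
// to run this sketch outside CouchDB.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Assumed name of the field to leave out of the index.
var LARGE_FIELD = "big-ass-prop";

// One row per document: everything except the large field.
var trimmedMap = function (doc) {
  var slim = {};
  for (var field in doc) {
    if (field !== LARGE_FIELD) {
      slim[field] = doc[field];
    }
  }
  emit(doc._id, slim);
};

trimmedMap({
  _id: "a",
  _rev: "123",
  foo: { bar: 1 },
  "big-ass-prop": "huge amount of stuff"
});
```

The trade-off is that the small fields are still stored twice (once in the document, once in the index), but there is one index row per document rather than one per field.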

Storage is cheap*, but obviously it would be bad practice to  
needlessly double (or worse) the database size.

I have often wondered about the exact overhead of a row in a view index.  
Obviously, if it's more than a few bytes, it's going to be a factor to  
consider when contemplating view index strategies which generate an  
awful lot of index rows. If there are a large number of fields with a  
small amount of data in each, and a large number of documents, it is  
quite plausible the "exploded index" could be several times the  
original size of the data.
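
To make that reasoning concrete, here is a back-of-the-envelope sketch. It compares only the JSON text of the emitted keys and values against the JSON text of the document; it says nothing about CouchDB's real per-row overhead on disk, which is the open question here. The function name and the sample document are hypothetical.

```javascript
// Rough payload estimate only: counts JSON characters of keys and values,
// not CouchDB's actual per-row storage overhead, which this sketch cannot know.
function explodedPayloadRatio(doc) {
  var docSize = JSON.stringify(doc).length;
  var rowCount = 0;
  var rowsSize = 0;
  for (var field in doc) {
    // One row per field, mirroring emit([field, doc._id], doc[field]).
    rowsSize += JSON.stringify([field, doc._id]).length +
                JSON.stringify(doc[field]).length;
    rowCount += 1;
  }
  return { rowCount: rowCount, ratio: rowsSize / docSize };
}

// Hypothetical document with a few small fields.
var estimate = explodedPayloadRatio({ _id: "doc-0001", a: 1, b: 2, c: 3 });
console.log(estimate.rowCount, estimate.ratio.toFixed(2));
```

Because every row repeats the document id in its key, the ratio grows as fields get smaller relative to the id, even before any index bookkeeping is counted.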

Anyone with inside knowledge want to chip in on that? What would be  
the approximate overhead, per-entry, of an exploded view index as  
described by Paul? Or maybe I should just test it, since I've been  
wondering about that for a while ...


Sho

* good storage is not actually cheap



> Then to access a specific property:
>
> http://127.0.0.1:5984/db_name/_view/view_name/by_property?key=["foo", "docid1"]
>
> HTH,
> Paul Davis
>
> On Wed, Jan 7, 2009 at 4:18 PM, Robert Koberg <rob@koberg.com> wrote:
>> Hi,
>>
>> first, couchdb is just beautiful! :) (using 0.8.1-incubating from  
>> MacPorts)
>>
>> I am very new, and have read the available docs and several blog  
>> posts.
>>
>> Can you drill into a doc with a simple GET?
>>
>> Say I have a doc like:
>>
>> {"_id": "a", "_rev": "123", "foo":{"bar": 1}, "big-ass-prop": "huge amount of stuff"}
>>
>> Ideally I would like to be able to call something like:
>>
>> http://127.0.0.1:5984/mydb/a/foo
>>
>> to return {"bar":1} and avoid downloading "big-ass-prop"
>>
>> Is this or something like it possible?
>>
>> (I realize "foo" is a 'sibling' of the _id in the document, but it is
>> probably treated more like a parent in the DB?)
>>
>> If not possible, is it possible to create some kind of default
>> action/filter/? that does something like the above? That is, reads the
>> request uri, recognizes it is a document and that there is extra path info
>> which should be used to resolve a property.
>>
>> thanks,
>> -Rob
>>

