incubator-couchdb-user mailing list archives

From Adam Kocoloski <>
Subject Re: Best practice for view updates across large data sets
Date Thu, 29 Oct 2009 02:24:11 GMT
Hi Seggy, the wiki could maybe be reworded a bit.  There's definitely  
only one .index file on disk for each design document, and when any  
view in that document changes all of the views are rebuilt.

I think what the wiki might also have been trying to get across is  
that, if two views in a design document use a byte-identical map  
function, those views will share the same map results (the results  
won't be duplicated). Internally CouchDB builds a dictionary keyed on  
the source of the map function that keeps a list of reduce functions  
operating on that map output.
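
To make that grouping concrete, here is a minimal Python sketch of the idea. It is not CouchDB's actual (Erlang) code: the function name `group_views_by_map` and the sample view definitions are hypothetical, and it only models "a dictionary keyed on the map source, holding the list of reduce functions that consume that map output".

```python
def group_views_by_map(views):
    """Group view definitions by the exact source text of their map function.

    `views` mirrors the "views" member of a design document:
    {view_name: {"map": <source>, "reduce": <source, optional>}}.
    Returns {map_source: [(view_name, reduce_source_or_None), ...]}.
    """
    groups = {}
    for name, funcs in views.items():
        # Only byte-identical map sources end up under the same key.
        groups.setdefault(funcs["map"], []).append((name, funcs.get("reduce")))
    return groups


# Hypothetical design document contents, for illustration only.
views = {
    "by_type": {"map": "function(doc) { emit(doc.type, 1); }"},
    "total_by_type": {
        "map": "function(doc) { emit(doc.type, 1); }",
        "reduce": "function(keys, values) { return sum(values); }",
    },
    "by_date": {"map": "function(doc) { emit(doc.date, null); }"},
}

groups = group_views_by_map(views)
for map_src, consumers in groups.items():
    print(len(consumers), "view(s) share:", map_src)
```

In this model `by_type` and `total_by_type` share one set of map results, while `by_date` gets its own. A single whitespace difference between two map functions would put them under different keys.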

Cheers, Adam

On Oct 28, 2009, at 6:44 PM, Seggy Umboh wrote:

> Hmmm ...... I just found that the View API wiki page says otherwise:
> "Each view function is stored according to a hash of their byte
> representation, so it is important that a function does not load any
> additional code, changing its behavior without changing its
> byte-string."
> I hope the wiki is correct, because that sounds more desirable, but if
> it's not, I'd be happy to fix the wiki.
> On Wed, Oct 28, 2009 at 12:40 PM, Adam Kocoloski <> wrote:
>> Hi Seggy, it's per design document.  Every time you change any view in
>> a design doc, all the views in that document are reindexed.  Best,
>> Adam
>> On Oct 28, 2009, at 3:09 PM, Seggy Umboh wrote:
>>> That's interesting. Is the hash per design document, or per view?
>>> Does it mean that when I change one view in my _design/development,
>>> only that view is reindexed?
>>> On Tue, Oct 27, 2009 at 7:53 PM, Adam Kocoloski <> wrote:
>>>> On Oct 27, 2009, at 8:25 PM, Larry wrote:
>>>>> As I had expected, I'm starting to experience lengthy re-indexing
>>>>> times when changing/updating our views. We have just over 300K
>>>>> worth of documents currently and it will be growing. One of our
>>>>> views takes about 20 minutes or so to index when installed. This
>>>>> locks up key aspects of our application and we would like to find
>>>>> a way to keep the application continuously functional.
>>>>> I know that our view scripts can certainly be optimized and that's
>>>>> something we're working on as our knowledge and experience with
>>>>> CouchDB grows. However, given where we are now, I was wondering if
>>>>> there is a "best practice" or any tips that users may have on
>>>>> updating views across large data sets.
>>>>> Thanks for the help!
>>>>> larry
>>>> Hi Larry, one trick you may find useful in 0.10 is to take advantage
>>>> of the fact that the view index files are identified by the hash of
>>>> their contents. This means that you can have your _design/production
>>>> document and your _design/development document, and when you're
>>>> satisfied with the dev version of your app and you want to deploy
>>>> it, you can just update _design/production to be identical to
>>>> _design/development -- your production system will automatically
>>>> use the prebuilt indexes from _design/development with zero
>>>> downtime.  You can even use HTTP COPY to do this if you like.
>>>> Cheers,
>>>> Adam
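
The HTTP COPY step mentioned in the quoted message could be sketched in Python roughly as follows. This assumes CouchDB's COPY convention of naming the target document in a `Destination` header, with `?rev=` appended when the target already exists. The helper name `build_copy_request`, the server URL, the database name `mydb`, and the revision `1-abc` are all hypothetical placeholders.

```python
import urllib.request


def build_copy_request(base_url, db, source_id, dest_id, dest_rev=None):
    """Build (but do not send) a COPY request for a CouchDB document.

    When the destination document already exists, its current revision
    must be supplied so the copy overwrites it instead of conflicting.
    """
    destination = dest_id
    if dest_rev is not None:
        destination += "?rev=" + dest_rev
    url = "{}/{}/{}".format(base_url.rstrip("/"), db, source_id)
    return urllib.request.Request(
        url, method="COPY", headers={"Destination": destination}
    )


# Placeholder values; substitute your own server, database and revision.
req = build_copy_request(
    "http://localhost:5984",
    "mydb",
    "_design/development",
    "_design/production",
    dest_rev="1-abc",
)
print(req.get_method(), req.full_url, req.get_header("Destination"))
# To actually send it: urllib.request.urlopen(req)
```

Before switching production over, it is worth querying a view in `_design/development` once so its index is fully built. Otherwise the first production request after the copy would still have to wait for indexing.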
