couchdb-user mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Bulk Load
Date Sun, 14 Sep 2008 09:58:01 GMT
Hi Ronny,
On Sep 14, 2008, at 11:45, Ronny Hanssen wrote:
> Or have I seriously missed out on some vital information? Because, based on
> the above, I still feel very confused about why we cannot use the built-in
> rev-control mechanism.

You correctly identify that adding revision control to a single-node instance
of CouchDB is not that hard (a quick search through the archives would have
told you that, too :-). Making all of it work in a distributed environment,
with replication conflict detection and all, is mighty hard. If you can come
up with a nice and clean solution that makes proper revision control work with
CouchDB's replication, including all the weird edge cases I don't even know
about (aren't I arrogant this morning? :), we are happy to hear about it.

Cheers
Jan
--



>
>
> ~Ronny
>
> 2008/9/14 Jeremy Wall <jwall@google.com>
>
>> Two reasons.
>> * First, as I understand it, the revisions are not changes between
>>   documents. They are actual full copies of the document.
>> * Second, revisions get blown away when doing a database compact,
>>   something you will more than likely want to do, since it eats up
>>   database space fairly quickly (see above for the reason why).
>>
>> That said, there is nothing preventing you from storing revisions in
>> CouchDB. You could store a changeset for each document revision in a
>> separate revision document that accompanies your main document. Designing
>> views that use these documents to show a revision history for your
>> document would be really easy.
>>
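The changeset idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the thread: the `_id` scheme, the `type`, `doc_id`, `seq` and `changes` field names, and the sample bond data are all made up for the example.

```python
def make_changeset(old, new):
    """Return only what differs between two versions of a document."""
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    removed = sorted(k for k in old if k not in new)
    return {"set": changed, "unset": removed}


def revision_doc(main_id, seq, changeset):
    """Wrap a changeset in a separate document that points at the main one."""
    return {
        "_id": f"{main_id}:rev:{seq:06d}",  # hypothetical ID scheme
        "type": "revision",                 # hypothetical field a view could filter on
        "doc_id": main_id,
        "seq": seq,
        "changes": changeset,
    }


def apply_changeset(doc, changeset):
    """Rebuild the next version of a document from the previous one."""
    result = {k: v for k, v in doc.items() if k not in changeset["unset"]}
    result.update(changeset["set"])
    return result


# Made-up sample data: two versions of the same bond.
v1 = {"name": "Example Bond", "coupon": 4.5, "rating": "AA"}
v2 = {"name": "Example Bond", "coupon": 4.5, "rating": "A", "callable": True}

rev = revision_doc("bond-1", 1, make_changeset(v1, v2))
```

Only the changed fields end up in the revision document, and replaying the stored changesets in `seq` order over the first version reconstructs any later version.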
>> I suppose you could use the revisions that CouchDB stores, but that
>> wouldn't be very efficient, since each one is a complete copy of the
>> document. And you couldn't depend on that "feature" not changing
>> behaviour on you in later versions, since it's not intended for revision
>> history as a feature.
>>
>> On Sat, Sep 13, 2008 at 7:24 PM, Ronny Hanssen <super.ronny@gmail.com> wrote:
>>
>>> Why is the revision control system in couchdb inadequate for, well,
>>> revision control? I thought that this feature indeed was a feature, not
>>> just an internal mechanism for resolving conflicts?
>>> Ronny
>>>
>>> 2008/9/14 Calum Miller <calum_miller@yahoo.com>
>>>
>>>> Hi Chris,
>>>>
>>>> Many thanks for your prompt response.
>>>>
>>>> Storing a complete new version of each bond/instrument every day seems
>>>> a tad excessive. You can imagine how fast the database will grow over
>>>> time if a unique version of each instrument must be saved, rather than
>>>> just the individual changes. This must be a common pattern, not confined
>>>> to investment banking. Any ideas how this pattern can be accommodated
>>>> within CouchDB?
>>>>
>>>> Calum Miller
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Chris Anderson wrote:
>>>>
>>>>> Calum,
>>>>>
>>>>> CouchDB should be easily able to handle this load.
>>>>>
>>>>> Please note that the built-in revision system is not designed for
>>>>> document history. Its sole purpose is to manage conflicting documents
>>>>> that result from edits done in separate copies of the DB, which are
>>>>> subsequently replicated into a single DB.
>>>>>
>>>>> If you allow CouchDB to create a new document for each daily import of
>>>>> each security, and create a view which makes these documents available
>>>>> by security and date, you should be able to access securities history
>>>>> fairly simply.
>>>>>
>>>>> Chris
>>>>>
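A rough sketch of the layout Chris describes, under stated assumptions: one document per security per day, a view keyed by security and date, and batches shaped for CouchDB's `_bulk_docs` endpoint, which accepts a JSON body of the form `{"docs": [...]}`. The design document name, view name, field names (`type`, `security_id`, `as_of`), ID scheme and batch size are all hypothetical, and nothing here talks to a server.

```python
import json
from itertools import islice

# JavaScript map function for the view; field names are assumptions.
MAP_BY_SECURITY_AND_DATE = """
function (doc) {
  if (doc.type === "bond_snapshot") {
    emit([doc.security_id, doc.as_of], null);
  }
}
"""

# Hypothetical design document holding that view.
DESIGN_DOC = {
    "_id": "_design/bonds",
    "views": {"by_security_and_date": {"map": MAP_BY_SECURITY_AND_DATE}},
}


def snapshot_doc(security_id, as_of, fields):
    """Build one document per security per import date."""
    doc = dict(fields)
    doc.update({
        "_id": f"{security_id}:{as_of}",  # hypothetical ID scheme
        "type": "bond_snapshot",
        "security_id": security_id,
        "as_of": as_of,                   # ISO date string, so keys sort by date
    })
    return doc


def bulk_payloads(docs, batch_size=1000):
    """Yield JSON request bodies, each suitable for one POST to /<db>/_bulk_docs."""
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield json.dumps({"docs": batch})


# Made-up sample: five bonds imported on one day, sent in batches of two.
docs = [snapshot_doc(f"bond-{i}", "2008-09-13", {"coupon": 4.5}) for i in range(5)]
payloads = list(bulk_payloads(docs, batch_size=2))
```

With keys of the form `[security_id, as_of]`, the history of one security can be read from the view with `startkey=["bond-1"]` and `endkey=["bond-1", {}]`.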
>>>>> On Sat, Sep 13, 2008 at 12:31 PM, Calum Miller <calum_miller@yahoo.com> wrote:
>>>>>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to evaluate CouchDB for use within investment banking (yes,
>>>>>> some of these banks still exist). I want to load 500,000 bonds into the
>>>>>> database, with each bond containing around 100 fields. I would be
>>>>>> looking to bulk load a similar number of these bonds every day whilst
>>>>>> maintaining a history via the revision feature. Are there any bulk load
>>>>>> features available for CouchDB, and any tips on how to manage regular
>>>>>> loads of this volume?
>>>>>>
>>>>>> Many thanks in advance and best of luck with this project.
>>>>>>
>>>>>> Calum Miller
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>

