couchdb-user mailing list archives

From Robert Johnson <rob...@rowanshire.net>
Subject Re: Atomic update of multiple attachments?
Date Wed, 09 Feb 2011 00:30:57 GMT
Wayne

Glad I could help

Good luck with your app.

Bob

On 8 Feb 2011, at 23:03, Wayne Conrad wrote:

> Bob, This algorithm looks good.
> 
> My documents can have thousands of attachments totalling 10MB or more. A
> rusty neuron in the corner of my mind told me that inline attachments
> aren't a great idea once they get that numerous or large.  I don't know if
> it's being truthful.
> 
> While you were sleeping, I thought about it some more.  For the current use
> case, I don't have to actually prevent partial/incomplete updates--detecting
> them is sufficient.  For that, I can use a very light-weight relative of
> your algorithm:
> 
> store a UUID in "eventual_currency_token"
> update attachment
> update attachment
> update attachment
> ...
> store the same UUID in "currency_token"
> 
> When reading, check that currency_token and eventual_currency_token are
> equal.  If not, the record is incomplete and should not be used.
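
A minimal Python sketch of the token check described above, with documents modelled as plain dicts. No CouchDB calls are made, and the helper names (begin_update, finish_update, is_complete) are hypothetical; only the two field names come from the message.

```python
import uuid


def begin_update(doc):
    """Store a fresh UUID in eventual_currency_token before touching attachments."""
    token = uuid.uuid4().hex
    doc["eventual_currency_token"] = token
    return token


def finish_update(doc, token):
    """Store the same UUID in currency_token once every attachment is written."""
    doc["currency_token"] = token


def is_complete(doc):
    """A document is usable only if both tokens are present and equal."""
    token = doc.get("currency_token")
    return token is not None and token == doc.get("eventual_currency_token")
```

A reader would call is_complete(doc) after fetching the document and discard it when the result is False. In a real application each helper would be followed by a save of the document, which is left out here.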
> 
> Your algorithm will be good when I finally hit the case where I have to do
> better than just detecting a partial update.  Thanks again for your and
> everyone's help.
> 
> Wayne Conrad
> 
> On 02/08/11 08:48, Robert Johnson wrote:
>> I don't think copy does it, because you end up with 2 identical documents
>> including the attachments, so you would need to copy the document and then
>> immediately update the new document by removing the attachments and
>> decrementing the counter for attached documents. As couch will not
>> guarantee that both the copy and update will complete, you could end up
>> with a duplicate document in your database. You have a similar problem with
>> the temp document idea, and in either case you require an identifier that
>> is unique to a document and a version number within the document (other
>> than the couch generated ones) so you can delete the old version with
>> certainty that you are deleting the right one.
>> 
>> The chances of a failed update, and hence a duplicate document, are low if
>> you code carefully, but they exist and you may or may not be happy to take
>> the risk - much will depend on the impact of the existence of a duplicate.
>> 
>> The best secure way I can come up with relies on unique attachment names
>> and works as follows:-
>> 
>> Proceed as per my original suggestion for adding new documents, but in
>> addition have an extra attribute which is in fact an array of attributes,
>> as per the pseudo code below:
>> 
>> Attached_docs[]
>>     Eventual_count
>>     Current_count
>>     Docs[]
>> 
>> When you first load a new document the array will have one element, the
>> eventual and current counts will be the same as the original count
>> attributes I suggested, and the docs array will list the attached documents
>> under the same names as in the system generated _attachments structure.
>> 
>> Now for update, add a new element to the end of the attached_docs array
>> with the eventual count set to the new number of attachments expected, the
>> current count set to zero and the docs array empty. Do not alter the values
>> in eventual_attachment_count or attachment_count_so_far. Load up your new
>> set of attachments, updating only the counts in the new array element, and
>> put the names of the new attachments in the docs array as you load them.
>> 
>> In your display code, you would need to look at the last element in the
>> attached_docs array; if the eventual and current counts are the same, the
>> docs array in that element contains the names of the attachments to show.
>> If the values are different, go to the previous element of the
>> attached_docs array, and so on, until you find an element that qualifies.
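
A rough Python sketch of the update and display steps described above, again with the document as a plain dict and no CouchDB calls. The lower-case field names (attached_docs, eventual_count, current_count, docs) follow the pseudo code; the function names are hypothetical.

```python
def begin_new_set(doc, expected):
    """Append a new, empty element announcing how many attachments are coming."""
    doc.setdefault("attached_docs", []).append(
        {"eventual_count": expected, "current_count": 0, "docs": []}
    )


def record_attachment(doc, name):
    """Note one uploaded attachment in the newest element only."""
    entry = doc["attached_docs"][-1]
    entry["docs"].append(name)
    entry["current_count"] += 1


def visible_attachments(doc):
    """Walk backwards from the last element to the newest complete set."""
    for entry in reversed(doc.get("attached_docs", [])):
        if entry["eventual_count"] == entry["current_count"]:
            return entry["docs"]
    return None  # no complete set has been loaded yet
```

While a new set is partly loaded, visible_attachments keeps returning the previous set's names; it switches to the new names only once the counts in the newest element match.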
>> 
>> I believe this mechanism is rock solid because:
>> 
>> The document will not show up in the view until one full set of
>> attachments is loaded. While a new set is loading, the old set is still
>> visible and you can easily identify which set to show. Once the new set is
>> fully loaded it becomes visible immediately, and there is no possibility
>> of getting a duplicate document. The only requirement is that attachment
>> names are unique within a document, but these are under your direct
>> control. With some careful coding and paranoid checking you are even
>> secure against a failed attachment upload.
>> 
>> That's the best I can come up with.
>> 
>> Bob
>> 
>> Robert Newson<robert.newson@gmail.com>  wrote:
>> 
>>> You could also use the COPY feature. :)
>>> 
>>> On Mon, Feb 7, 2011 at 11:59 PM, Wayne Conrad<wayne@databill.com>  wrote:
>>>> Bob, One of my needs is that requestors can get the most recent "complete"
>>>> set of attachments, even while a new set is being assembled.  I've no sense
>>>> of what it takes to work with a previous version of a document, esp.
>>>> since (as I understand it) replication doesn't transfer old revisions of
>>>> documents.
>>>>  Do you think your idea can be made to work with this need?
>>>> 
>>>> I'm wondering if something can be done that's similar to how we
>>>> create/rename files in Unix.  Can I create a temporary document, load it
>>>> up with attachments, and then rename it?
>>>> 
>>>> Wayne
>>>> 
>>>> On 02/07/11 16:38, Robert Johnson wrote:
>>>>> 
>>>>> Create your document with attributes "eventual_attachment_count" (set
>>>>> this to the expected count) and "attachment_count_so_far" (set this to
>>>>> zero).
>>>>> As you add each attachment, increment "attachment_count_so_far".
>>>>> 
>>>>> Create a view which only emits when "attachment_count_so_far" =
>>>>> "eventual_attachment_count".
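
The filtering rule for that view can be sketched in Python as below. CouchDB views are normally written as JavaScript map functions, so this is only an illustration of the predicate, not view code that CouchDB would run; the function name map_complete is hypothetical.

```python
def map_complete(doc):
    """Yield a (key, value) row only when the attachment set is complete."""
    expected = doc.get("eventual_attachment_count")
    so_far = doc.get("attachment_count_so_far")
    # Documents missing either counter are treated as incomplete.
    if expected is not None and expected == so_far:
        yield doc["_id"], None
```

A document with 2 of 3 attachments loaded produces no rows, so it stays out of the view until the last attachment arrives.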
>>>>> 
>>>>> 
>>>>> For update:-
>>>>> 
>>>>> Remove docs and decrement "attachment_count_so_far"
>>>>> Reset "eventual_attachment_count" if necessary
>>>>> Add new attachments and increment "attachment_count_so_far"
>>>>> 
>>>>> Does this work for you?
>>>>> 
>>>>> Bob
>>>>> 
>>>>> On 7 Feb 2011, at 23:25, Wayne Conrad wrote:
>>>>> 
>>>>>> Is there anything I can do to achieve the illusion of atomic update of a
>>>>>> set of attachments?  Here's the effect I'd like:
>>>>>> 
>>>>>> For create:
>>>>>> 1. Create a document.
>>>>>> 2. Add attachments to it.
>>>>>> 3. Only now do the document and all of its attachments become visible.
>>>>>> 
>>>>>> For update:
>>>>>> 1. Delete all of the document's attachments.
>>>>>> 2. Add a new set of attachments to the document.
>>>>>> 3. Only now does the new set of attachments appear to replace the old.
>>>>>> 
>>>>>> I'm using CouchDB 1.0.2 and CouchRest 1.0.1.  I'm not opposed to cheating
>>>>>> to achieve my goal.  Suggestions of "Did you think of doing
>>>>>> this-other-thing instead?" are also welcome.
>>>>> 
>>>> 
>>>> 
> 

