couchdb-dev mailing list archives

From "Alexander Shorin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-2338) Reproduceable document revision hash calculation
Date Mon, 22 Sep 2014 16:00:35 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143338#comment-14143338 ]

Alexander Shorin commented on COUCHDB-2338:
-------------------------------------------

> The "completely random" assertion 
True, I didn't read the case clause carefully.

> Reproduceable document revision hash calculation
> ------------------------------------------------
>
>                 Key: COUCHDB-2338
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2338
>             Project: CouchDB
>          Issue Type: Improvement
>      Security Level: public(Regular issues) 
>          Components: Database Core
>            Reporter: Alexander Shorin
>
> Current document revision hash implementation is very Erlang-specific:
> {code}
> new_revid(#doc{body=Body,revs={OldStart,OldRevs},
>         atts=Atts,deleted=Deleted}) ->
>     case [{N, T, M} || #att{name=N,type=T,md5=M} <- Atts, M =/= <<>>] of
>     Atts2 when length(Atts) =/= length(Atts2) ->
>         % We must have old style non-md5 attachments
>         ?l2b(integer_to_list(couch_util:rand32()));
>     Atts2 ->
>         OldRev = case OldRevs of [] -> 0; [OldRev0|_] -> OldRev0 end,
>         couch_util:md5(term_to_binary([Deleted, OldStart, OldRev, Body, Atts2]))
>     end.
> {code}
> Every part of the code above is trivial to port to any programming language except the {{term_to_binary}} call: to get that right you need to dive deeper into Erlang. I have nothing against that, Erlang is cool, but this implementation detail turns the whole idea of reproducing a document revision into a non-trivial, complex operation.
> Rationale: you want to build CouchDB-compatible storage on a non-Erlang technology stack that will "sync" with CouchDB without worrying about mismatched revisions for the same content with the same modification history made in different "compatible" storages.
> P.S. Oh, yes, if you update attachments (add/delete), the revision becomes completely random. Moreover, if you just update an attachment for a document, there is some specific detail in the revision calculation that I don't recall now, but it should be easy to spot by looking at what the function above takes as arguments.
> P.P.S. via https://twitter.com/janl/status/514019496110333952
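
To illustrate why {{term_to_binary}} is the hard part of the quoted function, here is a minimal Python sketch that encodes a small subset of Erlang's external term format and hashes the result the way {{new_revid}} does. It is not CouchDB code and not a complete encoder. The names {{Atom}}, {{encode_term}} and {{new_revid_sketch}} are hypothetical, atoms are assumed to be emitted as ATOM_EXT (tag 100), which newer Erlang/OTP releases may encode differently, and floats, big integers and maps are left out entirely.

```python
import hashlib
import struct


class Atom(str):
    """Hypothetical marker type for Erlang atoms such as true / false."""


def encode_term(term):
    """Encode a small subset of Erlang terms (no leading version byte)."""
    if isinstance(term, Atom):
        name = term.encode("latin-1")
        # ATOM_EXT (100): 2-byte length + name. Assumption: older-style atoms.
        return bytes([100]) + struct.pack(">H", len(name)) + name
    if isinstance(term, bool):  # must be checked before int
        return encode_term(Atom("true" if term else "false"))
    if isinstance(term, int):
        if 0 <= term <= 255:
            return bytes([97, term])                        # SMALL_INTEGER_EXT
        if -(1 << 31) <= term < (1 << 31):
            return bytes([98]) + struct.pack(">i", term)    # INTEGER_EXT
        raise ValueError("big integers are out of scope for this sketch")
    if isinstance(term, bytes):
        # BINARY_EXT (109): 4-byte length + data
        return bytes([109]) + struct.pack(">I", len(term)) + term
    if isinstance(term, tuple):
        if len(term) > 255:
            raise ValueError("large tuples are out of scope for this sketch")
        # SMALL_TUPLE_EXT (104): 1-byte arity + elements
        return bytes([104, len(term)]) + b"".join(encode_term(t) for t in term)
    if isinstance(term, list):
        if not term:
            return bytes([106])                             # NIL_EXT
        if len(term) < 65536 and all(
            type(x) is int and 0 <= x <= 255 for x in term
        ):
            # Gotcha: a list of byte-sized integers becomes STRING_EXT (107).
            return bytes([107]) + struct.pack(">H", len(term)) + bytes(term)
        # LIST_EXT (108): 4-byte length + elements + NIL tail
        body = b"".join(encode_term(t) for t in term)
        return bytes([108]) + struct.pack(">I", len(term)) + body + bytes([106])
    raise TypeError(f"unsupported term: {term!r}")


def new_revid_sketch(deleted, old_start, old_rev, body, atts):
    """MD5 over the encoded [Deleted, OldStart, OldRev, Body, Atts2] list."""
    payload = bytes([131]) + encode_term(
        [deleted, old_start, old_rev, body, atts]
    )
    return hashlib.md5(payload).digest()


if __name__ == "__main__":
    # An empty, non-deleted first revision: Body is the EJSON tuple {[]}.
    print(new_revid_sketch(False, 0, 0, ([],), []).hex())
```

Even this subset shows the kind of detail a port has to match byte for byte: the same list is encoded differently depending on its contents, and the atom encoding depends on the Erlang release, so the digest printed here is only meaningful under the assumptions above.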



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
