couchdb-dev mailing list archives

From "Alexander Shorin (JIRA)" <>
Subject [jira] [Created] (COUCHDB-2338) Reproduceable document revision hash calculation
Date Mon, 22 Sep 2014 14:48:35 GMT
Alexander Shorin created COUCHDB-2338:

             Summary: Reproduceable document revision hash calculation
                 Key: COUCHDB-2338
             Project: CouchDB
          Issue Type: Improvement
      Security Level: public (Regular issues)
          Components: Database Core
            Reporter: Alexander Shorin

The current document revision hash implementation is very Erlang-specific:
        atts=Atts,deleted=Deleted}) ->
    case [{N, T, M} || #att{name=N,type=T,md5=M} <- Atts, M =/= <<>>] of
    Atts2 when length(Atts) =/= length(Atts2) ->
        % We must have old style non-md5 attachments
    Atts2 ->
        OldRev = case OldRevs of [] -> 0; [OldRev0|_] -> OldRev0 end,
        couch_util:md5(term_to_binary([Deleted, OldStart, OldRev, Body, Atts2]))

Everything in the code above is trivial to reproduce in any programming language except the
{{term_to_binary}} call: to get it right you need to dive deeper into Erlang. I have nothing
against that, Erlang is cool, but this implementation detail turns reproducing a document
revision into a nontrivial, complex operation.
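
To illustrate what "diving deeper into Erlang" means in practice, here is a minimal Python sketch of the part of the Erlang external term format that the hash input appears to need. It is not CouchDB code. The names Atom, encode_term, term_to_binary and revision_hash are hypothetical, and the sketch assumes that atoms are written with the classic ATOM_EXT tag (newer Erlang/OTP releases may emit a different atom tag), that the document body arrives in EJSON shape (objects as a 1-tuple holding a list of {Key, Value} pairs, strings as binaries), and that no floats or bignums occur.

```python
import hashlib
import struct

# Tag bytes from the Erlang external term format.
VERSION = 131
SMALL_INTEGER_EXT = 97
INTEGER_EXT = 98
ATOM_EXT = 100
SMALL_TUPLE_EXT = 104
NIL_EXT = 106
STRING_EXT = 107
LIST_EXT = 108
BINARY_EXT = 109


class Atom(str):
    """Marks a Python string that should be encoded as an Erlang atom."""


def _is_byte(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 255


def encode_term(term):
    """Encode one term without the leading version byte."""
    if isinstance(term, bool):
        # Erlang booleans are just the atoms true and false.
        return encode_term(Atom("true" if term else "false"))
    if isinstance(term, Atom):
        name = term.encode("latin-1")
        return bytes([ATOM_EXT]) + struct.pack(">H", len(name)) + name
    if isinstance(term, int):
        if 0 <= term <= 255:
            return bytes([SMALL_INTEGER_EXT, term])
        if -(1 << 31) <= term < (1 << 31):
            return bytes([INTEGER_EXT]) + struct.pack(">i", term)
        raise ValueError("bignums are out of scope for this sketch")
    if isinstance(term, bytes):
        return bytes([BINARY_EXT]) + struct.pack(">I", len(term)) + term
    if isinstance(term, tuple):
        if len(term) > 255:
            raise ValueError("large tuples are out of scope for this sketch")
        body = b"".join(encode_term(item) for item in term)
        return bytes([SMALL_TUPLE_EXT, len(term)]) + body
    if isinstance(term, list):
        if not term:
            return bytes([NIL_EXT])
        if len(term) <= 65535 and all(_is_byte(item) for item in term):
            # Erlang packs lists of bytes as STRING_EXT, an easy detail to miss.
            return bytes([STRING_EXT]) + struct.pack(">H", len(term)) + bytes(term)
        body = b"".join(encode_term(item) for item in term)
        # A proper list ends with an empty-list tail.
        return bytes([LIST_EXT]) + struct.pack(">I", len(term)) + body + bytes([NIL_EXT])
    raise TypeError("unsupported term: %r" % (term,))


def term_to_binary(term):
    """Prefix the encoded term with the format version byte."""
    return bytes([VERSION]) + encode_term(term)


def revision_hash(deleted, old_start, old_rev, body, atts):
    """MD5 over the same five-element list the Erlang snippet hashes, as hex."""
    payload = term_to_binary([deleted, old_start, old_rev, body, atts])
    return hashlib.md5(payload).hexdigest()


# Hypothetical input: a new, empty, non-deleted document without attachments.
# The empty JSON object is assumed to be the EJSON term {[]}, here ([],).
print(revision_hash(False, 0, 0, ([],), []))
```

Even this small subset has to special-case booleans, byte lists and list tails, which is exactly the kind of knowledge a non-Erlang implementation should not need just to compute a revision.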

Rationale: you want to build CouchDB-compatible storage on a non-Erlang technology stack
that will "sync" with CouchDB without worrying about mismatched revisions for the same
content with the same modification history made in different "compatible" storages.

P.S. Oh, yes, if you update attachments (add/delete), the revision becomes completely random.
Moreover, if you just update an attachment for a document, there is some specific detail about
the revision calculation that I don't recall now, but it would be easy to spot by looking at
what the specified function takes on call.

P.P.S. via

This message was sent by Atlassian JIRA
