From: "Alexander Shorin (JIRA)"
To: dev@couchdb.apache.org
Date: Mon, 22 Sep 2014 16:00:35 +0000 (UTC)
Subject: [jira] [Commented] (COUCHDB-2338) Reproduceable document revision hash calculation

    [ https://issues.apache.org/jira/browse/COUCHDB-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143338#comment-14143338 ]

Alexander Shorin commented on COUCHDB-2338:
-------------------------------------------

> The "completely random" assertion

True, I didn't read the case clause carefully.

> Reproduceable document revision hash calculation
> ------------------------------------------------
>
>                 Key: COUCHDB-2338
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2338
>             Project: CouchDB
>          Issue Type: Improvement
>      Security Level: public(Regular issues)
>          Components: Database Core
>            Reporter: Alexander Shorin
>
> The current document revision hash implementation is very Erlang-specific:
> {code}
> new_revid(#doc{body=Body, revs={OldStart, OldRevs},
>                atts=Atts, deleted=Deleted}) ->
>     case [{N, T, M} || #att{name=N, type=T, md5=M} <- Atts, M =/= <<>>] of
>         Atts2 when length(Atts) =/= length(Atts2) ->
>             % We must have old style non-md5 attachments
>             ?l2b(integer_to_list(couch_util:rand32()));
>         Atts2 ->
>             OldRev = case OldRevs of [] -> 0; [OldRev0|_] -> OldRev0 end,
>             couch_util:md5(term_to_binary([Deleted, OldStart, OldRev, Body, Atts2]))
>     end.
> {code}
> Every piece of the code above is trivial to implement in any programming language except the {{term_to_binary}} function: to get it right you need to dive deeper into Erlang. I have nothing against that, Erlang is cool, but this implementation detail turns reproducing a document revision into a non-trivial, complex operation.
> Rationale: you want to build CouchDB-compatible storage on a non-Erlang technology stack that will "sync" with CouchDB without worrying about mismatched revisions for the same content with the same modification history made in different "compatible" storages.
> P.S. Oh, yes, if you update attachments (add/del), the revision becomes completely random. Moreover, if you just update an attachment for a document, there is something specific about the revision calculation that I don't recall now, but it should be easy to notice by looking at what the specified function takes on call.
> P.P.S. via https://twitter.com/janl/status/514019496110333952
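
To illustrate why {{term_to_binary}} is the hard part for a non-Erlang implementation, here is a minimal Python sketch of what a port would have to re-create. It is not CouchDB code. The names Atom, encode_term, term_to_binary and new_revid are hypothetical Python helpers, and the tag bytes follow the Erlang external term format as I understand it. Only a subset is covered (atoms, small and 32-bit integers, binaries, small tuples, lists); floats, bignums and large tuples are deliberately left out, and atoms are assumed to be emitted as ATOM_EXT, which may differ between Erlang releases and encoding options.

```python
import hashlib
import struct

# Tag bytes of the Erlang external term format (subset, as assumed here).
VERSION = 131
SMALL_INTEGER_EXT = 97
INTEGER_EXT = 98
ATOM_EXT = 100
SMALL_TUPLE_EXT = 104
NIL_EXT = 106
STRING_EXT = 107
LIST_EXT = 108
BINARY_EXT = 109


class Atom(str):
    """Marks a Python string as an Erlang atom, e.g. Atom("false")."""


def encode_term(term):
    """Encode one term without the leading version byte."""
    if isinstance(term, Atom):
        name = term.encode("latin-1")
        return bytes([ATOM_EXT]) + struct.pack(">H", len(name)) + name
    if isinstance(term, bool):
        raise TypeError("use Atom('true') / Atom('false') for booleans")
    if isinstance(term, int):
        if 0 <= term <= 255:
            return bytes([SMALL_INTEGER_EXT, term])
        if -2**31 <= term < 2**31:
            return bytes([INTEGER_EXT]) + struct.pack(">i", term)
        raise NotImplementedError("bignums are outside this sketch")
    if isinstance(term, bytes):  # Erlang binary, e.g. <<"abc">>
        return bytes([BINARY_EXT]) + struct.pack(">I", len(term)) + term
    if isinstance(term, tuple):
        if len(term) > 255:
            raise NotImplementedError("large tuples are outside this sketch")
        body = b"".join(encode_term(t) for t in term)
        return bytes([SMALL_TUPLE_EXT, len(term)]) + body
    if isinstance(term, list):
        if not term:
            return bytes([NIL_EXT])
        # Lists made only of bytes are written in the compact "string" form.
        if len(term) <= 65535 and all(
            type(x) is int and 0 <= x <= 255 for x in term
        ):
            return bytes([STRING_EXT]) + struct.pack(">H", len(term)) + bytes(term)
        body = b"".join(encode_term(t) for t in term)
        return bytes([LIST_EXT]) + struct.pack(">I", len(term)) + body + bytes([NIL_EXT])
    raise NotImplementedError("unsupported term: %r" % (term,))


def term_to_binary(term):
    """Prefix the encoded term with the format version byte."""
    return bytes([VERSION]) + encode_term(term)


def new_revid(deleted, old_start, old_rev, body, atts):
    """Mirror the md5 branch of new_revid/1 quoted above (no legacy attachments)."""
    flag = Atom("true") if deleted else Atom("false")
    return hashlib.md5(term_to_binary([flag, old_start, old_rev, body, atts])).digest()


if __name__ == "__main__":
    # A new, empty document: body is {[]}, no previous revision, no attachments.
    print("1-" + new_revid(False, 0, 0, ([],), []).hex())
```

Even this subset has to know about the string shortcut for byte lists and about integer width boundaries, which is the kind of Erlang-specific detail the issue complains about. A reasonable sanity check is to compare the printed value with the revision a real CouchDB server assigns to an empty document.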