Date: Mon, 14 Nov 2011 13:52:47 -0500
Subject: Why MD5 is used for hashes, also about non-deterministic IDs.
From: Alex Besogonov
To: dev@couchdb.apache.org

I'm looking at the CouchDB source code and I have several questions:

1) Why is MD5 used instead of a more secure hash? It is easy to imagine a malicious user deliberately producing a hash collision and causing problems in replication.

2) The ID is not completely deterministic: it depends on the compression_level and compressible_types settings for attachments. Would it make sense to use the MD5 of the original, uncompressed document instead? And while you're at it, it probably makes sense to include the file size in the Atts2 tuple.
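
To illustrate the second question, here is a minimal Python sketch. It is not CouchDB's Erlang code: zlib stands in for whatever compression is applied to attachments, and the function names and the `algorithm` parameter are hypothetical. It shows that a digest taken over the compressed bytes changes with the compression level, while a digest over the original bytes, paired with their length, stays the same.

```python
import hashlib
import zlib


def digest_of_stored_bytes(body, compression_level, algorithm="md5"):
    """Hash the bytes as they would look after compression.

    Assumption: zlib is only a stand-in for the real attachment codec.
    """
    stored = zlib.compress(body, compression_level)
    return hashlib.new(algorithm, stored).hexdigest()


def digest_of_original_bytes(body, algorithm="md5"):
    """Hash the uncompressed bytes and return (digest, length).

    The result does not depend on any compression setting.
    """
    return hashlib.new(algorithm, body).hexdigest(), len(body)


if __name__ == "__main__":
    attachment = b"hello couchdb " * 100

    # Same content, two compression levels: the stored bytes differ,
    # so the digests over them differ too.
    print(digest_of_stored_bytes(attachment, 1))
    print(digest_of_stored_bytes(attachment, 9))

    # Digest and size of the original content are independent of the level.
    print(digest_of_original_bytes(attachment))
```

Passing `algorithm="sha256"` switches both functions to a stronger hash, which is the kind of change the first question asks about.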