Return-Path: Delivered-To: apmail-jackrabbit-dev-archive@www.apache.org Received: (qmail 85588 invoked from network); 29 Sep 2010 14:12:00 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 29 Sep 2010 14:12:00 -0000 Received: (qmail 2393 invoked by uid 500); 29 Sep 2010 14:12:00 -0000 Delivered-To: apmail-jackrabbit-dev-archive@jackrabbit.apache.org Received: (qmail 1183 invoked by uid 500); 29 Sep 2010 14:11:58 -0000 Mailing-List: contact dev-help@jackrabbit.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@jackrabbit.apache.org Delivered-To: mailing list dev@jackrabbit.apache.org Received: (qmail 1173 invoked by uid 99); 29 Sep 2010 14:11:57 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 29 Sep 2010 14:11:57 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 29 Sep 2010 14:11:55 +0000 Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id o8TEBXLA019553 for ; Wed, 29 Sep 2010 14:11:33 GMT Message-ID: <2907480.462761285769493431.JavaMail.jira@thor> Date: Wed, 29 Sep 2010 10:11:33 -0400 (EDT) From: "Marcel Reutegger (JIRA)" To: dev@jackrabbit.apache.org Subject: [jira] Commented: (JCR-2760) Use hash codes instead of sequence numbers for string indexes In-Reply-To: <30906963.445381285690113272.JavaMail.jira@thor> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/JCR-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916157#action_12916157 ] Marcel Reutegger commented on JCR-2760: --------------------------------------- +1 looks good to me. > Use hash codes instead of sequence numbers for string indexes > ------------------------------------------------------------- > > Key: JCR-2760 > URL: https://issues.apache.org/jira/browse/JCR-2760 > Project: Jackrabbit Content Repository > Issue Type: Improvement > Components: jackrabbit-core > Reporter: Jukka Zitting > Priority: Minor > Attachments: 0001-JCR-2760-Use-hash-codes-instead-of-sequence-numbers.patch > > > We use index numbers instead of namespace URIs or other strings in many places. The two-way mapping between namespace URIs and index numbers is by default stored in the repository-global ns_idx.properties file, and the index numbers are allocated using a linear sequence. The problem with this approach is that two repositories will easily end up with different string index mappings, which makes it practically impossible to make low-level copies of workspace content across repositories. > The ultimate solution for this problem would be to store the namespace URIs closer to the stored content, ideally as an implementation detail of a persistence manager. > An easier short-term solution would be to decrease the chances of two repositories having different string index mappings. A simple (and backwards-compatible) way to do this is to use the hash code of a namespace URI as the basis of allocating a new index number. Hash collisions are fairly unlikely, and can be handled by incrementing the intial hash code until the collision is avoided. In the common case of no collisions (with a uniform hash function the chance of a collision is less than 1% even with tousands of registered namespaces) this solution allows workspaces to be copied between repositories without worrying about the namespace index mappings. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.