From: Chetan Mehrotra
To: oak-dev@jackrabbit.apache.org
Date: Wed, 30 Oct 2013 12:20:57 +0530
Subject: Strategies around storing blobs in Mongo

Hi,

Currently we store blobs by breaking them into small chunks and storing those chunks in MongoDB as part of the blobs collection. This approach can cause issues because Mongo maintains a global exclusive write lock at the per-database level [1]. So even writing multiple small chunks of, say, 2 MB each would lead to write lock contention.

Mongo also provides GridFS [2]. However, it uses a strategy similar to the one we currently use, and the support is built into the driver. For the server, they are just collection entries.

So to minimize write lock contention for use cases where big assets are stored in Oak, we can opt for the following strategies:

1. Store the blobs collection in a different database. As Mongo write locks [1] are taken at the per-database level, storing the blobs in a different database would allow reads and writes of node data (the majority use case) to continue.

2. For more asset/binary-heavy use cases, use a separate database server to serve the binaries.

3. Bring back the JR2 DataStore implementation and save only the metadata related to binaries in Mongo. We already have an S3-based implementation there, and it would continue to work with Oak as well.

Chetan Mehrotra

[1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
[2] http://docs.mongodb.org/manual/core/gridfs/
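
To make the chunking approach and strategy 1 from the message above a bit more concrete, here is a minimal Python sketch. It is not Oak's actual implementation: the class and function names are hypothetical, two in-memory dicts stand in for two separate Mongo databases, and using a SHA-256 digest as the chunk id is an assumption made only for illustration. The 2 MB chunk size is the example figure from the message.

```python
import hashlib

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB, the example chunk size from the message


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield consecutive slices of data, each at most chunk_size bytes."""
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


class InMemoryStore:
    """Hypothetical stand-in for a deployment that uses two databases."""

    def __init__(self):
        # Each dict models one database. Per the message, Mongo takes write
        # locks per database, so chunk writes would not block node writes.
        self.node_db = {}  # node path -> ordered list of chunk ids
        self.blob_db = {}  # chunk id -> chunk bytes

    def write_blob(self, path: str, data: bytes):
        """Store data as chunks in blob_db and record their ids in node_db."""
        chunk_ids = []
        for chunk in split_into_chunks(data):
            # Assumed id scheme: content hash of the chunk.
            chunk_id = hashlib.sha256(chunk).hexdigest()
            self.blob_db[chunk_id] = chunk
            chunk_ids.append(chunk_id)
        self.node_db[path] = chunk_ids
        return chunk_ids

    def read_blob(self, path: str) -> bytes:
        """Reassemble a blob from its chunks, in the recorded order."""
        return b"".join(self.blob_db[cid] for cid in self.node_db[path])
```

In this sketch, only the small list of chunk ids lands in the node database, while the bulky chunk payloads go to the blob database. With a real driver, the two dicts would presumably become collections in two different databases (or, for strategy 2, on two different servers).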