jackrabbit-oak-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: Strategies around storing blobs in Mongo
Date Wed, 30 Oct 2013 07:55:54 GMT
Hi,

> as Mongo maintains a global exclusive write lock at the database level

I think this is not necessarily a huge problem. As far as I understand, it
limits write concurrency within one shard only, so it does not block
scalability. Open questions are: what is the write throughput for one
shard, does the write lock also block reads (I guess not), and does the
write lock cause high latency for other writes because the binaries are big?


I think it would make sense to have a simple benchmark (concurrent writing
/ reading of binaries), so that we can test which strategy is best, and
possibly play around with different strategies (splitting binaries into
smaller / larger chunks, using different write concerns, using more
shards, ...).
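
A rough sketch of what such a benchmark could look like (my own
illustration, assuming a recent MongoDB Java driver; the database and
collection names, chunk size and thread count are just placeholders to
vary):

    import com.mongodb.client.*;
    import org.bson.Document;
    import org.bson.types.Binary;

    import java.util.Random;
    import java.util.concurrent.*;

    public class BlobWriteBenchmark {

        static final int CHUNK_SIZE = 2 * 1024 * 1024; // 2 MB chunks, as discussed
        static final int CHUNKS_PER_WRITER = 50;       // ~100 MB written per thread
        static final int WRITER_THREADS = 8;

        public static void main(String[] args) throws Exception {
            MongoClient client = MongoClients.create("mongodb://localhost:27017");
            // pointing this at a second database would test strategy 1 below
            MongoCollection<Document> blobs =
                    client.getDatabase("oak").getCollection("blobs");

            byte[] chunk = new byte[CHUNK_SIZE];
            new Random().nextBytes(chunk);

            ExecutorService pool = Executors.newFixedThreadPool(WRITER_THREADS);
            long start = System.nanoTime();

            for (int t = 0; t < WRITER_THREADS; t++) {
                final int writerId = t;
                pool.submit(() -> {
                    for (int i = 0; i < CHUNKS_PER_WRITER; i++) {
                        // one document per chunk, mirroring the blobs collection layout
                        blobs.insertOne(new Document("_id", writerId + ":" + i)
                                .append("data", new Binary(chunk)));
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.MINUTES);

            long millis = (System.nanoTime() - start) / 1_000_000;
            long mb = (long) WRITER_THREADS * CHUNKS_PER_WRITER * CHUNK_SIZE / (1024 * 1024);
            System.out.printf("wrote %d MB in %d ms (%.1f MB/s)%n",
                    mb, millis, mb / (millis / 1000.0));

            client.close();
        }
    }

Concurrent reads could be measured the same way by fetching the chunks
back with find(), and the different strategies (chunk size, write
concern, separate database) could be compared by varying the constants
and the connection string above.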

Regards,
Thomas



On 10/30/13 7:50 AM, "Chetan Mehrotra" <chetan.mehrotra@gmail.com> wrote:

>Hi,
>
>Currently we are storing blobs by breaking them into small chunks and
>then storing those chunks in MongoDB as part of the blobs collection. This
>approach would cause issues, as Mongo maintains a global exclusive write
>lock at the database level [1]. So even writing multiple small chunks of,
>say, 2 MB each would lead to write lock contention.
>
>Mongo also provides GridFS [2]. However, it uses a strategy similar to the
>one we are currently using, and the support is built into the driver; to
>the server the chunks are just collection entries.
>
>So, to minimize write lock contention for use cases where big assets are
>stored in Oak, we could opt for the following strategies:
>
>1. Store the blobs collection in a different database. As Mongo write
>locks [1] are taken at the database level, storing the blobs in a
>different database would allow reads/writes of node data (the majority
>use case) to continue.
>
>2. For more asset/binary heavy use cases, use a separate database server
>altogether to serve the binaries.
>
>3. Bring back the JR2 DataStore implementation and only save metadata
>related to binaries in Mongo. We already have an S3 based implementation
>there, and it would continue to work with Oak as well.
>
>Chetan Mehrotra
>[1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
>[2] http://docs.mongodb.org/manual/core/gridfs/

