jackrabbit-oak-dev mailing list archives

From Chetan Mehrotra <chetan.mehro...@gmail.com>
Subject Re: Strategies around storing blobs in Mongo
Date Wed, 30 Oct 2013 09:30:50 GMT
> Open questions are, what is the write throughput for one
> shard, does the write lock also block reads (I guess not), does the write

As Ian mentioned, write locks block all reads within their scope. So even
adding a 2 MB chunk on a sharded system over a remote connection would block
reads for that complete duration. At a minimum we should avoid that.
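
To put a rough number on "that complete duration", here is a back-of-the-envelope sketch. It assumes, as the paragraph above does, that the write lock is held for the whole transfer of the chunk. The function name and the link speeds are hypothetical, chosen only for illustration; they are not measurements from Oak or MongoDB.

```python
def lock_hold_seconds(chunk_bytes, link_bits_per_second, round_trip_seconds=0.0):
    """Estimate how long one chunk write would hold the lock.

    Assumption (not verified here): the lock is held for the full
    transfer time of the chunk plus one network round trip.
    """
    return chunk_bytes * 8 / link_bits_per_second + round_trip_seconds


if __name__ == "__main__":
    chunk = 2 * 1024 * 1024  # the 2 MB chunk size discussed in this thread
    # Hypothetical link speeds, for illustration only.
    for label, bps in (("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)):
        millis = lock_hold_seconds(chunk, bps) * 1000
        print(f"{label}: about {millis:.1f} ms per chunk")
```

Under these assumptions the hold time scales linearly with chunk size, so smaller chunks shorten each individual stall but do not reduce the total time spent under the lock.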
Chetan Mehrotra

On Wed, Oct 30, 2013 at 2:40 PM, Ian Boston <ieb@tfd.co.uk> wrote:
> On 30 October 2013 07:55, Thomas Mueller <mueller@adobe.com> wrote:
>> Hi,
>>> as Mongo maintains a global exclusive write lock on a per-database level
>> I think this is not necessarily a huge problem. As far as I understand, it
>> limits write concurrency within one shard only, so it does not block
>> scalability. Open questions are, what is the write throughput for one
>> shard, does the write lock also block reads (I guess not), does the write
>> lock cause high latency for other writes because binaries are big.
> This information would be extremely useful for all those looking to
> Oak to address use cases where the repository access is between 20 and
> 60% write.
> To answer one of your questions:
> According to [1], write locks do block reads within the scope of the lock.
> Other information from [1]:
> Write locks are exclusive and global.
> Write locks block read locks being established.
> (and obviously read locks block write locks being established)
> Read locks are concurrent and shared.
> Pre 2.2 a write lock was scoped to the mongod process.
> Post 2.2 a write lock is scoped to the database within the mongod process.
> All locks are scoped to a shard.
> IIUC, the lock behaviour is identical to that in JR2 except for the scope.
> Ian
> [1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
>> I think it would make sense to have a simple benchmark (concurrent writing
>> / reading of binaries), so that we can test which strategy is best, and
>> possibly play around with different strategies (split binaries into
>> smaller / larger chunks, use different write concerns, use more
>> shards,...).
>> Regards,
>> Thomas
>> On 10/30/13 7:50 AM, "Chetan Mehrotra" <chetan.mehrotra@gmail.com> wrote:
>>>Currently we are storing blobs by breaking them into small chunks and
>>>then storing those chunks in MongoDB as part of the blobs collection. This
>>>approach would cause issues, as Mongo maintains a global exclusive
>>>write lock on a per-database level [1]. So even writing multiple
>>>small chunks of, say, 2 MB each would lead to write lock contention.
>>>Mongo also provides GridFS [2]. However, it uses a strategy similar to
>>>the one we are currently using, with the support built into the
>>>driver. For the server these are just collection entries.
>>>So to minimize write lock contention for use cases where big
>>>assets are being stored in Oak, we can opt for the following strategies:
>>>1. Store the blobs collection in a different database. As Mongo write
>>>locks [1] are taken per db level then storing the blobs in different
>>>db would allow the read/write of node data (majority usecase) to
>>>2. For more asset/binary-heavy use cases, use a separate database server
>>>to serve the binaries.
>>>3. Bring back the JR2 DataStore implementation and just save metadata
>>>related to binaries in Mongo. We already have an S3-based implementation
>>>there, and it would continue to work with Oak.
>>>Chetan Mehrotra
>>>[2] http://docs.mongodb.org/manual/core/gridfs/
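
The reasoning behind strategy 1 can be illustrated with a toy model of per-database locking. This is only a sketch under stated assumptions: it models each database's lock as a plain exclusive lock (ignoring shared read locks), it does not talk to MongoDB at all, and the database names `oak` and `oak_blobs` are hypothetical.

```python
import threading


class LockModel:
    """Toy model: one exclusive lock per database name (an assumption,
    not MongoDB's actual implementation)."""

    def __init__(self):
        self._locks = {}

    def lock_for(self, database):
        # Create the lock for this database on first use.
        return self._locks.setdefault(database, threading.Lock())


def can_access_nodes_during_blob_write(nodes_db, blobs_db):
    """Return True if node data can be locked while a blob write is in progress."""
    model = LockModel()
    with model.lock_for(blobs_db):  # a blob chunk write holds its database lock
        nodes_lock = model.lock_for(nodes_db)
        acquired = nodes_lock.acquire(blocking=False)  # try without waiting
        if acquired:
            nodes_lock.release()
        return acquired


if __name__ == "__main__":
    # Hypothetical database names, for illustration only.
    print("same database:", can_access_nodes_during_blob_write("oak", "oak"))
    print("separate database:", can_access_nodes_during_blob_write("oak", "oak_blobs"))
```

In this model, node access is refused while a blob write holds the lock of the same database, and succeeds once blobs live in their own database. Whether the gain is significant in practice would still need the benchmark Thomas suggests.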
