cloudstack-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hiroki Ohashi" <hiroki.s...@gmail.com>
Subject Re: Review Request 26263: CLOUDSTACK-7539: [S3] Parallel deployment makes reference count of a cache in nfs secondary staging store negative(-1)
Date Fri, 10 Oct 2014 07:12:51 GMT


> On Oct. 8, 2014, 7:43 p.m., Rohit Yadav wrote:
> > engine/storage/cache/src/org/apache/cloudstack/storage/cache/manager/StorageCacheManagerImpl.java,
line 228
> > <https://reviews.apache.org/r/26263/diff/1/?file=710568#file710568line228>
> >
> >     Code looks alright, do you need to handle case when one or both of data and
store is null? We should be able to apply it on master after Edison's and Min's review.

Maybe, I think there is no problem without null checks in current call
stack because those arguments is used at methods called previously.
However, it looks good to insert null checks to make sure that valid
objects are used in createCacheObject().


- Hiroki


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26263/#review55782
-----------------------------------------------------------


On Oct. 3, 2014, 5:54 p.m., Hiroki Ohashi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26263/
> -----------------------------------------------------------
> 
> (Updated Oct. 3, 2014, 5:54 p.m.)
> 
> 
> Review request for cloudstack, edison su and Min Chen.
> 
> 
> Bugs: CLOUDSTACK-7539
>     https://issues.apache.org/jira/browse/CLOUDSTACK-7539
> 
> 
> Repository: cloudstack-git
> 
> 
> Description
> -------
> 
> Root causes of CLOUDSTACK-7539 are below.
> 
> a. createCacheObject() method lacks mutual exclusion. After calling
>    this method, threads check cache existence but the procedure is not
>    atomic. So, other threads may call this method and decide to create
>    a cache in spite of cache copy being in progress.
> 
> b. createCacheObject() method ignores states except READY when it
>    checks cache existence. Then, it is considered cache doesn't exist
>    in spite of cache copy being in progress.
> 
> c. When multiple cache entries about a same image are in a database,
>    threads may get a wrong cache entry because of incorrect search
>    condition that randomly chooses one entry from entries that have
>    same template id. As a result of this retrieval, reference counter
>    of the cache probably go negative. (I'm not sure about this
>    behavior but I suspect CloudStack behaves like this.)
> 
> These behaviors make cache state incorrect and inconsistent.
> 
> For fixing this bug, I modified cache creation behavior when no cache
> is on NFS Secondary Staging Store(SSS). As sequential deployment
> functions so far, I think parallel deployment works well when it is
> managed like sequential deployment.
> 
> Fixed points are below. These are for cause a. and b., not including
> fix for cause c.
> 
> 1. Protect critical section between cache search and allocation by lock.
> 
>    When management server threads reach createCacheObject() method,
>    threads get lock at first and check cache existence. In case that no
>    cache is on NFS SSS, one thread sends a request for creating cache.
>    In other case, threads wait for completion of creating cache not to
>    duplicate same image.
> 
> 2. Add several condition check for cache state.
> 
>    If a thread is creating a cache of an image, other threads mustn't
>    send no additional request for the same image. Therefore, threads have
>    to consider whether a copy request should be issued by those state
>    check.
> 
> 
> Diffs
> -----
> 
>   engine/storage/cache/src/org/apache/cloudstack/storage/cache/manager/StorageCacheManagerImpl.java
7c74729 
> 
> Diff: https://reviews.apache.org/r/26263/diff/
> 
> 
> Testing
> -------
> 
> I tested deployment of two instances at almost the same time. Results
> are below.
> 
> - Succeed in deployment of two instances.
> - No database cache entries are duplicated.
> - No duplicated images are on NFS SSS.
> - Reference counter never goes to -1.
> 
> 
> Thanks,
> 
> Hiroki Ohashi
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message