cloudstack-dev mailing list archives

From Min Chen <min.c...@citrix.com>
Subject Re: race conditions in VolumeServiceImpl.createBaseImageAsync() creates NPE
Date Fri, 01 Nov 2013 00:29:34 GMT
Darren, I just checked the code and you are right. If one thread throws an exception while
downloading the template to primary storage, it deletes the entry in template_store_ref, which
causes the second thread to fail with an NPE. We need to fix this in 4.3. Please file a bug for this.

Thanks
-min

From: Darren Shepherd <darren.s.shepherd@gmail.com>
Date: Thursday, October 31, 2013 11:39 AM
To: "dev@cloudstack.apache.org" <dev@cloudstack.apache.org>, Min Chen <min.chen@citrix.com>
Subject: race conditions in VolumeServiceImpl.createBaseImageAsync() creates NPE

The following code results in an NPE when two threads race to stage the same template:

        templatePoolRef = _tmpltPoolDao.acquireInLockTable(templatePoolRefId, storagePoolMaxWaitSeconds);

        if (templatePoolRef == null) {
            if (s_logger.isDebugEnabled()) {
                s_logger.info("Unable to acquire lock on VMTemplateStoragePool " + templatePoolRefId);
            }
            templatePoolRef = _tmpltPoolDao.findByPoolTemplate(dataStore.getId(), template.getId());
            if (templatePoolRef.getState() == ObjectInDataStoreStateMachine.State.Ready) {
                s_logger.info("Unable to acquire lock on VMTemplateStoragePool " + templatePoolRefId
                        + ", But Template " + template.getUniqueName()
                        + " is already copied to primary storage, skip copying");
                createVolumeFromBaseImageAsync(volume, templateOnPrimaryStoreObj, dataStore, future);
                return;
            }
            throw new CloudRuntimeException("Unable to acquire lock on VMTemplateStoragePool: " + templatePoolRefId);
        }

If two threads are trying to stage the same template, thread one gets the lock and thread two
waits.  If thread one fails to stage the template, it deletes the templatePoolRef from
the database.  Thread two then gets the lock in op_lock, but the internal findById does
not find a templatePoolRef because it has been deleted, so acquireInLockTable() returns null.
Technically thread two holds the lock, but the templatePoolRef wasn't found.  The subsequent
line "templatePoolRef = _tmpltPoolDao.findByPoolTemplate(...)" also returns null because the row
no longer exists, and on the next line templatePoolRef.getState() throws an NPE.
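
To make the sequence concrete, here is a minimal, self-contained sketch of one possible null
guard. It is not the actual CloudStack code or the eventual fix: FakeDao, TemplatePoolRef,
Outcome, stage() and the lockAvailable flag are hypothetical stand-ins invented for this
illustration, and the single-argument lookups simplify the real DAO signatures. What the caller
should do when the ref has been deleted (re-create it and retry, or fail with a clear error) is
left open.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: every type here is a hypothetical stand-in, not a CloudStack class.
final class TemplateRefNullGuard {
    enum State { Allocated, Ready }
    enum Outcome { COPY, SKIP_COPY, REF_DELETED }

    // Stand-in for the template/pool reference row.
    static final class TemplatePoolRef {
        final long id;
        final State state;
        TemplatePoolRef(long id, State state) { this.id = id; this.state = state; }
    }

    // In-memory stand-in for the DAO. acquireInLockTable returns null both when the
    // lock cannot be taken and when the row no longer exists, mirroring the
    // ambiguity described in the message above.
    static final class FakeDao {
        private final Map<Long, TemplatePoolRef> rows = new HashMap<>();
        boolean lockAvailable = true;

        void insert(TemplatePoolRef ref) { rows.put(ref.id, ref); }
        void remove(long id) { rows.remove(id); }

        TemplatePoolRef acquireInLockTable(long id) {
            return lockAvailable ? rows.get(id) : null;
        }

        TemplatePoolRef findById(long id) { return rows.get(id); }
    }

    // Same branch structure as the quoted snippet, plus a null check before
    // the state is read.
    static Outcome stage(FakeDao dao, long refId) {
        TemplatePoolRef ref = dao.acquireInLockTable(refId);
        if (ref != null) {
            return Outcome.COPY; // lock held and row present: go ahead and copy
        }
        ref = dao.findById(refId);
        if (ref == null) {
            // Row was deleted by the thread that failed: report it instead of
            // dereferencing null.
            return Outcome.REF_DELETED;
        }
        if (ref.state == State.Ready) {
            return Outcome.SKIP_COPY; // another thread already staged the template
        }
        throw new IllegalStateException("Unable to acquire lock on VMTemplateStoragePool: " + refId);
    }

    public static void main(String[] args) {
        FakeDao dao = new FakeDao();
        dao.insert(new TemplatePoolRef(1L, State.Allocated));
        System.out.println(stage(dao, 1L)); // row present, lock available
        dao.remove(1L);                     // simulate thread one deleting the ref
        System.out.println(stage(dao, 1L)); // the case that used to hit the NPE
    }
}
```

The only behavioural difference from the quoted snippet is the extra null check after the second
lookup, so the deleted-row case becomes an explicit outcome rather than an NPE.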

Darren
