hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid workload
Date Wed, 12 Apr 2017 17:36:41 GMT

     [ https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-16321:
----------------------------------
    Description: 
TxnStore.MutexAPI is a mechanism that lets different Metastore instances coordinate their operations.
 It uses a JDBC Connection to achieve this.

In some cases this may lead to deadlock.  TxnHandler uses a connection pool of fixed size.
 Suppose you have X simultaneous calls to TxnHandler.lock(), where X >= the size of the pool.
 These take all connections from the pool, so when
{noformat}
handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
{noformat} 
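is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_, the pool is empty
and the system is deadlocked: every caller holds a connection while waiting for a second one
that none of the other callers can release.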
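
The exhaustion scenario described above can be modelled in a few lines.  This is only a sketch under stated assumptions: a java.util.concurrent.Semaphore stands in for the fixed-size connection pool, the callers are stepped through sequentially so the outcome is deterministic, and a timed tryAcquire replaces the blocking wait that would hang forever in the real case.  The class and method names are hypothetical and are not part of TxnHandler.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolExhaustionSketch {

    /**
     * Models `callers` simultaneous lock() calls against a pool of `poolSize`
     * connections. Returns how many callers could obtain the extra connection
     * that the mutex needs while still holding their own.
     */
    public static int callersThatGetMutexConnection(int poolSize, int callers, long timeoutMs) {
        Semaphore pool = new Semaphore(poolSize); // stand-in for the connection pool

        // Phase 1: every caller takes one connection for its own operation.
        int holding = 0;
        for (int i = 0; i < callers; i++) {
            if (pool.tryAcquire()) {
                holding++;
            }
        }

        // Phase 2: each caller that holds a connection asks for one more
        // (the mutex connection) without giving its own back.
        int served = 0;
        try {
            for (int i = 0; i < holding; i++) {
                if (pool.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                    served++;
                    pool.release(); // mutex connection returned after use
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return served;
    }

    public static void main(String[] args) {
        // X == pool size: nobody can get a mutex connection.
        System.out.println(callersThatGetMutexConnection(4, 4, 20)); // prints 0
        // X < pool size: one connection is left over, so callers can proceed.
        System.out.println(callersThatGetMutexConnection(4, 3, 20)); // prints 3
    }
}
```

In the real system the second acquisition blocks rather than timing out, which is why the result is a deadlock and not a failed call.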

MutexAPI can't use the same connection as the operation it's protecting
(TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).

We could make MutexAPI use a separate connection pool (larger than the 'primary' connection pool).
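
A rough sketch of the separate-pool idea, again with Semaphores standing in for connection pools.  Everything here is an assumption made for illustration (class name, method name, pool sizes); it only shows that callers which have drained the primary pool can still obtain a mutex connection when that connection comes from a different pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SeparateMutexPoolSketch {

    /** Returns how many callers finished; the model assumes 1 <= callers <= primarySize. */
    public static int completedCallers(int primarySize, int mutexPoolSize, int callers) {
        if (callers < 1 || callers > primarySize || mutexPoolSize < 1) {
            throw new IllegalArgumentException("need 1 <= callers <= primarySize and mutexPoolSize >= 1");
        }
        Semaphore primaryPool = new Semaphore(primarySize);  // pool used by the operation
        Semaphore mutexPool = new Semaphore(mutexPoolSize);  // dedicated pool for the mutex
        CountDownLatch allHoldPrimary = new CountDownLatch(callers);
        AtomicInteger completed = new AtomicInteger();

        ExecutorService exec = Executors.newFixedThreadPool(callers);
        for (int i = 0; i < callers; i++) {
            exec.execute(() -> {
                try {
                    primaryPool.acquire();               // connection for the operation
                    try {
                        allHoldPrimary.countDown();
                        allHoldPrimary.await();          // wait until every caller holds one
                        mutexPool.acquire();             // mutex connection from the other pool
                        try {
                            completed.incrementAndGet(); // protected section
                        } finally {
                            mutexPool.release();
                        }
                    } finally {
                        primaryPool.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        exec.shutdown();
        try {
            if (!exec.awaitTermination(5, TimeUnit.SECONDS)) {
                exec.shutdownNow();
            }
        } catch (InterruptedException e) {
            exec.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Primary pool fully drained by 4 callers; mutex pool sized one larger.
        System.out.println(completedCallers(4, 5, 4)); // prints 4
    }
}
```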

Or we could make TxnHandler.lock(LockRequest rqst) return immediately after enqueueing the
lock, with the expectation that the caller will always follow up with a call to
checkLock(CheckLockRequest rqst).
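
The enqueue-then-check flow might look roughly like the following.  This is a simplified in-memory model, not the Metastore code: the class, the State enum and the single FIFO queue are all hypothetical, and the real lock() and checkLock() take request objects rather than bare ids.  The point is only that lock() does no lock evaluation, so it would not need the mutex, and the caller learns the outcome from a later checkLock() call.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class EnqueueThenCheckSketch {

    public enum State { WAITING, ACQUIRED }

    // Hypothetical single-resource lock queue; the head of the queue owns the lock.
    private final Deque<Long> queue = new ArrayDeque<>();
    private long nextId = 1;

    /** Enqueues a lock request and returns its id at once, without evaluating it. */
    public synchronized long lock() {
        long id = nextId++;
        queue.addLast(id);
        return id;
    }

    /** Evaluates a previously enqueued request; the caller is expected to call this. */
    public synchronized State checkLock(long id) {
        Long head = queue.peekFirst();
        return (head != null && head == id) ? State.ACQUIRED : State.WAITING;
    }

    /** Releases the lock (or withdraws a waiting request). */
    public synchronized void unlock(long id) {
        queue.remove(Long.valueOf(id));
    }

    public static void main(String[] args) {
        EnqueueThenCheckSketch store = new EnqueueThenCheckSketch();
        long first = store.lock();
        long second = store.lock();
        System.out.println(store.checkLock(first));  // prints ACQUIRED
        System.out.println(store.checkLock(second)); // prints WAITING
        store.unlock(first);
        System.out.println(store.checkLock(second)); // prints ACQUIRED
    }
}
```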

cc [~f1sherox]



  was:
TxnStore.MutexAPI is a mechanism that lets different Metastore instances coordinate their operations.
 It uses a JDBC Connection to achieve this.

In some cases this may lead to deadlock.  TxnHandler uses a connection pool of fixed size.
 Suppose you have X simultaneous calls to TxnHandlerlock(), where X >= the size of the pool.
 These take all connections from the pool, so when
{noformat}
handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
{noformat} 
is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_, the pool is empty
and the system is deadlocked.

MutexAPI can't use the same connection as the operation it's protecting
(TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).

We could make MutexAPI use a separate connection pool (larger than the 'primary' connection pool).

Or we could make TxnHandler.lock(LockRequest rqst) return immediately after enqueueing the
lock, with the expectation that the caller will always follow up with a call to
checkLock(CheckLockRequest rqst).

cc [~f1sherox]




> Possible deadlock in metastore with Acid workload
> -------------------------------------------------
>
>                 Key: HIVE-16321
>                 URL: https://issues.apache.org/jira/browse/HIVE-16321
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0
>            Reporter: Eugene Koifman



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
