hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid enabled
Date Thu, 20 Apr 2017 17:11:04 GMT

     [ https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-16321:
----------------------------------
    Attachment: HIVE-16321.05.patch

> Possible deadlock in metastore with Acid enabled
> ------------------------------------------------
>
>                 Key: HIVE-16321
>                 URL: https://issues.apache.org/jira/browse/HIVE-16321
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>         Attachments: HIVE-16321.01.patch, HIVE-16321.02.patch, HIVE-16321.03.patch, HIVE-16321.04.patch,
HIVE-16321.05.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can coordinate their
operations.  It uses a JDBC Connection to achieve this.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool of fixed
size.  Suppose you have X simultaneous calls to TxnHandler.lock(), where X is >= the size
of the pool.  These calls take all connections from the pool, so when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_, the pool is
empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  (TxnHandler.checkLock(Connection
dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (with size > that of the 'primary' connection pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after enqueueing
the lock with the expectation that the caller will always follow up with a call to checkLock(CheckLockRequest
rqst).
> cc [~f1sherox]
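
The pool-exhaustion scenario described above can be sketched in isolation. This is a minimal illustration only, not Hive code: the class PoolExhaustionDemo and its methods are hypothetical, each pool is modeled as a java.util.concurrent.Semaphore rather than a real JDBC pool, and a 200 ms timeout stands in for what would be an indefinite hang. It assumes the number of callers equals the primary pool size and forces the worst-case interleaving with a latch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; a Semaphore permit stands in for a pooled connection.
public class PoolExhaustionDemo {

    // Runs `callers` threads (assumed equal to the primary pool size) and
    // returns how many of them obtained a "connection" for the mutex.
    static int run(Semaphore primaryPool, Semaphore mutexPool, int callers) {
        CountDownLatch allHoldPrimary = new CountDownLatch(callers);
        CountDownLatch allAttempted = new CountDownLatch(callers);
        AtomicInteger completed = new AtomicInteger();
        ExecutorService exec = Executors.newFixedThreadPool(callers);
        for (int i = 0; i < callers; i++) {
            exec.submit(() -> {
                primaryPool.acquire();                 // analogous to lock() taking dbConn
                try {
                    allHoldPrimary.countDown();
                    allHoldPrimary.await();            // every caller now holds a connection
                    // analogous to acquireLock() needing a connection of its own
                    if (mutexPool.tryAcquire(200, TimeUnit.MILLISECONDS)) {
                        completed.incrementAndGet();
                        mutexPool.release();
                    }
                    allAttempted.countDown();
                    allAttempted.await();              // keep dbConn until all have tried
                } finally {
                    primaryPool.release();
                }
                return null;
            });
        }
        exec.shutdown();
        try {
            if (!exec.awaitTermination(10, TimeUnit.SECONDS)) {
                exec.shutdownNow();
                throw new IllegalStateException("demo threads did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    // Mutex draws from the same pool as the operation it protects.
    static int sharedPool(int size) {
        Semaphore pool = new Semaphore(size);
        return run(pool, pool, size);
    }

    // Mutex draws from its own pool, sized larger than the primary pool.
    static int separatePools(int size) {
        return run(new Semaphore(size), new Semaphore(size + 1), size);
    }

    public static void main(String[] args) {
        System.out.println("shared pool:    " + sharedPool(3) + " of 3 callers got the mutex");
        System.out.println("separate pools: " + separatePools(3) + " of 3 callers got the mutex");
    }
}
```

In the shared-pool case every caller times out, because each one holds the only kind of resource the others are waiting for. With a separate pool for the mutex, all callers proceed. The second proposal in the description (returning from lock() right after enqueueing) is not modeled here.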



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
