hive-user mailing list archives

From Igor Kuzmenko <f1she...@gmail.com>
Subject Re: Hive TxnHandler::lock method run into dead lock.
Date Tue, 28 Mar 2017 11:03:42 GMT
Explicit configuration is a workaround, but it doesn't solve the deadlock problem.
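
To illustrate why a bigger pool only moves the threshold, here is a minimal sketch. It is not Hive code. A java.util.concurrent.Semaphore stands in for the connection pool, and the class name PoolDeadlockSketch, the method run, and the separateMutexPool switch are all hypothetical. Each caller takes one "connection" (as enqueueLockWithRetry does), keeps it, and then asks for a second one (as acquireLock does). The separate pool for the second request is only one possible way to break the cycle; I am not claiming it is what HIVE-13842 does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolDeadlockSketch {

    /**
     * Starts poolSize concurrent callers against a pool of poolSize permits.
     * Returns how many callers obtained their second "connection".
     */
    static int run(int poolSize, boolean separateMutexPool) {
        Semaphore mainPool = new Semaphore(poolSize);
        // Assumption: a dedicated one-permit pool for the mutex connection.
        Semaphore mutexPool = separateMutexPool ? new Semaphore(1) : mainPool;
        CountDownLatch allHoldFirst = new CountDownLatch(poolSize);
        CountDownLatch allAttempted = new CountDownLatch(poolSize);
        AtomicInteger completed = new AtomicInteger();
        ExecutorService callers = Executors.newFixedThreadPool(poolSize);

        for (int i = 0; i < poolSize; i++) {
            callers.submit(() -> {
                mainPool.acquire();                 // first connection (lock enqueue)
                try {
                    allHoldFirst.countDown();
                    allHoldFirst.await();           // worst case: the pool is now drained
                    // Second connection (mutex), requested while holding the first.
                    // The timeout plays the role of the pool's "timed out waiting" error.
                    if (mutexPool.tryAcquire(1, TimeUnit.SECONDS)) {
                        try {
                            completed.incrementAndGet();
                        } finally {
                            mutexPool.release();
                        }
                    }
                    allAttempted.countDown();
                    allAttempted.await();           // keep the first connection until all have tried
                } finally {
                    mainPool.release();
                }
                return null;
            });
        }

        callers.shutdown();
        try {
            callers.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("shared pool, callers completed: " + run(10, false));
        System.out.println("separate mutex pool, callers completed: " + run(10, true));
    }
}
```

With a shared pool, none of the callers gets its second connection, whether the size is 10 or 50, as long as the number of concurrent callers matches the pool size. With the separate one-permit pool in this sketch, every caller gets through, because the second request no longer competes with the connections the callers are already holding.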

On Mon, Mar 27, 2017 at 8:28 PM, Eugene Koifman <ekoifman@hortonworks.com>
wrote:

> There is an open ticket
>
> https://issues.apache.org/jira/browse/HIVE-13842
>
>
>
> Eugene
>
>
>
> *From: *Igor Kuzmenko <f1sherox@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Monday, March 27, 2017 at 8:39 AM
>
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> I increased maxConnections to 50 and recompiled the metastore jar. It's
> going to be enough for a while.
>
> Eugene, do you know if there is a Jira item for this problem?
>
>
>
> On Sun, Mar 26, 2017 at 7:21 PM, Eugene Koifman <ekoifman@hortonworks.com>
> wrote:
>
> I see the following (in HikariCP-2.5.1.jar), so perhaps upgrading the
> library is an option.
>
>
>
> public HikariConfig() {
>   this.dataSourceProperties = new Properties();
>   this.healthCheckProperties = new Properties();
>   this.minIdle = -1;
>   this.maxPoolSize = -1;
>   this.maxLifetime = MAX_LIFETIME;
>   this.connectionTimeout = CONNECTION_TIMEOUT;
>   this.validationTimeout = VALIDATION_TIMEOUT;
>   this.idleTimeout = IDLE_TIMEOUT;
>   this.isAutoCommit = true;
>   this.isInitializationFailFast = true;
>   String systemProp = System.getProperty("hikaricp.configurationFile");
>   if (systemProp != null) {
>     this.loadProperties(systemProp);
>   }
> }
>
>
>
>
>
> *From: *Igor Kuzmenko <f1sherox@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Saturday, March 25, 2017 at 5:05 PM
>
>
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> Hi, Eugene.
>
> I've tried hikaricp; it didn't work either. The Hikari pool has the same
> limit of 10 connections. The pool is created without explicitly setting the
> max pool size, and the HikariConfig constructor hardcodes it to 10:
>
> public HikariConfig()
> {
>    dataSourceProperties = new Properties();
>
>    connectionTimeout = CONNECTION_TIMEOUT;
>    idleTimeout = IDLE_TIMEOUT;
>    isAutoCommit = true;
>    isJdbc4connectionTest = true;
>    minIdle = -1;
>    maxPoolSize = 10;
>    maxLifetime = MAX_LIFETIME;
>    isRecordMetrics = false;
>    transactionIsolation = -1;
>    metricsTrackerClassName = "com.zaxxer.hikari.metrics.CodaHaleMetricsTracker";
>    customizer = new IConnectionCustomizer() {
>       @Override
>       public void customize(Connection connection) throws SQLException
>       {
>       }
>    };
> }
>
>
>
> On Thu, Mar 23, 2017 at 8:29 PM, Eugene Koifman <ekoifman@hortonworks.com>
> wrote:
>
> Can you try using the “hikaricp” connection pool manager?  Its default
> seems to be no limit.
>
>
>
>
>
> Eugene
>
>
>
> *From: *Igor Kuzmenko <f1sherox@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Monday, March 20, 2017 at 2:17 PM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> Sorry, misclicked.
>
>
>
> 2) The TxnHandler::lock method requests a new connection when executing
> this line of code:
>
> ConnectionLockIdPair connAndLockId = enqueueLockWithRetry(rqst);
>
>
>
> 3) After that, following this call chain:
>
> - TxnHandler::lock
> - TxnHandler::checkLockWithRetry
> - TxnHandler::checkLock
>
> in the checkLock method we reach this line:
>
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
>
>
>
> 4) The acquireLock method requests another connection to the DB:
>
> dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
>
>
>
> So, all in all, if I call the TxnHandler::lock method in 10 threads at the
> same time, the threads first take all the DB connections stored in the
> pool, and then they get stuck at acquireLock because there is no free
> connection left.
>
>
>
> Has anyone run into this problem? How can I avoid it?
>
>
>
> The code was taken from here:
>
> https://github.com/hortonworks/hive-release/blob/HDP-2.5.0.0-tag/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>
>
>
> I guess the closest branch in the Apache repo is:
>
> https://github.com/apache/hive/blob/branch-2.1/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>
>
>
> On Tue, Mar 21, 2017 at 12:07 AM, Igor Kuzmenko <f1sherox@gmail.com>
> wrote:
>
> Hello, I'm running Hortonworks Data Platform 2.5.0.0 with the included Hive.
>
> I'm using the Storm Hive bolt to load data into Hive. But launching many
> Hive bolts always leads to a TimeoutException when calling the Hive
> metastore. The metastore logs are full of exceptions like this:
>
>
>
> 2017-03-15 18:46:12,436 ERROR [pool-5-thread-11]: txn.TxnHandler (TxnHandler.java:getDbConn(1834)) - There is a problem with a connection from the pool, retrying(rc=7): Timed out waiting for a free available connection. (SQLState=08001, ErrorCode=0)
> java.sql.SQLException: Timed out waiting for a free available connection.
> at com.jolbox.bonecp.DefaultConnectionStrategy.getConnectionInternal(DefaultConnectionStrategy.java:88)
> at com.jolbox.bonecp.AbstractConnectionStrategy.getConnection(AbstractConnectionStrategy.java:90)
> at com.jolbox.bonecp.BoneCP.getConnection(BoneCP.java:553)
> at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:131)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:1827)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:873)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:814)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5751)
>
>
>
> After looking through the code I found out:
>
>
>
> 1) The TxnHandler class uses a connection pool to get DB connections, and
> its size is 10.
>
> 2) The TxnHandler::lock method requests a new connection when executing
> this line of code:
>
>
>
>
>
>
>
>
>
