Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 3D592200B6F for ; Wed, 24 Aug 2016 12:32:40 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 38B35160AB1; Wed, 24 Aug 2016 10:32:35 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 3C218160AA4 for ; Wed, 24 Aug 2016 12:32:34 +0200 (CEST) Received: (qmail 93113 invoked by uid 500); 24 Aug 2016 10:32:28 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 93102 invoked by uid 99); 24 Aug 2016 10:32:27 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 24 Aug 2016 10:32:27 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 85194C519E for ; Wed, 24 Aug 2016 10:32:27 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.179 X-Spam-Level: * X-Spam-Status: No, score=1.179 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx2-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id WDZXAIa7zZ5T for ; Wed, 24 Aug 2016 10:32:25 +0000 (UTC) Received: from mail-it0-f49.google.com (mail-it0-f49.google.com [209.85.214.49]) by mx2-lw-eu.apache.org (ASF Mail Server at mx2-lw-eu.apache.org) with ESMTPS id A682E60E19 for ; Wed, 24 Aug 2016 10:32:24 +0000 (UTC) Received: by mail-it0-f49.google.com with SMTP id n128so25827882ith.1 for ; Wed, 24 Aug 2016 03:32:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=qO4JBh1o6Wg+gCYHm8TW9C/zRrpprI2D+wYXTGuXI0w=; b=KrLMrZSQj40qrnxs7LENQbnudXnJKj8gCrTwQiMzCbktYmrUl9FOQbiiqv+JoLamlM eTmHwToNFE6cZzQR20ivrW6Dj5mK0PgqwFvp62ErBgjGR8jChh+vsD12muX3Yo9BJpXv 7x6FNOcuFkiJeflKgFL4xz5xU8xKEK1A5XmFQi6mspHLNUs3ETrkPn6VKFYx1xL+mLVR R/kD4jyLmdRNgBQPA2Y5NipYJ2XY5+xu/RQrAs0ibWrlrt9H9Gy3XOcnI2CXMzS76yK4 uvNYa9mdwOP3RLu+/HNnTtSk28NmkPhACbKHaWhV4O7lQWTBAepcPzjEVIs1zOTtF+Yz vh+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=qO4JBh1o6Wg+gCYHm8TW9C/zRrpprI2D+wYXTGuXI0w=; b=UN4LTuw/oyw8BQlZhCkYhIASBVK9Qq3BMfl52C4XVLeldEs9gtB6IPM9qME/+NJrxq NgOXiFfkWyd6evxEKHeVtwwYxgoiaoSrgKpxCHmgo91hpN1h53FFRJPsSYuyXCNiVqyJ 74XGiyfJVB66TCf6qATmVtXNNMp9DAK4oW6DNVcXgvnDHxkXiECo4FvuC2PQeFisge/h 8S4kE3F9ZgwRTWRHUfqGppKj20FLI1Ovm7OUoitMXVwtpL4+e4yvV9kOsMkZuGORdSIC mApBRKZ7jlMr+PDgQtr/FGt7qvlqXGLxL/hBVGHJlYRgEbVurDs/8Dhd437BdQS77Gh6 TLDQ== X-Gm-Message-State: AEkoousWbb7bxycSQji3zfaIEe/HeX4pqFTBV5/9NIjjiKmgV9tMVlG09/6KtlE5Otdz8xDNTtzi15Q6V3Ip0A== X-Received: by 10.107.62.68 with SMTP id l65mr3231280ioa.185.1472034743064; Wed, 24 Aug 2016 03:32:23 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.191.69 with HTTP; Wed, 24 Aug 2016 03:32:02 -0700 (PDT) In-Reply-To: References: From: Igor Kuzmenko Date: Wed, 24 Aug 2016 13:32:02 +0300 Message-ID: Subject: Re: Hive transaction doesn't release lock. To: user@hive.apache.org Content-Type: multipart/alternative; boundary=94eb2c06146af51ea4053acecb70 archived-at: Wed, 24 Aug 2016 10:32:40 -0000 --94eb2c06146af51ea4053acecb70 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Oh, my copy paste skills failed me. Here's output from HIVE_LOCKS: # HL_LOCK_EXT_ID, HL_LOCK_INT_ID, HL_TXNID, HL_DB, HL_TABLE, HL_PARTITION, HL_LOCK_STATE, HL_LOCK_TYPE, HL_LAST_HEARTBEAT, HL_ACQUIRED_AT, HL_USER, HL_HOST '8496355', '1', '78461824', 'default', 'data_http', 'dt=3D20160821', 'a', 'r', '0', '1471768489000', 'storm', 'sorm-data03.msk.mts.ru' TXNS, TXN_COMPONENTS: # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST, TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION 78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru, 78461824, default, data_http, dt=3D201608213 Good news, I've found a log when it all started: Here comes heartbeat: 2016-08-21 11:34:48,934 DEBUG [pool-3-thread-155]: txn.TxnHandler (TxnHandler.java:heartbeatTxn(1882)) - Going to execute update Concurrently we are trying to abort transaction 2016-08-21 11:34:48,907 DEBUG [DeadTxnReaper-0]: txn.TxnHandler (TxnHandler.java:abortTxns(1429)) - Going to execute update 2016-08-21 11:34:48,943 DEBUG [DeadTxnReaper-0]: txn.TxnHandler (TxnHandler.java:abortTxns(1446)) - Going to execute update 2016-08-21 11:34:48,957 DEBUG [pool-3-thread-11]: txn.TxnHandler (TxnHandler.java:abortTxns(1429)) - Going to execute update 2016-08-21 11:34:48,979 DEBUG [pool-3-thread-11]: txn.TxnHandler (TxnHandler.java:abortTxns(1446)) - Going to execute update At this point transaction aborted and thers no lock. But heartbeat thread still working and: 2016-08-21 11:34:49,025 DEBUG [pool-3-thread-155]: txn.TxnHandler (TxnHandler.java:lock(1546)) - Going to execute update 2016-08-21 11:34:49,027 DEBUG [pool-3-thread-155]: txn.TxnHandler (TxnHandler.java:lock(1576)) - Going to execute update 2016-08-21 11:34:49,029 DEBUG [pool-3-thread-155]: txn.TxnHandler (TxnHandler.java:checkLock(1696)) - lockid:8496355 intLockId:1 txnid:78461824 db:default table:data_http partition:dt=3D20160821 state:WAITING type:SHARED_READ So I guess, that it's race condition between heartbeat thread and TxnReaper thread. Last heartbeat information in HIVE_LOCKS table differs from TXNS table. Full log here . On Tue, Aug 23, 2016 at 8:20 PM, Eugene Koifman wrote: > your query is =E2=80=9Cselect * from HIVE_LOCKS=E2=80=9D but the output i= s not from > HIVE_LOCKS. > What entries do you have in HIVE_LOCKS for this txn_id? > > If all you see is an entry in TXN table in =E2=80=98a=E2=80=99 state =E2= =80=93 that is OK. that > just mean that this transaction was aborted. > > Eugene > > From: Igor Kuzmenko > Reply-To: "user@hive.apache.org" > Date: Monday, August 22, 2016 at 8:27 AM > To: "user@hive.apache.org" > Subject: Hive transaction doesn't release lock. > > Hello, I'm using Apache Hive 1.2.1 and Apache Storm to stream data in hiv= e > table. > After making some tests I tried to truncate my table, but sql execution > doesn't complete because of the lock on table: > > > select * from HIVE_LOCKS; > > > # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST, > TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION > '78461824', 'a', '1471762974000', '1471768488000', 'storm', 'sorm-data03. > msk.mts.ru', '78461824', 'default', 'data_http', 'dt=3D20160821' > > > > At the same time: > > select * from TXNS, TXN_COMPONENTS > where 1=3D1 > and TXNS.TXN_ID =3D TXN_COMPONENTS.TC_TXNID > and TXNS.TXN_ID =3D 78461824 > ; > > > # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST, > TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION > 78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru, > 78461824, default, data_http, dt=3D20160821 > > > Txn 78461824 is in aborted state, but still keep lock on table > 'data_http'. How can I avoid this? What should I do to get rid of that > lock? > > --94eb2c06146af51ea4053acecb70 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Oh, my copy paste skills failed me.
Here's output = from HIVE_LOCKS:

# HL_LOCK_EXT_ID, HL_LOCK_INT_ID, HL_TXNID, HL_DB, = HL_TABLE, HL_PARTITION, HL_LOCK_STATE, HL_LOCK_TYPE, HL_LAST_HEARTBEAT, HL_= ACQUIRED_AT, HL_USER, HL_HOST
'8496355', '1', '78461= 824', 'default', 'data_http', 'dt=3D20160821', = 'a', 'r', '0', '1471768489000', 'storm&= #39;, 'sorm-data03.msk.mts.ru= '

TXNS, TXN_COMPONENTS:
# TXN_ID, TXN_STATE, TXN_STARTED,= TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST, TC_TXNID, TC_DATABASE, TC_TABLE, T= C_PARTITION
78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru, 78461824, defau= lt, data_http, dt=3D201608213


Good news, I've found a log wh= en it all started:

Here comes heartbeat:
2016-08-21 11:34:48,934= DEBUG [pool-3-thread-155]: txn.TxnHandler (TxnHandler.java:heartbeatTxn(18= 82)) - Going to execute update <update TXNS set txn_last_heartbeat =3D 1= 471768488000 where txn_id =3D 78461824 and txn_state =3D 'o'>
Concurrently we are trying to abort transaction
2016-08-21 11:34:48= ,907 DEBUG [DeadTxnReaper-0]: txn.TxnHandler (TxnHandler.java:abortTxns(142= 9)) - Going to execute update <delete from HIVE_LOCKS where hl_txnid in = (78461824,78461825,78461826,78461827,78461828,78461829)>
2016-08-21 1= 1:34:48,943 DEBUG [DeadTxnReaper-0]: txn.TxnHandler (TxnHandler.java:abortT= xns(1446)) - Going to execute update <update TXNS set txn_state =3D '= ;a' where txn_state =3D 'o' and txn_id in (78461824,78461825,78= 461826,78461827,78461828,78461829) and txn_last_heartbeat < 147176423300= 0>
2016-08-21 11:34:48,957 DEBUG [pool-3-thread-11]: txn.TxnHandler (= TxnHandler.java:abortTxns(1429)) - Going to execute update <delete from = HIVE_LOCKS where hl_txnid in (78461824)>
2016-08-21 11:34:48,979 DEBU= G [pool-3-thread-11]: txn.TxnHandler (TxnHandler.java:abortTxns(1446)) - Go= ing to execute update <update TXNS set txn_state =3D 'a' where t= xn_state =3D 'o' and txn_id in (78461824)>

At this point = transaction aborted and thers no lock.
But heartbeat thread still workin= g and:
2016-08-21 11:34:49,025 DEBUG [pool-3-thread-155]: txn.TxnHandler= (TxnHandler.java:lock(1546)) - Going to execute update <insert into TXN= _COMPONENTS (tc_txnid, tc_database, tc_table, tc_partition) values (7846182= 4, 'default', 'data_http', 'dt=3D20160821')>
= 2016-08-21 11:34:49,027 DEBUG [pool-3-thread-155]: txn.TxnHandler (TxnHandl= er.java:lock(1576)) - Going to execute update <insert into HIVE_LOCKS = =C2=A0(hl_lock_ext_id, hl_lock_int_id, hl_txnid, hl_db, hl_table, hl_partit= ion, hl_lock_state, hl_lock_type, hl_last_heartbeat, hl_user, hl_host) valu= es (8496355, 1,78461824, 'default', 'data_http', 'dt=3D= 20160821', 'w', 'r', 0, 'storm', 'sorm-data03.msk.mts.ru')>
= 2016-08-21 11:34:49,029 DEBUG [pool-3-thread-155]: txn.TxnHandler (TxnHandl= er.java:checkLock(1696)) - =C2=A0 lockid:8496355 intLockId:1 txnid:78461824= db:default table:data_http partition:dt=3D20160821 state:WAITING type:SHAR= ED_READ


So I guess, that it's race condition betw= een heartbeat thread and TxnReaper thread. Last heartbeat information in HI= VE_LOCKS table differs from TXNS table.
Full log here.=C2=A0=

On Tu= e, Aug 23, 2016 at 8:20 PM, Eugene Koifman <ekoifman@hortonworks.co= m> wrote:
your query is =E2=80=9Cselect * from HIVE_LOCKS=E2=80=9D but the outpu= t is not from HIVE_LOCKS.
What entries do you have in HIVE_LOCKS for this txn_id?

If all you see is an entry in TXN table in =E2=80=98a=E2=80=99 state = =E2=80=93 that is OK. =C2=A0that just mean that this transaction was aborte= d.

Eugene

From: Igor Kuzmenko <f1sherox@gmail.com>
Reply-To: "user@hive.apache.org" <<= a href=3D"mailto:user@hive.apache.org" target=3D"_blank">user@hive.apache.o= rg>
Date: Monday, August 22, 2016 at 8:= 27 AM
To: "user@hive.apache.org" <user@hive.apache.org= >
Subject: Hive transaction doesn'= ;t release lock.

Hello, I'm using Apache Hive 1.2.1 and Apache Storm to= stream data in hive table.
After making some tests I=C2=A0tried=C2=A0to truncate my table, but sql execution doesn't complete because of the l= ock on table:

select * from HIVE_LOCKS;

# TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST, TC_= TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
'78461824', 'a', '1471762974000', '1471768= 488000', 'storm', 'sorm-data03.msk.mts.ru', '78461824', 'default&#= 39;, 'data_http', 'dt=3D20160821'



At the same time:

select * from TXNS, TXN_COMPONENTS
where 1=3D1
and TXNS.TXN_ID = =3D TXN_COMPONENTS.TC_TXNID=
and TXNS.TXN_ID = =3D 78461824
;


# TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST, TC_= TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru, 78461824, default, data_http, dt=3D20160821


Txn 78461824 is in aborted state, but still = keep lock on table 'data_http'. How can I avoid this? What should I= do to get rid of that lock?=C2=A0


--94eb2c06146af51ea4053acecb70--