hive-dev mailing list archives

From "Eugene Koifman (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-16564) StreamingAPI is locking too much?
Date Mon, 01 May 2017 20:21:04 GMT
Eugene Koifman created HIVE-16564:
-------------------------------------

             Summary: StreamingAPI is locking too much?
                 Key: HIVE-16564
                 URL: https://issues.apache.org/jira/browse/HIVE-16564
             Project: Hive
          Issue Type: Bug
          Components: HCatalog, Transactions
    Affects Versions: 1.0.0
            Reporter: Eugene Koifman
            Assignee: Eugene Koifman


Currently _TransactionBatchImpl.beginNextTransactionImpl()_ acquires Shared locks for each
Transaction in the batch.
Especially under high load, this puts pressure on the LockManager (i.e. the Metastore) and
degrades the performance of Ingest itself.
All transactions in a batch write to the same physical file, and for Acid tables (which are
required for Streaming Ingest) shared locks only protect against Exclusive locks (like drop
table), so acquiring and releasing locks for each txn doesn't achieve much.

One possibility is to acquire all locks (i.e. for all txns) at the time the batch is created
(as is already done with openTxn() for all txns in the batch).  The lock for each txn in the
batch will be released automatically when commit is called for the respective txn.
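
To make the difference concrete, here is a toy model of the two strategies that only counts
simulated round trips to the lock manager. Nothing in it is Hive code: ToyLockManager,
lockOne() and lockAll() are hypothetical names, and the batched lockAll() call assumes the
metastore could accept locks for several txns in one request, which is an assumption rather
than something this issue confirms.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Hive classes.
class LockStrategies {

    static class ToyLockManager {
        int roundTrips = 0;                                  // simulated metastore calls
        final Map<Long, String> heldLocks = new HashMap<>(); // txnId -> lock type

        // One call per txn, modelling the current per-txn behaviour.
        void lockOne(long txnId) {
            roundTrips++;
            heldLocks.put(txnId, "SHARED");
        }

        // Hypothetical batched call: one round trip covers every txn in the batch.
        void lockAll(List<Long> txnIds) {
            roundTrips++;
            for (long id : txnIds) {
                heldLocks.put(id, "SHARED");
            }
        }

        // Commit releases that txn's lock; modelled as part of commit, so it is not
        // counted as an extra lock round trip.
        void commit(long txnId) {
            heldLocks.remove(txnId);
        }
    }

    // Current behaviour: lock when each txn begins, release at its commit.
    static ToyLockManager perTxn(int batchSize) {
        ToyLockManager lm = new ToyLockManager();
        for (long txn = 1; txn <= batchSize; txn++) {
            lm.lockOne(txn);
            lm.commit(txn);
        }
        return lm;
    }

    // Proposed behaviour: lock everything when the batch is created.
    static ToyLockManager upFront(int batchSize) {
        ToyLockManager lm = new ToyLockManager();
        List<Long> ids = new ArrayList<>();
        for (long txn = 1; txn <= batchSize; txn++) {
            ids.add(txn);
        }
        lm.lockAll(ids);
        for (long txn : ids) {
            lm.commit(txn);
        }
        return lm;
    }

    public static void main(String[] args) {
        System.out.println("per-txn lock round trips:  " + perTxn(100).roundTrips);
        System.out.println("up-front lock round trips: " + upFront(100).roundTrips);
    }
}
```

In this model the lock traffic drops from one request per txn to one request per batch, while
each txn's lock is still released at its own commit.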

Alternatively, don't acquire any locks - this means someone may drop a table while it's being
written to, but using locks here doesn't buy much.  Say a Drop request is issued when a write
is in progress.  It will block until the write releases its lock and execute immediately after
that.  Thus none of the data of that write is visible for any meaningful length of time anyway.
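
The ordering described above can be sketched with a plain java.util.concurrent read/write
lock standing in for the table lock: the read lock plays the Shared lock held by a write and
the write lock plays the Exclusive lock needed by Drop. This is only an analogy, not how the
Hive lock manager is implemented, and the class and event names are made up for the example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy only: a JDK read/write lock stands in for Hive's Shared/Exclusive table locks.
class DropAfterWriteSketch {

    static List<String> run() {
        ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch writeStarted = new CountDownLatch(1);
        CountDownLatch finishWrite = new CountDownLatch(1);

        // The streaming write holds the shared (read) lock while it runs.
        Thread writer = new Thread(() -> {
            tableLock.readLock().lock();
            try {
                events.add("write-start");
                writeStarted.countDown();
                finishWrite.await();          // the write is "in progress" here
                events.add("write-end");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                tableLock.readLock().unlock();
            }
        });

        // Drop needs the exclusive (write) lock, so it waits for the writer.
        Thread dropper = new Thread(() -> {
            tableLock.writeLock().lock();
            try {
                events.add("drop");
            } finally {
                tableLock.writeLock().unlock();
            }
        });

        try {
            writer.start();
            writeStarted.await();             // Drop is issued while the write is running
            dropper.start();
            finishWrite.countDown();          // let the write finish
            writer.join();
            dropper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The drop is recorded only after the write has ended, so the lock delays the drop without
keeping the written data around for any useful length of time.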

A third option is to allow a "meta lock" - a lock not associated with any specific txn, that is
held for the duration of the TransactionBatch.  This sort of breaks the model (especially since
HIVE-12636).  Perhaps each batch can open one "extra" txn for internal purposes, just to acquire
this "meta lock".  No data will ever be tagged with this "extra" txn.
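
A minimal sketch of the "extra" txn idea, again as a toy model rather than Hive code. ToyBatch
and ToyTableLock are hypothetical; the sketch assumes the batch opens size + 1 txns, uses the
first one only to hold the lock, and releases it when the batch is closed.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types are hypothetical and not part of the Streaming API.
class MetaLockSketch {

    // Table-level lock state: number of shared holders and lock requests made.
    static class ToyTableLock {
        int sharedHolders = 0;
        int lockCalls = 0;

        void acquireShared() {
            sharedHolders++;
            lockCalls++;
        }

        void releaseShared() {
            sharedHolders--;
        }

        // An exclusive operation such as drop table can only run with no shared holders.
        boolean canDrop() {
            return sharedHolders == 0;
        }
    }

    static class ToyBatch {
        final ToyTableLock lock;
        final long metaTxnId;                         // the "extra" txn; no data is tagged with it
        final List<Long> dataTxnIds = new ArrayList<>();

        ToyBatch(ToyTableLock lock, long firstTxnId, int size) {
            this.lock = lock;
            this.metaTxnId = firstTxnId;
            for (int i = 1; i <= size; i++) {
                dataTxnIds.add(firstTxnId + i);       // txns that actually carry data
            }
            lock.acquireShared();                     // single "meta lock" for the whole batch
        }

        // Closing the batch ends the extra txn and releases the meta lock.
        void close() {
            lock.releaseShared();
        }
    }

    public static void main(String[] args) {
        ToyTableLock lock = new ToyTableLock();
        ToyBatch batch = new ToyBatch(lock, 100L, 10);
        System.out.println("lock calls for 10 txns: " + lock.lockCalls);
        System.out.println("drop allowed while batch is open: " + lock.canDrop());
        batch.close();
        System.out.println("drop allowed after close: " + lock.canDrop());
    }
}
```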

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
