hive-issues mailing list archives

From "Peter Vary (JIRA)" <>
Subject [jira] [Commented] (HIVE-21506) Memory based TxnHandler implementation
Date Tue, 26 Mar 2019 06:17:00 GMT


Peter Vary commented on HIVE-21506:

What do you think?

CC: [~gopalv], [~tlipcon], [~vgumashta]

> Memory based TxnHandler implementation
> --------------------------------------
>                 Key: HIVE-21506
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Transactions
>            Reporter: Peter Vary
>            Priority: Major
> The current TxnHandler implementations use the backend RDBMS to store all Hive lock and
> transaction data, so multiple TxnHandler instances can run simultaneously and serve
> requests. The continuous communication and locking done on the RDBMS side puts serious load
> on the backend database and also restricts the achievable throughput.
> If it is possible to have only a single active TxnHandler instance (with the current
> design, a single HMS instance), then we can provide much better performance by using only
> Java-based locking. We still have to store the committed write transactions in the RDBMS
> (or later in some other persistent storage), but other lock and transaction operations
> could remain memory only.
> The most important drawbacks of this solution are that we definitely lose scalability
> once a single TxnHandler instance is no longer able to serve the requests (see NameNode),
> and that we lose fault tolerance, in the sense that ongoing transactions have to be
> terminated when the TxnHandler fails. If these drawbacks are acceptable in certain
> situations, then we can provide better throughput for the users.
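
To make the proposal in the description concrete, here is a minimal sketch of the idea, not Hive's actual TxnHandler API. All names (InMemoryTxnSketch, CommitLog, tryLock) are hypothetical. It assumes a single active instance, keeps open transactions and locks in plain Java collections guarded by Java-level locking, and only calls out to a persistent store when a write transaction commits. As a simplification, any transaction that takes an exclusive lock is treated as a write transaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a memory-only transaction/lock manager. */
public class InMemoryTxnSketch {

    /** Assumed stand-in for the RDBMS (or other persistent storage). */
    public interface CommitLog {
        void persistCommit(long txnId);
    }

    private long nextTxnId = 1;
    // txnId -> true if the transaction took a write (exclusive) lock. Memory only.
    private final Map<Long, Boolean> openTxns = new HashMap<>();
    // resource name -> txnId currently holding the exclusive lock. Memory only.
    private final Map<String, Long> exclusiveLocks = new HashMap<>();
    private final CommitLog commitLog;

    public InMemoryTxnSketch(CommitLog commitLog) {
        this.commitLog = commitLog;
    }

    public synchronized long openTxn() {
        long id = nextTxnId++;
        openTxns.put(id, Boolean.FALSE);
        return id;
    }

    /** Returns true if the lock was acquired, false if another txn holds it. */
    public synchronized boolean tryLock(long txnId, String resource) {
        if (!openTxns.containsKey(txnId)) {
            throw new IllegalStateException("Unknown or closed txn " + txnId);
        }
        Long owner = exclusiveLocks.get(resource);
        if (owner != null && owner != txnId) {
            return false;
        }
        exclusiveLocks.put(resource, txnId);
        openTxns.put(txnId, Boolean.TRUE);
        return true;
    }

    /** Only committed write transactions reach the persistent store. */
    public synchronized void commit(long txnId) {
        Boolean hadWrites = openTxns.remove(txnId);
        if (hadWrites == null) {
            throw new IllegalStateException("Unknown or closed txn " + txnId);
        }
        if (hadWrites) {
            commitLog.persistCommit(txnId);
        }
        releaseLocks(txnId);
    }

    /** Aborts leave no trace in the persistent store. */
    public synchronized void abort(long txnId) {
        openTxns.remove(txnId);
        releaseLocks(txnId);
    }

    private void releaseLocks(long txnId) {
        exclusiveLocks.values().removeIf(owner -> owner == txnId);
    }

    public static void main(String[] args) {
        List<Long> persisted = new ArrayList<>();
        InMemoryTxnSketch mgr = new InMemoryTxnSketch(persisted::add);
        long writer = mgr.openTxn();
        long reader = mgr.openTxn();
        mgr.tryLock(writer, "db.tbl");
        mgr.commit(writer); // persisted
        mgr.commit(reader); // read-only, stays memory only
        System.out.println("Persisted commits: " + persisted);
    }
}
```

If this single instance fails, everything in openTxns and exclusiveLocks is lost, which corresponds to the fault tolerance drawback named in the description: ongoing transactions would have to be terminated.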

This message was sent by Atlassian JIRA
