hive-issues mailing list archives

From "Alan Gates (JIRA)" <>
Subject [jira] [Commented] (HIVE-14778) document threading model of Streaming API
Date Tue, 27 Sep 2016 17:16:20 GMT


Alan Gates commented on HIVE-14778:

These changes appear to say that streaming is single-threaded.  I don't think that's what
you mean, but I want to make sure I understand what you're saying, which I think is the following:

A single HiveEndPoint object cannot have more than one TransactionBatch open and
being committed to at the same time.  It also does not properly support multiple threads
committing in parallel, even within one TransactionBatch.  However, it does support multiple
threads as long as the commits are serialized.

Is that correct?

> document threading model of Streaming API
> -----------------------------------------
>                 Key: HIVE-14778
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog, Transactions
>    Affects Versions: 0.14.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-14778.patch
>   Original Estimate: 1h
>  Remaining Estimate: 1h
> The model is not obvious and needs to be documented properly.
> A StreamingConnection internally maintains 2 MetaStoreClient objects (each has 1 Thrift
> client for actual RPC). Let's call them "primary" and "heartbeat". Each TransactionBatch created
> from a given StreamingConnection, gets a reference to both of these MetaStoreClients.
> So the model is that there is at most 1 outstanding (not closed) TransactionBatch for
> any given StreamingConnection and for any given TransactionBatch there can be at most 2 threads
> accessing it concurrently. 1 thread calling TransactionBatch.heartbeat() (and nothing else)
> and the other calling all other methods.
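
The two-thread model in the description above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the Hive implementation:
FakeTransactionBatch, its owner-thread check, and the run() helper are hypothetical stand-ins
written for this example, and the 10 ms heartbeat interval is arbitrary. One scheduled thread
calls heartbeat() and nothing else, while the thread that opened the batch makes every other
call.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadingModelSketch {

    /** Hypothetical stand-in for a transaction batch; NOT the real Hive class. */
    public static final class FakeTransactionBatch {
        // Assumption for this sketch: the thread that opens the batch owns it.
        private final Thread owner = Thread.currentThread();
        private final AtomicInteger heartbeats = new AtomicInteger();
        private int committed;   // touched only by the owner thread
        private boolean closed;  // touched only by the owner thread

        /** The only method the heartbeat thread may call. */
        public void heartbeat() { heartbeats.incrementAndGet(); }

        public void commit() { checkOwner(); committed++; }

        public void close() { checkOwner(); closed = true; }

        public int committed() { return committed; }

        public int heartbeats() { return heartbeats.get(); }

        public boolean isClosed() { return closed; }

        private void checkOwner() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("only the owner thread may call this method");
            }
        }
    }

    /** Opens one batch, commits txns times on the calling thread, then closes it. */
    public static FakeTransactionBatch run(int txns) {
        FakeTransactionBatch batch = new FakeTransactionBatch();
        // Second thread: calls heartbeat() periodically and nothing else.
        ScheduledExecutorService heartbeater = Executors.newSingleThreadScheduledExecutor();
        heartbeater.scheduleAtFixedRate(batch::heartbeat, 0, 10, TimeUnit.MILLISECONDS);
        try {
            for (int i = 0; i < txns; i++) {
                batch.commit(); // all non-heartbeat calls stay on this thread
            }
        } finally {
            // Stop heartbeating before closing the batch.
            heartbeater.shutdown();
            try {
                heartbeater.awaitTermination(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            batch.close();
        }
        return batch;
    }

    public static void main(String[] args) {
        FakeTransactionBatch batch = run(3);
        System.out.println("committed=" + batch.committed() + " closed=" + batch.isClosed());
    }
}
```

The owner-thread check is stricter than the description requires; it is there only to make
the "one thread for everything except heartbeat()" rule visible in the sketch.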

This message was sent by Atlassian JIRA
