hbase-issues mailing list archives

From "Liyin Tang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10659) [89-fb] Optimize the threading model in HBase write path
Date Tue, 04 Mar 2014 05:29:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919033#comment-13919033 ]

Liyin Tang commented on HBASE-10659:
------------------------------------

1) Since updating the memstore is much faster than syncing the HLog, one memstore-update-thread seems
to be sufficient. Alternatively, we can make it configurable so that each HLogSyncer thread has a corresponding
memstore-update-thread.

2) The HLogSyncer thread will batch multiple transactions from different IPC writer threads
into a group commit, and then sync this group commit into the HLog stream. The memstore-update-thread
will then take this group commit and update the corresponding memstore in sequence id order.
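
The hand-off described in 2) could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the 89-fb branch: the class GroupCommitSketch, the Txn type, and the list standing in for the memstore are all hypothetical, and the actual HLog append/sync is left as a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of the syncer -> memstore-update hand-off; not HBase code. */
class GroupCommitSketch {
    /** One transaction: an edit plus the sequence id assigned at sync time. */
    static final class Txn {
        final String edit;
        long seqId = -1;
        Txn(String edit) { this.edit = edit; }
    }

    // Touched only by the single syncer thread, so no synchronization is needed.
    private long nextSeqId = 1;

    // Group commits that have been synced and are waiting for the memstore update.
    private final BlockingQueue<List<Txn>> synced = new LinkedBlockingQueue<>();

    // Stand-in for the real memstore; touched only by the single update thread.
    final List<String> memstore = new ArrayList<>();

    /** Syncer step: number the batch, "sync" it, then hand it off as one group commit. */
    void syncGroup(List<Txn> batch) throws InterruptedException {
        for (Txn t : batch) {
            t.seqId = nextSeqId++;
        }
        // A real implementation would append the batch to the log and sync it here.
        synced.put(batch);
    }

    /** Memstore-update step: take the next group commit and apply it in order. */
    void applyNextGroup() throws InterruptedException {
        List<Txn> batch = synced.take();
        for (Txn t : batch) {
            memstore.add(t.seqId + ":" + t.edit);
        }
    }
}
```

With one syncer and one update thread, the FIFO queue preserves the order in which sequence ids were assigned, so the memstore sees the transactions in sequence id order without any extra sorting.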

> [89-fb] Optimize the threading model in HBase write path
> --------------------------------------------------------
>
>                 Key: HBASE-10659
>                 URL: https://issues.apache.org/jira/browse/HBASE-10659
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Liyin Tang
>
> Recently, we have built multiple prototypes to optimize the HBase (0.89) write path. Based
on the simulator results, the following model is able to achieve much higher overall
throughput with fewer threads.
> IPC Writer Threads Pool: 
> IPC handler threads will prepare all Put requests and append the WALEdit, as one transaction,
to a concurrent collection under a read lock, and then simply return.
> HLogSyncer Thread:
> Each HLogSyncer thread corresponds to one HLog stream. It swaps the concurrent collection
under a write lock, then iterates over all the elements in the previous concurrent collection,
generates the sequence id for each transaction, and writes them to the HLog. After the HLog sync is done,
it appends these transactions as a batch to a blocking queue.
> Memstore Update Thread:
> The memstore update thread will poll the blocking queue and update the memstore for each
transaction, using the sequence id for MVCC. Once the memstore update is done, it dispatches to
the responder thread pool to return to the client.
> Responder Thread Pool:
> The responder thread pool will return the RPC calls in parallel.
> We are still evaluating this model and will share more results/numbers once they are ready.
We would really appreciate any comments in advance!
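
The read-lock append and write-lock swap between the IPC writer threads and the HLogSyncer thread might look something like the sketch below. The class name SwapBuffer and its methods are hypothetical, chosen only for illustration; the choice of ConcurrentLinkedQueue as the concurrent collection is also an assumption, since the description does not name one.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of the swap-able transaction buffer; not HBase code. */
class SwapBuffer<T> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Assumed collection type; replaced wholesale on every swap.
    private Queue<T> current = new ConcurrentLinkedQueue<>();

    /** Called by many writer threads; the shared read lock lets them append concurrently. */
    void append(T txn) {
        lock.readLock().lock();
        try {
            current.add(txn);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Called by the syncer thread; the write lock keeps appenders out during the swap. */
    Queue<T> swap() {
        lock.writeLock().lock();
        try {
            Queue<T> previous = current;
            current = new ConcurrentLinkedQueue<>();
            return previous;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Once swap() returns, no writer can still be appending to the returned collection, so the syncer can iterate over it without holding any lock.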



--
This message was sent by Atlassian JIRA
(v6.2#6252)
