beam-commits mailing list archives

From "Jingsong Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-1531) Support dynamic work rebalancing for HBaseIO
Date Thu, 29 Jun 2017 02:59:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067629#comment-16067629 ]

Jingsong Lee commented on BEAM-1531:
------------------------------------

Of course the embedded HBase server is the better option, since it is a complete mini HBase cluster.
So I only changed the testReadingSplitAtFraction test (which only involves Scanner.iterator); the other
tests remain unchanged.
I think there is a tradeoff here between test accuracy and test speed. For the testReadingSplitAtFraction
test, if we can substantially improve the speed and also have a good mock (one that queries by startRow
and stopRow), we can still achieve the purpose of the test, which is to test HBaseIO.splitAtFraction.
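
To make the idea of a mock that queries by startRow and stopRow concrete, here is a minimal sketch. It is not the code from the patch: the class name FakeRangeTable and its methods are hypothetical, it uses String row keys instead of byte arrays, and it assumes the usual HBase scan convention that the start row is inclusive, the stop row is exclusive, and an empty key means "unbounded".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a table that is scanned by row-key range. */
final class FakeRangeTable {
  // Rows are kept sorted by key, the way HBase keeps rows sorted by row key.
  // Assumption: String ordering is used here as a simplification of byte ordering.
  private final NavigableMap<String, String> rows = new TreeMap<>();

  void put(String rowKey, String value) {
    rows.put(rowKey, value);
  }

  /**
   * Returns an iterator over the row keys in [startRow, stopRow).
   * An empty string on either side means that side is unbounded.
   */
  Iterator<String> scan(String startRow, String stopRow) {
    boolean hasStart = !startRow.isEmpty();
    boolean hasStop = !stopRow.isEmpty();
    // An empty or inverted range yields no rows instead of an exception.
    if (hasStart && hasStop && startRow.compareTo(stopRow) >= 0) {
      return Collections.emptyIterator();
    }
    NavigableMap<String, String> view = rows;
    if (hasStart) {
      view = view.tailMap(startRow, true); // start row is inclusive
    }
    if (hasStop) {
      view = view.headMap(stopRow, false); // stop row is exclusive
    }
    // Copy the keys so the iterator is not affected by later puts.
    return new ArrayList<>(view.keySet()).iterator();
  }
}
```

With rows "a", "b", "c" and "d" in the table, scan("b", "d") iterates over "b" and "c", and scan("", "") iterates over all four. A split test can then check that the primary and residual ranges together cover the same rows as the original range.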

I ran some tests to understand how HBaseTestingUtility is implemented. It starts a complete
miniHBaseCluster and miniZKCluster, and the JVM has 8000+ classes and 300+ threads while it runs,
so it is very slow. I have not looked into it in detail, but probably the work that would normally
be spread across a cluster is all being done by a single JVM, which makes it run very slowly.
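
For anyone who wants to reproduce this kind of measurement, the sketch below shows one way to read the thread and class counts from inside a running JVM with the standard java.lang.management API. This is an assumption about how such numbers could be collected, not a description of how the figures above were obtained, and the class name JvmFootprint is hypothetical.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper that reports how heavy the current JVM is. */
final class JvmFootprint {
  /** Number of live threads (daemon and non-daemon) in this JVM. */
  static int liveThreads() {
    return ManagementFactory.getThreadMXBean().getThreadCount();
  }

  /** Number of classes currently loaded in this JVM. */
  static int loadedClasses() {
    return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
  }

  public static void main(String[] args) {
    // Call these before and after starting the mini cluster to compare.
    System.out.println("threads=" + liveThreads() + " classes=" + loadedClasses());
  }
}
```

Calling the two methods from a test's setup and teardown would show how much the mini cluster adds on top of a plain test JVM.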

> Support dynamic work rebalancing for HBaseIO
> --------------------------------------------
>
>                 Key: BEAM-1531
>                 URL: https://issues.apache.org/jira/browse/BEAM-1531
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Ismaël Mejía
>            Assignee: Ismaël Mejía
>            Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
