hbase-issues mailing list archives

From "Yuki Tawara (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-20361) Non-successive TableInputSplits may wrongly be merged by auto balancing feature
Date Sun, 08 Apr 2018 16:53:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-20361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429808#comment-16429808 ]

Yuki Tawara commented on HBASE-20361:
-------------------------------------

Hi, [~yuzhihong@gmail.com]

Thank you for your comments!
I have addressed your comments (renamed the method, added Javadoc for the classes) and made
a few trivial changes (made the classes private, fixed a typo in the subject).
Could you review my second patch?

> Non-successive TableInputSplits may wrongly be merged by auto balancing feature
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-20361
>                 URL: https://issues.apache.org/jira/browse/HBASE-20361
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>            Reporter: Yuki Tawara
>            Priority: Major
>         Attachments: HBASE-20361.master.001.patch, HBASE-20361.master.002.patch
>
>
> The TableInputFormatBase class lets users exclude specific splits from the list returned by TableInputFormatBase#getSplits by overriding TableInputFormatBase#includeRegionInSplit.
> It also offers a feature called "auto balancing", which mitigates data skew by splitting large splits and merging small ones.
> If a user overrides TableInputFormatBase#includeRegionInSplit, the i-th split and the (i+1)-th split may not be successive (that is, the i-th split's end key is smaller than the (i+1)-th split's start key).
> If the user also enables auto balancing, non-successive splits can be merged, which means the excluded splits lying between the merged splits are "revived".
> To avoid such cases, we should not merge non-successive splits.
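
For illustration only, the rule described above (merge two neighbouring splits only when they are successive) might be sketched as follows. This is not the code from the attached patches and it does not use the HBase API: KeyRange, isSuccessive and mergeSuccessive are hypothetical names, row keys are plain Strings instead of byte arrays, and the size thresholds that auto balancing applies are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the HBASE-20361 patch.
final class SplitMergeSketch {

    /** Hypothetical stand-in for a table split: the key range [startRow, endRow). */
    static final class KeyRange {
        final String startRow;
        final String endRow; // assumption: "" means "end of table"

        KeyRange(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }
    }

    /** Two splits are successive when the left one ends exactly where the right one starts. */
    static boolean isSuccessive(KeyRange left, KeyRange right) {
        // A split that runs to the end of the table has no successor.
        return !left.endRow.isEmpty() && left.endRow.equals(right.startRow);
    }

    /**
     * Merges neighbouring splits of a list sorted by start key, but only when
     * they are successive, so a gap left by an excluded region is never covered.
     */
    static List<KeyRange> mergeSuccessive(List<KeyRange> sortedSplits) {
        List<KeyRange> merged = new ArrayList<>();
        KeyRange current = null;
        for (KeyRange next : sortedSplits) {
            if (current != null && isSuccessive(current, next)) {
                // Extend the current range to the end of the next split.
                current = new KeyRange(current.startRow, next.endRow);
            } else {
                if (current != null) {
                    merged.add(current);
                }
                current = next;
            }
        }
        if (current != null) {
            merged.add(current);
        }
        return merged;
    }
}
```

With splits [a, b), [b, c) and [d, e), where the region [c, d) was excluded, this sketch merges the first two into [a, c) and leaves [d, e) alone, so the excluded range is not read.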



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
