hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region
Date Mon, 22 May 2017 22:00:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020265#comment-16020265 ]

Ted Yu commented on HBASE-18090:
--------------------------------

For the new TableMapReduceUtil#initTableSnapshotMapJob method (in the mapred package), please
add an @param entry for numSplitsPerRegion to the Javadoc.
{code}
+    } else if (RegionSplitter.HexStringSplit.class.getSimpleName().equals(conf.get(SPLIT_ALGO))) {
+      splitAlgo = new RegionSplitter.HexStringSplit();
+    }
{code}
Add an else block to handle the case where the split algorithm is not specified.
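One possible shape for that else branch, as a rough sketch only. The nested types below are hypothetical stand-ins for RegionSplitter.SplitAlgorithm and its implementations so the snippet compiles without HBase on the classpath. The UniformSplit branch, the choice to return null for an unset value, and the choice to throw for an unrecognized name are all assumptions, not what the patch does.

```java
final class SplitAlgoResolver {
    // Hypothetical stand-ins for RegionSplitter.SplitAlgorithm and friends.
    interface SplitAlgorithm {}
    static final class UniformSplit implements SplitAlgorithm {}
    static final class HexStringSplit implements SplitAlgorithm {}

    /**
     * Maps the configured name (what conf.get(SPLIT_ALGO) would return in the
     * patch) to a split algorithm instance.
     *
     * @param name configured simple class name, or null if nothing was set
     * @return the matching algorithm, or null if none was configured
     * @throws IllegalArgumentException if the name is set but not recognized
     */
    static SplitAlgorithm resolve(String name) {
        if (name == null) {
            // Assumption: an unset value means "keep one split per region".
            return null;
        } else if (UniformSplit.class.getSimpleName().equals(name)) {
            return new UniformSplit();
        } else if (HexStringSplit.class.getSimpleName().equals(name)) {
            return new HexStringSplit();
        } else {
            // The explicit else: fail loudly instead of silently ignoring
            // a misspelled or unsupported algorithm name.
            throw new IllegalArgumentException("Unknown split algorithm: " + name);
        }
    }
}
```

With this shape a typo in the configured name surfaces at job setup rather than as an unexpectedly unsplit job.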
{code}
+    if (splitAlgo == null && numSplitsPerRegion > 1) {
+      throw new IllegalArgumentException("Split algo can't be null, numSplits must be >= 1!");
{code}
The condition seems to imply that numSplits can be 1 if splitAlgo is null. Please modify the
error message to be more precise.
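To illustrate what a more precise message could look like, here is a minimal sketch that splits the two concerns into separate checks. The class and method names are hypothetical, the splitAlgo parameter is typed as Object only to keep the snippet free of HBase dependencies, and the extra lower-bound check on numSplitsPerRegion is an assumption that is not part of the quoted patch.

```java
final class SplitConfigValidator {
    /**
     * Validates the split settings before input splits are computed.
     *
     * @param splitAlgo          the resolved split algorithm, or null if none was configured
     * @param numSplitsPerRegion requested number of splits per region
     */
    static void validate(Object splitAlgo, int numSplitsPerRegion) {
        // Assumed extra check: reject zero or negative split counts outright.
        if (numSplitsPerRegion < 1) {
            throw new IllegalArgumentException(
                "numSplitsPerRegion must be >= 1, got " + numSplitsPerRegion);
        }
        // Same condition as the patch, but the message states exactly when
        // a split algorithm is required.
        if (splitAlgo == null && numSplitsPerRegion > 1) {
            throw new IllegalArgumentException(
                "A split algorithm must be specified when numSplitsPerRegion > 1, got "
                    + numSplitsPerRegion);
        }
    }
}
```

Here a null splitAlgo with numSplitsPerRegion equal to 1 passes, which matches what the condition in the patch already allows.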

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --------------------------------------------------------------------------
>
>                 Key: HBASE-18090
>                 URL: https://issues.apache.org/jira/browse/HBASE-18090
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 1.4.0
>            Reporter: Mikhail Antonov
>         Attachments: HBASE-18090-branch-1.3-v1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. This places
> an unnecessary restriction: the region layout of the original table must take into account
> the processing resources available to the MR job. Allowing multiple mappers per region
> (assuming reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
