hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16894) Create more than 1 split per region, generalize HBASE-12590
Date Tue, 03 Oct 2017 01:54:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189169#comment-16189169 ]

Hadoop QA commented on HBASE-16894:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 27s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  7s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 33s{color} | {color:green} branch-1 passed with JDK v1.8.0_144 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 34s{color} | {color:green} branch-1 passed with JDK v1.7.0_151 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 56s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 17s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 41s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 57s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 25s{color} | {color:green} branch-1 passed with JDK v1.8.0_144 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 34s{color} | {color:green} branch-1 passed with JDK v1.7.0_151 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 32s{color} | {color:green} the patch passed with JDK v1.8.0_144 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 35s{color} | {color:green} the patch passed with JDK v1.7.0_151 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 26s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 18m 10s{color} | {color:red} The patch causes 178 errors with Hadoop v3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} the patch passed with JDK v1.8.0_144 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 34s{color} | {color:green} the patch passed with JDK v1.7.0_151 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}393m  8s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  3m 49s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}432m 36s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.snapshot.TestRegionSnapshotTask |
|   | hadoop.hbase.client.TestAdmin2 |
|   | hadoop.hbase.master.TestMasterBalanceThrottling |
|   | hadoop.hbase.regionserver.TestRecoveredEdits |
|   | hadoop.hbase.security.visibility.TestVisibilityLabelsWithACL |
| Timed out junit tests | org.apache.hadoop.hbase.replication.TestReplicationKillSlaveRS |
|   | org.apache.hadoop.hbase.regionserver.compactions.TestFIFOCompactionPolicy |
|   | org.apache.hadoop.hbase.replication.regionserver.TestReplicator |
|   | org.apache.hadoop.hbase.replication.TestReplicationChangingPeerRegionservers |
|   | org.apache.hadoop.hbase.master.TestMasterFailover |
|   | org.apache.hadoop.hbase.TestHBaseTestingUtility |
|   | org.apache.hadoop.hbase.mapred.TestTableInputFormat |
|   | org.apache.hadoop.hbase.TestFullLogReconstruction |
|   | org.apache.hadoop.hbase.mapred.TestTableMapReduceUtil |
|   | org.apache.hadoop.hbase.replication.regionserver.TestWALEntryStream |
|   | org.apache.hadoop.hbase.regionserver.TestRegionReplicas |
|   | org.apache.hadoop.hbase.regionserver.TestRegionServerAbort |
|   | org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort |
|   | org.apache.hadoop.hbase.client.TestMetaWithReplicas |
|   | org.apache.hadoop.hbase.master.TestTableLockManager |
|   | org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS |
|   | org.apache.hadoop.hbase.replication.TestReplicationStatus |
|   | org.apache.hadoop.hbase.TestClusterBootOrder |
|   | org.apache.hadoop.hbase.util.TestMiniClusterLoadEncoded |
|   | org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential |
|   | org.apache.hadoop.hbase.regionserver.wal.TestLogRolling |
|   | org.apache.hadoop.hbase.master.TestMasterShutdown |
|   | org.apache.hadoop.hbase.regionserver.throttle.TestCompactionWithThroughputController |
|   | org.apache.hadoop.hbase.replication.regionserver.TestReplicationSink |
|   | org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster |
|   | org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush |
|   | org.apache.hadoop.hbase.regionserver.TestJoinedScanners |
|   | org.apache.hadoop.hbase.snapshot.TestSecureExportSnapshot |
|   | org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController |
|   | org.apache.hadoop.hbase.TestAcidGuarantees |
|   | org.apache.hadoop.hbase.TestNamespace |
|   | org.apache.hadoop.hbase.replication.TestReplicationSyncUpTool |
|   | org.apache.hadoop.hbase.replication.multiwal.TestReplicationSyncUpToolWithMultipleWAL |
|   | org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster |
|   | org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter |
|   | org.apache.hadoop.hbase.snapshot.TestExportSnapshot |
|   | org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes |
|   | org.apache.hadoop.hbase.TestInfoServers |
|   | org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithTwoDifferentZKClusters |
|   | org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase |
|   | org.apache.hadoop.hbase.wal.TestWALOpenAfterDNRollingStart |
|   | org.apache.hadoop.hbase.replication.TestReplicationWithTags |
|   | org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpointNoMaster |
|   | org.apache.hadoop.hbase.master.TestRestartCluster |
|   | org.apache.hadoop.hbase.TestIOFencing |
|   | org.apache.hadoop.hbase.replication.TestReplicationSmallTests |
|   | org.apache.hadoop.hbase.regionserver.TestHRegion |
|   | org.apache.hadoop.hbase.util.TestMiniClusterLoadParallel |
|   | org.apache.hadoop.hbase.snapshot.TestRestoreFlushSnapshotFromClient |
|   | org.apache.hadoop.hbase.replication.TestReplicationKillMasterRS |
|   | org.apache.hadoop.hbase.mapreduce.TestTimeRangeMapRed |
|   | org.apache.hadoop.hbase.io.encoding.TestLoadAndSwitchEncodeOnDisk |
|   | org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite |
|   | org.apache.hadoop.hbase.security.visibility.TestDefaultScanLabelGeneratorStack |
|   | org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove |
|   | org.apache.hadoop.hbase.replication.TestReplicationKillMasterRSCompressed |
|   | org.apache.hadoop.hbase.TestHColumnDescriptorDefaultVersions |
|   | org.apache.hadoop.hbase.regionserver.TestClusterId |
|   | org.apache.hadoop.hbase.regionserver.TestRegionServerHostname |
|   | org.apache.hadoop.hbase.fs.TestBlockReorder |
|   | org.apache.hadoop.hbase.client.TestTableSnapshotScanner |
|   | org.apache.hadoop.hbase.snapshot.TestFlushSnapshotFromClient |
|   | org.apache.hadoop.hbase.TestMetaTableAccessor |
|   | org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation |
|   | org.apache.hadoop.hbase.replication.TestMultiSlaveReplication |
|   | org.apache.hadoop.hbase.replication.TestPerTableCFReplication |
|   | org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover |
|   | org.apache.hadoop.hbase.master.handler.TestCreateTableHandler |
|   | org.apache.hadoop.hbase.master.handler.TestEnableTableHandler |
|   | org.apache.hadoop.hbase.wal.TestWALFactory |
|   | org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat |
|   | org.apache.hadoop.hbase.replication.TestSerialReplication |
|   | org.apache.hadoop.hbase.replication.TestMasterReplication |
|   | org.apache.hadoop.hbase.tool.TestCanaryTool |
|   | org.apache.hadoop.hbase.io.TestFileLink |
|   | org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat |
|   | org.apache.hadoop.hbase.util.TestFSUtils |
|   | org.apache.hadoop.hbase.wal.TestWALSplit |
|   | org.apache.hadoop.hbase.io.encoding.TestEncodedSeekers |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f1cc2c |
| JIRA Issue | HBASE-16894 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12890004/HBASE-16894.branch-1.patch |
| Optional Tests |  asflicense  shadedjars  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux abb1c253cda6 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-1 / 6117352 |
| Default Java | 1.7.0_151 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_144 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_151 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/8896/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/8896/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/8896/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> Create more than 1 split per region, generalize HBASE-12590
> -----------------------------------------------------------
>
>                 Key: HBASE-16894
>                 URL: https://issues.apache.org/jira/browse/HBASE-16894
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.0.0-alpha-2
>            Reporter: Enis Soztutar
>            Assignee: Yi Liang
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16894.branch-1.patch, HBASE-16894.master.patch, HBASE-16894-V2-master.patch, HBASE-16894-V3-master.patch, ImplementaionAndSomeQuestion.docx
>
>
> A common request from users is better control over how many map tasks are created per region. Right now, it is always 1 region = 1 input split = 1 map task. The same goes for Spark, since it uses the TableInputFormat (TIF). With region sizes as large as 50 GB, it is desirable to be able to create more than 1 split per region.
> HBASE-12590 adds a config property for MR jobs to be able to handle skew in region sizes. The algorithm is roughly:
> {code}
> If (region size >= average size * ratio): cut the region into two MR input splits
> If (average size <= region size < average size * ratio): one region as one MR input split
> If (sum of several contiguous regions' sizes < average size * ratio): combine these regions into one MR input split
> {code}
> Although we can set the data skew ratio to 0.5 or so to abuse HBASE-12590 into creating more than 1 split per region, this is not ideal, and the patch as it stands cannot go further: for example, we cannot create more than 2 tasks per region.
> If we want to fix this properly, we should extend the approach in HBASE-12590 so that the client can specify the desired number of mappers, or the desired split size, and the TIF generates the splits based on the current region sizes, very similar to the algorithm in HBASE-12590 but in a more generic way. This would also eliminate the hand tuning of the data skew ratio.
> We can also consider the guidepost approach that Phoenix has in its stats table, which is used for exactly this purpose. Right now, the region can be split into powers of two, assuming uniform distribution within the region.
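
The generalization proposed in the description (the client supplies a desired split size or a desired number of mappers, and the split count per region follows from the region sizes) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in any of the attached patches: the class `SplitPlanner` and all of its method names are hypothetical, region sizes are assumed to be already known in bytes, and merging of small contiguous regions (the third rule of HBASE-12590) is left out.

```java
/** Hypothetical helper for illustration only; not part of HBase or of any HBASE-16894 patch. */
final class SplitPlanner {

  private SplitPlanner() {}

  /** Splits for one region: ceil(regionSize / desiredSplitSize), never fewer than 1. */
  static int splitsForRegion(long regionSizeBytes, long desiredSplitSizeBytes) {
    if (desiredSplitSizeBytes <= 0) {
      throw new IllegalArgumentException("desiredSplitSizeBytes must be positive");
    }
    if (regionSizeBytes <= 0) {
      return 1; // an empty or unmeasured region still needs one split
    }
    // Ceiling division written so that it cannot overflow.
    long n = regionSizeBytes / desiredSplitSizeBytes
        + (regionSizeBytes % desiredSplitSizeBytes == 0 ? 0 : 1);
    return (int) Math.min(n, Integer.MAX_VALUE);
  }

  /** Turns a desired mapper count into a split size: ceil(totalSize / desiredMappers), at least 1. */
  static long splitSizeForMappers(long[] regionSizesBytes, int desiredMappers) {
    if (desiredMappers <= 0) {
      throw new IllegalArgumentException("desiredMappers must be positive");
    }
    long total = 0;
    for (long size : regionSizesBytes) {
      total += Math.max(size, 0L); // treat unknown (negative) sizes as empty
    }
    long splitSize = total / desiredMappers + (total % desiredMappers == 0 ? 0 : 1);
    return Math.max(splitSize, 1L);
  }

  /** Number of splits for every region, in the same order as the input sizes. */
  static int[] plan(long[] regionSizesBytes, long desiredSplitSizeBytes) {
    int[] counts = new int[regionSizesBytes.length];
    for (int i = 0; i < regionSizesBytes.length; i++) {
      counts[i] = splitsForRegion(regionSizesBytes[i], desiredSplitSizeBytes);
    }
    return counts;
  }
}
```

With regions of 50 GB, 10 GB and 1 GB and a desired split size of 10 GB, `plan` yields 5, 1 and 1 splits, so the count is no longer capped at 2 per region. Because every region gets at least one split, the total can exceed the requested number of mappers when many regions are small; a fuller version would combine neighbouring small regions as HBASE-12590 does.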



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
