hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18164) Much faster locality cost function and candidate generator
Date Mon, 26 Jun 2017 18:26:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063562#comment-16063562 ]

Hadoop QA commented on HBASE-18164:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 31s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 54s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 43s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 34s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 7m 25s {color} | {color:red} hbase-server in master has 10 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 69m 11s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha3. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 8m 6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 139m 21s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 55s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 246m 47s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestRegionReplicaFailover |
|   | hadoop.hbase.security.access.TestCoprocessorWhitelistMasterObserver |
| Timed out junit tests | org.apache.hadoop.hbase.mapred.TestTableInputFormat |
|   | org.apache.hadoop.hbase.mapred.TestTableMapReduce |
|   | org.apache.hadoop.hbase.TestIOFencing |
|   | org.apache.hadoop.hbase.mapred.TestMultiTableSnapshotInputFormat |
|   | org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd |
|   | org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874507/HBASE-18164-07.patch |
| JIRA Issue | HBASE-18164 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux ba01dd1a8c6b 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / 2d781aa |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7330/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7330/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs |  https://builds.apache.org/job/PreCommit-HBASE-Build/7330/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7330/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7330/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Much faster locality cost function and candidate generator
> ----------------------------------------------------------
>
>                 Key: HBASE-18164
>                 URL: https://issues.apache.org/jira/browse/HBASE-18164
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>            Reporter: Kahlil Oppenheimer
>            Assignee: Kahlil Oppenheimer
>            Priority: Critical
>             Fix For: 3.0.0, 1.4.0, 2.0.0-alpha-2
>
>         Attachments: HBASE-18164-00.patch, HBASE-18164-01.patch, HBASE-18164-02.patch, HBASE-18164-04.patch, HBASE-18164-05.patch, HBASE-18164-06.patch, HBASE-18164-07.patch, HBASE-18164-08.patch
>
>
> We noticed that the stochastic load balancer was not scaling well with cluster
size. Specifically, on our smaller clusters (~17 tables, ~12 region servers, ~5k regions),
the balancer considers ~100,000 cluster configurations in 60s per balancer run, but only ~5,000
per 60s on our bigger clusters (~82 tables, ~160 region servers, ~13k regions).
> As a result, our bigger clusters cannot converge on balance as quickly for
things like table skew and region load, because the balancer does not have enough time to
"think".
> We have re-written the locality cost function to be incremental, meaning it only recomputes
cost based on the most recent region move proposed by the balancer, rather than recomputing
the cost across all regions/servers every iteration.
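
The incremental update described above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the patch: the class name `IncrementalLocalityCost`, the `locality` table layout, and the simple "one minus average locality" normalization are all assumptions made for the example.

```python
class IncrementalLocalityCost:
    """Hypothetical sketch: keeps a running sum of per-region locality."""

    def __init__(self, locality, assignment):
        # locality[region][server] is the fraction (0.0 to 1.0) of the region's
        # data that is local to that server; assignment[region] is the index of
        # the server currently hosting the region. Both layouts are assumed.
        self.locality = locality
        self.assignment = list(assignment)
        # The full pass over all regions happens once, up front.
        self.total = sum(locality[r][s] for r, s in enumerate(self.assignment))

    def move_region(self, region, new_server):
        # Only the moved region's contribution changes, so a proposed move
        # costs O(1) instead of a pass over every region and server.
        old_server = self.assignment[region]
        self.total += (self.locality[region][new_server]
                       - self.locality[region][old_server])
        self.assignment[region] = new_server

    def cost(self):
        # Placeholder normalization: 0.0 when every region is fully local.
        return 1.0 - self.total / len(self.assignment)
```

After any sequence of `move_region` calls, `total` should equal what a full recomputation over the current assignment would give; that equality is the property the incremental approach relies on.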
> Further, we also cache the locality of every region on every server at the beginning
of the balancer's execution for both the LocalityBasedCostFunction and the LocalityCandidateGenerator
to reference. This way, they need not collect all HDFS blocks of every region at each iteration
of the balancer.
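
The up-front caching step above might look something like the sketch below. It is only an illustration: `build_locality_cache` and the `block_hosts` input are hypothetical names, and it assumes equally sized blocks for simplicity, which the real balancer need not do.

```python
def build_locality_cache(regions, servers, block_hosts):
    """Compute every (region, server) locality once, before balancing starts.

    block_hosts[region] is assumed to be a list with one set of host names
    per HDFS block of that region (a hypothetical input shape).
    """
    cache = {}
    for region in regions:
        blocks = block_hosts[region]
        for server in servers:
            # Count the blocks that have a replica on this server.
            local = sum(1 for hosts in blocks if server in hosts)
            # A region with no blocks is treated as having zero locality.
            cache[(region, server)] = local / len(blocks) if blocks else 0.0
    return cache
```

During the balancer run, both the cost function and the candidate generator could then read `cache[(region, server)]` instead of collecting block locations again at each iteration.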
> The changes have been running in all 6 of our production clusters and all 4 QA clusters
without issue. The speed improvements we noticed are massive. Our big clusters now consider
20x more cluster configurations.
> One design decision I made is to define locality cost as the difference between the
best locality that is possible given the current cluster state and the currently measured
locality. The old computation measured locality cost as the difference between
the current locality and 100% locality; the new computation instead takes the difference
between the current locality for a given region and the best locality for that region in the
cluster.
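
To make the difference between the two cost definitions concrete, here is a small sketch. The function names and the plain averaging over regions are assumptions for illustration; the description does not say how the patch normalizes or weights the cost.

```python
def cost_vs_full_locality(locality, assignment):
    # Old-style cost (as described above): distance of each region's current
    # locality from 100%, averaged over regions.
    return sum(1.0 - locality[r][s]
               for r, s in enumerate(assignment)) / len(assignment)


def cost_vs_best_locality(locality, assignment):
    # New-style cost (as described above): distance of each region's current
    # locality from the best locality any server in the cluster offers it.
    return sum(max(locality[r]) - locality[r][s]
               for r, s in enumerate(assignment)) / len(assignment)


# locality[region][server]; region 0 can reach at most 0.5 anywhere.
locality = [[0.5, 0.25], [1.0, 0.0]]
assignment = [0, 1]  # region 0 on server 0, region 1 on server 1
print(cost_vs_full_locality(locality, assignment))  # 0.75
print(cost_vs_best_locality(locality, assignment))  # 0.5
```

In this example region 0 already sits on its best server, so under the new definition it contributes nothing to the cost, whereas the old definition would keep penalizing it for locality that no move can recover.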



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
