hadoop-common-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14312) RetryInvocationHandler may report ANN as SNN in messages.
Date Sat, 15 Apr 2017 06:41:41 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969829#comment-15969829 ]

Hadoop QA commented on HADOOP-14312:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 46s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 37s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 1 new + 106 unchanged - 0 fixed = 107 total (was 106) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 53s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 50s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ac17dc |
| JIRA Issue | HADOOP-14312 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863537/HADOOP-14312.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 1d8dbdcbc0e9 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 25ac447 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/12108/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/12108/artifact/patchprocess/whitespace-eol.txt |
| unit | https://builds.apache.org/job/PreCommit-HADOOP-Build/12108/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12108/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12108/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RetryInvocationHandler may report ANN as SNN in messages.
> ---------------------------------------------------------
>
>                 Key: HADOOP-14312
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14312
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HADOOP-14312.001.patch
>
>
> When multiple threads use the same DFSClient to make RPC calls, they may report an incorrect NN host name in messages like
>  INFO [pool-3-thread-13] retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(148)) - Exception while invoking delete of class ClientNamenodeProtocolTranslatorPB over hdpb-nn0001.prn.parsec.apple.com/*a.b.c.d*:8020. Trying to fail over immediately.
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby. Visit https://s.apache.org/sbnn-error
> where *a.b.c.d* is the active NN, which can mislead users into thinking that failover is not behaving correctly.
> The reason is that the ProxyDescriptor field of RetryInvocationHandler may be shared by the threads making the RPC calls, so a failover performed by one thread may be visible to the other threads when they report this kind of message.
> As an example:
> # multiple threads start with the same SNN to make RPC calls,
> # all threads discover that a failover is needed,
> # thread X fails over first and changes the ProxyDescriptor's proxyInfo to the ANN,
> # the other threads report the above message using the proxyInfo changed by thread X, and so report the ANN instead of the SNN in the message.
> Some details:
> RetryInvocationHandler does the following when failing over:
> {code}
>   synchronized void failover(long expectedFailoverCount, Method method,
>       int callId) {
>     // Make sure that concurrent failed invocations only cause a single
>     // actual failover.
>     if (failoverCount == expectedFailoverCount) {
>       fpp.performFailover(proxyInfo.proxy);
>       failoverCount++;
>     } else {
>       LOG.warn("A failover has occurred since the start of call #" + callId
>           + " " + proxyInfo.getString(method.getName()));
>     }
>     proxyInfo = fpp.getProxy();
>   }
> {code}
> which changes the proxyInfo in the ProxyDescriptor.
> Meanwhile, the log method below builds its message from the ProxyDescriptor's current proxyInfo:
> {code}
> private void log(final Method method, final boolean isFailover,
>       final int failovers, final long delay, final Exception ex) {
> ......
>    final StringBuilder b = new StringBuilder()
>         .append(ex + ", while invoking ")
>         .append(proxyDescriptor.getProxyInfo().getString(method.getName()));
>     if (failovers > 0) {
>       b.append(" after ").append(failovers).append(" failover attempts");
>     }
>     b.append(isFailover? ". Trying to failover ": ". Retrying ");
>     b.append(delay > 0? "after sleeping for " + delay + "ms.": "immediately.");
> {code}
> and so does the {{handleException}} method:
> {code}
>         if (LOG.isDebugEnabled()) {
>           LOG.debug("Exception while invoking call #" + callId + " "
>               + proxyDescriptor.getProxyInfo().getString(method.getName())
>               + ". Not retrying because " + retryInfo.action.reason, e);
>         }
> {code}
> FailoverProxyProvider defines the {{getString}} method used above as:
> {code}
>    public String getString(String methodName) {
>       return proxy.getClass().getSimpleName() + "." + methodName
>           + " over " + proxyInfo;
>     }
>     @Override
>     public String toString() {
>       return proxy.getClass().getSimpleName() + " over " + proxyInfo;
>     }
> {code}
>  
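
The interleaving in the numbered steps above can be replayed deterministically in a small sketch. This is not the Hadoop code and not the attached patch: SharedDescriptor, message, and replay are hypothetical names, the NN names are placeholder strings, and capturing the target at the start of the call is only one possible way to keep the message tied to the NN the call actually used.

```java
public class StaleProxyInfoSketch {

  /** Hypothetical stand-in for the shared ProxyDescriptor; not the Hadoop class. */
  static final class SharedDescriptor {
    private final String[] nameNodes;
    private long failoverCount = 0;
    private String proxyInfo;

    SharedDescriptor(String... nameNodes) {
      this.nameNodes = nameNodes;
      this.proxyInfo = nameNodes[0];
    }

    synchronized String getProxyInfo() {
      return proxyInfo;
    }

    synchronized long getFailoverCount() {
      return failoverCount;
    }

    /** Mirrors the guard in the quoted failover(): only the first caller switches. */
    synchronized void failover(long expectedFailoverCount) {
      if (failoverCount == expectedFailoverCount) {
        failoverCount++;
      }
      proxyInfo = nameNodes[(int) (failoverCount % nameNodes.length)];
    }
  }

  /** Simplified message format, assumed for illustration only. */
  static String message(String target) {
    return "Exception while invoking delete over " + target;
  }

  /**
   * Replays steps 1-4 sequentially so the outcome does not depend on thread
   * scheduling. Returns {message built from shared state, message built from
   * the target captured when the call started}.
   */
  static String[] replay() {
    SharedDescriptor shared = new SharedDescriptor("SNN", "ANN");

    // Steps 1-2: X and Y both start against the SNN and both see a failure.
    String targetSeenByY = shared.getProxyInfo();
    long expectedByX = shared.getFailoverCount();

    // Step 3: X fails over first, which changes the shared proxyInfo.
    shared.failover(expectedByX);

    // Step 4: Y builds its message only now.
    String fromSharedState = message(shared.getProxyInfo()); // names the ANN
    String fromSnapshot = message(targetSeenByY);            // names the SNN

    return new String[] {fromSharedState, fromSnapshot};
  }

  public static void main(String[] args) {
    String[] messages = replay();
    System.out.println("shared state: " + messages[0]);
    System.out.println("snapshot:     " + messages[1]);
  }
}
```

In this sketch the message built from the shared state names the ANN even though thread Y's call failed against the SNN, while the message built from the captured target names the SNN.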



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

