hadoop-hdfs-issues mailing list archives

From "Chen Liang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12216) Ozone: TestKeys and TestKeysRatis are failing consistently
Date Fri, 28 Jul 2017 18:28:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105447#comment-16105447 ]

Chen Liang commented on HDFS-12216:
-----------------------------------

Thanks [~msingh] for filing this! I was just about to do this myself :). I spent some time looking
at the {{TestKeys}} failure, so just to add some info here.

It looks like {{testPutAndGetKeyWithDnRestart}} is the one that fails every time. After
{{restartDatanode(cluster, 0, helper.client);}} is called, the call to {{xceiverClientManager.acquireClient(pipeline);}}
fails. The exception is thrown in {{XceiverClientManager#getClient()}}: it looks like
XceiverClient tries to connect to the same server port as before the restart,
but the other end is no longer listening (Connection refused). My current guess is that
either the datanode restart is not working properly somewhere, or XceiverClient
should be connecting to a different port after the restart. I haven't looked deeper though.
Hope this helps.
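
To illustrate the suspected failure mode outside of Ozone, here is a minimal, self-contained sketch using plain Java sockets. It is only an analogy under assumptions: {{StalePortDemo}}, {{canConnect}} and {{simulateRestart}} are hypothetical names, and the "server" is a bare {{ServerSocket}} on an ephemeral port, not a datanode. It shows that a client holding on to a port number from before a restart gets Connection refused once the listener on that port is gone, while the restarted listener is reachable on whatever port it binds next.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical stand-alone demo; not part of Ozone or its test code.
public class StalePortDemo {

  // Returns true if a TCP connect to the given loopback port succeeds,
  // false if the connection is refused.
  static boolean canConnect(int port) throws IOException {
    try (Socket socket = new Socket()) {
      socket.connect(
          new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 2000);
      return true;
    } catch (ConnectException e) {
      return false;
    }
  }

  // Simulates a server restart and returns three flags:
  // [0] old port reachable before the restart,
  // [1] old (cached) port reachable after the server was stopped,
  // [2] newly bound port reachable after the restart.
  static boolean[] simulateRestart() throws IOException {
    InetAddress loopback = InetAddress.getLoopbackAddress();

    // "Start" the server on an OS-assigned ephemeral port (port 0).
    ServerSocket first = new ServerSocket(0, 50, loopback);
    int cachedPort = first.getLocalPort(); // what a client would remember
    boolean before = canConnect(cachedPort);

    // "Stop" the server; nothing listens on cachedPort any more.
    first.close();
    boolean oldPortAfter = canConnect(cachedPort);

    // "Restart": bind again on port 0, which may yield a different port.
    try (ServerSocket second = new ServerSocket(0, 50, loopback)) {
      boolean newPortAfter = canConnect(second.getLocalPort());
      return new boolean[] {before, oldPortAfter, newPortAfter};
    }
  }

  public static void main(String[] args) throws IOException {
    boolean[] result = simulateRestart();
    System.out.println("old port reachable before restart: " + result[0]);
    System.out.println("old port reachable after stop:     " + result[1]);
    System.out.println("new port reachable after restart:  " + result[2]);
  }
}
```

Whether this is what actually happens in {{testPutAndGetKeyWithDnRestart}} is unverified; the sketch only shows why a cached port can stop working across a restart.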

> Ozone: TestKeys and TestKeysRatis are failing consistently
> ----------------------------------------------------------
>
>                 Key: HDFS-12216
>                 URL: https://issues.apache.org/jira/browse/HDFS-12216
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>             Fix For: HDFS-7240
>
>
> TestKeys and TestKeysRatis are failing consistently, as noted in the test logs for HDFS-12183.
> TestKeysRatis is failing because of the following error:
> {code}
> 2017-07-28 23:11:28,783 [StateMachineUpdater-127.0.0.1:55793] ERROR impl.StateMachineUpdater (ExitUtils.java:terminate(80)) - Terminating with exit status 2: StateMachineUpdater-127.0.0.1:55793: the StateMachineUpdater hits Throwable
> org.iq80.leveldb.DBException: Closed
> 	at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:123)
> 	at org.apache.hadoop.utils.LevelDBStore.put(LevelDBStore.java:98)
> 	at org.apache.hadoop.ozone.container.common.impl.KeyManagerImpl.putKey(KeyManagerImpl.java:90)
> 	at org.apache.hadoop.ozone.container.common.impl.Dispatcher.handlePutKey(Dispatcher.java:547)
> 	at org.apache.hadoop.ozone.container.common.impl.Dispatcher.keyProcessHandler(Dispatcher.java:206)
> 	at org.apache.hadoop.ozone.container.common.impl.Dispatcher.dispatch(Dispatcher.java:110)
> 	at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatch(ContainerStateMachine.java:94)
> 	at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:81)
> 	at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:913)
> 	at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:142)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> whereas TestKeys is failing because of:
> {code}
> 2017-07-28 23:14:20,889 [Thread-486] INFO  scm.XceiverClientManager (XceiverClientManager.java:getClient(158)) - exception java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: /127.0.0.1:55914
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


