ambari-dev mailing list archives

From Anandha L Ranganathan <analog.s...@gmail.com>
Subject Re: Ambari upgrade 2.4 - not yet finalized
Date Fri, 20 May 2016 05:37:48 GMT
 I figured out the solution.

I manually restarted the active NN and the standby NN with this command:

/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode -rollingUpgrade started


Once both NameNodes started successfully in the Ambari UI, I ran Finalize and it
completed successfully.
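For anyone hitting the same state, the sequence above can be sketched as a dry run. Since the real commands must run on the NameNode hosts as the hdfs user against a live cluster, the sketch only echoes them; `hdfs://dfs-nameservices` is the nameservice taken from the error output earlier in this thread, and the assumption that Ambari's Finalize step ultimately issues `-rollingUpgrade finalize` is mine, not confirmed by the thread.

```shell
# Dry-run sketch: each command is echoed, not executed, because they must
# run on the NameNode hosts as the hdfs user against the live cluster.
NAMESERVICE="hdfs://dfs-nameservices"  # nameservice from the error output in this thread
SBIN="/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin"

# 1. On the active NN, then the standby NN: start with the rolling-upgrade flag.
echo "$SBIN/hadoop-daemon.sh start namenode -rollingUpgrade started"

# 2. Confirm the rolling upgrade is visible before finalizing.
echo "hdfs dfsadmin -fs $NAMESERVICE -rollingUpgrade query"

# 3. Once both NameNodes are up, finalize (assumed equivalent of Ambari's Finalize step).
echo "hdfs dfsadmin -fs $NAMESERVICE -rollingUpgrade finalize"
```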




On Thu, May 19, 2016 at 9:15 PM, Anandha L Ranganathan <
analog.sony@gmail.com> wrote:

> Hi Jonathan,
>
> I tried to run it from the Ambari UI, but it failed with the exception below.
> Is there a way I can restart it manually from the command line?
>
> Our cluster is enabled with NameNode HA.
>
>
> 2016-05-20 04:01:36,027 - Execute['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin
-fs hdfs://dfs-nameservices -rollingUpgrade query'] {'logoutput': True, 'user': 'hdfs'}
> QUERY rolling upgrade ...
> 16/05/20 04:01:38 INFO retry.RetryInvocationHandler: Exception while invoking rollingUpgrade
of class ClientNamenodeProtocolTranslatorPB over usw2dxdpma02.local/172.17.212.157:8020 after
1 fail over attempts. Trying to fail over after sleeping for 568ms.
> java.net.ConnectException: Call From usw2dxdpma02.glassdoor.local/172.17.212.157 to usw2dxdpma02.local:8020
failed on connection exception: java.net.ConnectException: Connection refused; For more details
see:  http://wiki.apache.org/hadoop/ConnectionRefused
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> 	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1431)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1358)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> 	at com.sun.proxy.$Proxy9.rollingUpgrade(Unknown Source)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.rollingUpgrade(ClientNamenodeProtocolTranslatorPB.java:728)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:497)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> 	at com.sun.proxy.$Proxy10.rollingUpgrade(Unknown Source)
> 	at org.apache.hadoop.hdfs.DFSClient.rollingUpgrade(DFSClient.java:2956)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.rollingUpgrade(DistributedFileSystem.java:1287)
> 	at org.apache.hadoop.hdfs.tools.DFSAdmin$RollingUpgradeCommand.run(DFSAdmin.java:373)
> 	at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1815)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> 	at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:1973)
> Caused by: java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> 	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
> 	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:612)
> 	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:710)
> 	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373)
> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1397)
> 	... 18 more
>
>
>
>
> On Thu, May 19, 2016 at 5:08 AM, Jonathan Hurley <jhurley@hortonworks.com>
> wrote:
>
>> You're hitting an instance of
>> https://issues.apache.org/jira/browse/AMBARI-15482
>>
>> I don't know of a way around this aside from:
>> - Finalizing the upgrade
>> - Starting NameNode manually from the command prompt
>>
>> It's probably best to just finalize the upgrade and start NameNode from
>> the web client after finalization.
>>
>> On May 18, 2016, at 10:02 PM, Anandha L Ranganathan <
>> analog.sony@gmail.com> wrote:
>>
>> I am running a rolling upgrade in the dev cluster. It is 100% complete
>> but not yet finalized.
>> I was testing in the dev cluster and validating that everything works
>> fine; I was able to run a Hive query using the HS2 server.
>>
>> I don't remember the reason, but I restarted all NameNode services
>> through the Ambari UI and started getting the error below. It says to
>> restart with the "-upgrade" option; I thought the rolling upgrade would
>> take care of it. Please help me: how do I handle this? What steps should
>> I take?
>>
>>
>> 2016-05-19 01:42:38,561 INFO  util.GSet
>> (LightWeightGSet.java:computeCapacity(356)) - 0.029999999329447746% max
>> memory 1011.3 MB = 310.7 KB
>> 2016-05-19 01:42:38,561 INFO  util.GSet
>> (LightWeightGSet.java:computeCapacity(361)) - capacity      = 2^15 = 32768
>> entries
>> 2016-05-19 01:42:38,579 INFO  common.Storage (Storage.java:tryLock(715))
>> - Lock on /mnt/data/hadoop/hdfs/namenode/in_use.lock acquired by nodename
>> 13159@usw2dxdpma01.glassdoor.local
>> 2016-05-19 01:42:38,651 WARN  namenode.FSNamesystem
>> (FSNamesystem.java:loadFromDisk(690)) - Encountered exception loading
>> fsimage
>> java.io.IOException:
>> File system image contains an old layout version -60.
>> An upgrade to version -63 is required.
>> Please restart NameNode with the "-rollingUpgrade started" option if a
>> rolling upgrade is already started; or restart NameNode with the "-upgrade"
>> option to start a new upgrade.
>>     at
>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:245)
>>     at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
>>     at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
>>     at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
>>     at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:722)
>>     at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
>>     at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
>>     at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
>>     at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
>> 2016-05-19 01:42:38,661 INFO  mortbay.log (Slf4jLog.java:info(67)) -
>> Stopped
>> HttpServer2$SelectChannelConnectorWithSafeStartup@usw2dxdpma01.glassdoor.local
>> :50070
>> 2016-05-19 01:42:38,663 INFO  impl.MetricsSystemImpl
>> (MetricsSystemImpl.java:stop(211)) - Stopping NameNode metrics system...
>> 2016-05-19 01:42:38,664 INFO  impl.MetricsSinkAdapter
>> (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread
>> interrupted.
>> 2016-05-19 01:42:38,664 INFO  impl.MetricsSystemImpl
>> (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
>> 2016-05-19 01:42:38,664 INFO  impl.MetricsSystemImpl
>> (MetricsSystemImpl.java:shutdown(607)) - NameNode metrics system shutdown
>> complete.
>> 2016-05-19 01:42:38,665 ERROR namenode.NameNode
>> (NameNode.java:main(1712)) - Failed to start namenode.
>>
>>
>>
>
