hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4210) NameNode Format should not fail for DNS resolution on minority of JournalNode
Date Mon, 19 Nov 2012 18:33:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500475#comment-13500475 ]

Colin Patrick McCabe commented on HDFS-4210:
--------------------------------------------

It should definitely throw a more helpful exception than {{NullPointerException}}.  However,
I think the general idea that the quorum format should fail if some {{JournalNodes}} could
not be formatted makes some sense.  If some {{JournalNodes}} could not be formatted, the system
is running at reduced redundancy.  This could cause major problems down the road if we silently
return success here.

Can you write a script to wait until all nodes are accessible (keep pinging every second until
you get through, or something like that)?
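
A minimal sketch of such a wait script, assuming that "accessible" means each JournalNode host resolves in DNS and accepts a TCP connection on its port. The helper names ({{parse_qjournal_uri}}, {{is_reachable}}, {{wait_for_all}}) are made up for illustration and are not part of Hadoop; the parser also assumes every entry in the URI carries an explicit port.

```python
import socket
import time


def parse_qjournal_uri(uri):
    """Split qjournal://h1:p1;h2:p2/journalId into [(host, port), ...]."""
    prefix = "qjournal://"
    if not uri.startswith(prefix):
        raise ValueError("not a qjournal URI: %r" % uri)
    # Everything between the scheme and the first "/" is the node list.
    authority = uri[len(prefix):].split("/", 1)[0]
    nodes = []
    for entry in authority.split(";"):
        host, _, port = entry.rpartition(":")
        nodes.append((host, int(port)))
    return nodes


def is_reachable(host, port, timeout=1.0):
    """True if the host resolves and accepts a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures (socket.gaierror), refusals, and timeouts.
        return False


def wait_for_all(nodes, interval=1.0, max_attempts=None,
                 probe=is_reachable, sleep=time.sleep):
    """Poll until every node answers; give up after max_attempts if set."""
    attempt = 0
    while True:
        attempt += 1
        pending = [node for node in nodes if not probe(*node)]
        if not pending:
            return True
        if max_attempts is not None and attempt >= max_attempts:
            return False
        sleep(interval)
```

A provisioning script could call {{wait_for_all(parse_qjournal_uri(uri))}} and run the format only once it returns {{True}}. The {{probe}} and {{sleep}} parameters exist only so the loop can be exercised without a real cluster.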

Alternatively, perhaps we could add a switch like {{-partial}} that would return success from
a partial format as long as a quorum of JournalNodes got formatted.  But I don't think it should be
the default...
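
To make the proposed behavior concrete, here is a rough sketch of the decision rule, assuming a quorum means a strict majority of the configured JournalNodes. Both the function names and the {{allow_partial}} flag are hypothetical; no such switch exists in Hadoop as of this comment.

```python
def majority(total_nodes):
    """Smallest node count that forms a strict majority."""
    return total_nodes // 2 + 1


def format_outcome(total_nodes, formatted_nodes, allow_partial=False):
    """Classify a quorum format attempt.

    allow_partial models the hypothetical opt-in switch: without it,
    anything short of a full format is reported as a failure.
    """
    if formatted_nodes == total_nodes:
        return "success"
    if allow_partial and formatted_nodes >= majority(total_nodes):
        return "partial-success"
    return "failure"
```

With the three JournalNodes from this report and only two formatted, {{format_outcome(3, 2)}} stays a failure by default, and only {{format_outcome(3, 2, allow_partial=True)}} reports a partial success.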
                
> NameNode Format should not fail for DNS resolution on minority of JournalNode
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-4210
>                 URL: https://issues.apache.org/jira/browse/HDFS-4210
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, journal-node, name-node
>    Affects Versions: 2.0.0-alpha
>         Environment: CDH4.1.2
>            Reporter: Damien Hardy
>            Priority: Trivial
>
> Setting:
>   qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
>   cdh4master01 and cdh4master02 JournalNodes are up and running,
>   cdh4worker03 is not yet provisioned (no DNS entry)
> With this setup,
> `hadoop namenode -format` fails with:
>   12/11/19 14:42:42 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalArgumentException: Unable to construct journal, qjournal://cdh4master01:8485;cdh4master02:8485;cdh4worker03:8485/hdfscluster
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1235)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:226)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:193)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:745)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1099)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1233)
> 	... 5 more
> Caused by: java.lang.NullPointerException
> 	at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
> 	at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
> 	at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:161)
> 	at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:141)
> 	at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:353)
> 	at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:135)
> 	at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:104)
> 	at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:93)
> 	... 10 more
> I suggest that if a quorum is up, the format should not fail.

