hadoop-hdfs-user mailing list archives

From Marcin Cylke <mcl.had...@touk.pl>
Subject JournalNode desynchronized
Date Wed, 30 Jan 2013 10:37:52 GMT
Hi

One of the machines running a JournalNode failed. I've restored that
machine's setup and would like to reattach it to the existing
JournalNode quorum.

When I try to start the JournalNode, I get the following error:

ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /hadoop/dfs/journalnode/hadoop-cluster not formatted
2013-01-28 12:13:45,050 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.getEditLogManifest from 10.10.105.5:57604: error: org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /hadoop/dfs/journalnode/hadoop-cluster not formatted
org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /hadoop/dfs/journalnode/hadoop-cluster not formatted
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkFormatted(Journal.java:442)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.getEditLogManifest(Journal.java:625)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.getEditLogManifest(JournalNodeRpcServer.java:177)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.getEditLogManifest(QJournalProtocolServerSideTranslatorPB.java:196)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:14028)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)


How can I fix this kind of issue? The JournalNode directory looks as follows:

bash-4.1$ ls -R /hadoop/dfs/journalnode/
/hadoop/dfs/journalnode/:
hadoop-cluster

/hadoop/dfs/journalnode/hadoop-cluster:
current

/hadoop/dfs/journalnode/hadoop-cluster/current:
committed-txid  last-promised-epoch
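
A quick way to compare this directory against a healthy JournalNode is
sketched below. It assumes that a formatted journal directory contains a
current/VERSION file, which is absent from the listing above; that
assumption and the function name jn_looks_formatted are illustrative,
not something confirmed in this thread.

```shell
# Sketch: report whether a journal directory looks formatted.
# Assumption: a formatted journal directory has a current/VERSION file.
# jn_looks_formatted is a hypothetical helper name.
jn_looks_formatted() {
    # Succeeds (exit 0) only if <journal_dir>/current/VERSION is a regular file.
    [ -f "$1/current/VERSION" ]
}

# Example use, with the path from the error message:
#   jn_looks_formatted /hadoop/dfs/journalnode/hadoop-cluster \
#       && echo "looks formatted" || echo "VERSION file missing"
```

Running the same check on one of the working JournalNodes shows whether
the restored machine differs only in the missing files.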

So there are no edits files in there, and the most reasonable approach
would be to sync them over somehow. The best solution that comes to mind
is stopping the cluster, copying all the edits to the "new" server, and
then starting the JournalNodes again.
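
The copy step of that plan could be sketched roughly as follows, run on
the restored machine while the journals are stopped. The source host
name is a placeholder, copy_journal and DRY_RUN are hypothetical names,
and whether a plain file copy is sufficient for a JournalNode is exactly
the open question here, so the sketch only prints the command by default.

```shell
# Sketch: pull one journal directory from a healthy JournalNode host.
# copy_journal and DRY_RUN are hypothetical; the host is a placeholder.
copy_journal() {
    src_host="$1"
    journal_dir="$2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only print the rsync command that would be run.
        echo "rsync -a ${src_host}:${journal_dir}/ ${journal_dir}/"
    else
        # -a preserves permissions, ownership and timestamps where possible.
        rsync -a "${src_host}:${journal_dir}/" "${journal_dir}/"
    fi
}

# Example use (prints the command; set DRY_RUN=0 to actually copy):
#   copy_journal jn-healthy.example.com /hadoop/dfs/journalnode/hadoop-cluster
```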

Is there an easier, online way to do that?
I'd appreciate a solution that does not require formatting the NameNode :)

Regards
Marcin
