hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5453) Could FSEditLog report problems more elegantly than with System.exit(-1)
Date Tue, 10 Mar 2009 17:32:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680545#action_12680545 ]

Konstantin Shvachko commented on HADOOP-5453:
---------------------------------------------

FSEditLog calls {{System.exit(-1)}} when there are no more edit streams to write the name-space
modifications to. No streams means the name-space state is no longer persistent and may
not be current when the name-node restarts.
So this is not about reporting problems but rather about the consistency of the system. Namely,
if the system cannot persist changes, it dies.
I agree, though, that dying might not be the most elegant solution. Now that we have the "saveNamespace"
command, the loss of all edit streams can be treated as just a switch to safe mode. Once the local
disks are restored, the administrator can save the namespace. Alternatively, a secondary node
can be started to perform an emergency checkpoint.
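
To make the safe-mode idea concrete, here is a minimal sketch of how it might look. It is not the actual FSEditLog code: EditStream, SafeModeEditLog, logEdit and restoreStreams are hypothetical names invented for illustration, and restoreStreams only stands in for "the administrator restores the disks and saves the namespace" without writing any image.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for one edit log output stream; not the real Hadoop class. */
interface EditStream {
    void write(String op) throws IOException;
}

/** Hypothetical edit log that switches to safe mode instead of calling System.exit(-1). */
class SafeModeEditLog {
    private final List<EditStream> streams = new ArrayList<>();
    private boolean safeMode;

    SafeModeEditLog(List<EditStream> initial) {
        streams.addAll(initial);
        safeMode = streams.isEmpty(); // nothing to persist to: start read-only
    }

    /** Returns true only if the operation reached at least one working stream. */
    synchronized boolean logEdit(String op) {
        if (safeMode) {
            return false; // name-space modifications are rejected in safe mode
        }
        for (Iterator<EditStream> it = streams.iterator(); it.hasNext();) {
            try {
                it.next().write(op);
            } catch (IOException e) {
                it.remove(); // drop the failed stream and keep going with the rest
            }
        }
        if (streams.isEmpty()) {
            safeMode = true; // no stream left: go read-only rather than exit
            return false;
        }
        return true;
    }

    /** Assumed recovery hook: called once storage is back and the namespace has been saved. */
    synchronized void restoreStreams(List<EditStream> restored) {
        if (restored.isEmpty()) {
            return; // still nothing to persist to; stay in safe mode
        }
        streams.clear();
        streams.addAll(restored);
        safeMode = false;
    }

    synchronized boolean isInSafeMode() {
        return safeMode;
    }
}
```

The point of the sketch is only that losing the last stream flips a flag and rejects further modifications, so the process stays up and can be recovered by an operator.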


> Could FSEditLog report problems more elegantly than with System.exit(-1)
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-5453
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5453
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.21.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> When FSEdit encounters problems, it prints something and then exits.
> It would be better for any in-JVM deployments of FSEdit for these to be raised in some
> other way (such as throwing an exception), rather than taking down the whole JVM. That could
> be in JUnit tests, or it could be inside other applications. Test runners and the like can
> intercept those System.exit() calls with their own SecurityManager, often turning the System.exit()
> operation into an exception there and then. If FSEdit did that itself, it may be easier to
> stay in control.
> The current approach has some benefits: it can exit regardless of which thread has encountered
> problems, but it is tricky to test.
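
One way the "raise it some other way" suggestion in the issue description could be sketched is with a pluggable handler, so that the exit-versus-exception choice is made by whoever embeds the class. All names here (FatalHandler, FatalHandlers, FatalEditLogException, EditLogWithHandler) are hypothetical and are not part of Hadoop.

```java
/** Hypothetical exception raised in place of System.exit so the caller stays in control. */
class FatalEditLogException extends RuntimeException {
    private final int status;

    FatalEditLogException(String message, int status) {
        super(message);
        this.status = status;
    }

    int getStatus() {
        return status;
    }
}

/** Hypothetical strategy that decides what a fatal edit log error does in this JVM. */
interface FatalHandler {
    void fatal(String message, int status);
}

class FatalHandlers {
    /** Stand-alone behaviour: print and take down the JVM, whichever thread calls it. */
    static final FatalHandler EXIT = (message, status) -> {
        System.err.println(message);
        System.exit(status);
    };

    /** Embedded or test behaviour: surface the failure to the calling thread only. */
    static final FatalHandler THROW = (message, status) -> {
        throw new FatalEditLogException(message, status);
    };
}

/** Hypothetical edit log that delegates its fatal path to the configured handler. */
class EditLogWithHandler {
    private final FatalHandler handler;

    EditLogWithHandler(FatalHandler handler) {
        this.handler = handler;
    }

    void onAllStreamsLost() {
        handler.fatal("No more edit streams available", -1);
    }
}
```

A JUnit test could construct the log with FatalHandlers.THROW and assert on the exception. The trade-off noted in the description still applies: an exception unwinds only the thread that hit the problem, whereas the EXIT handler stops the process regardless of thread.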

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

