hadoop-common-dev mailing list archives

From "Konstantin Boudnik (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5974) Add orthogonal fault injection mechanism/framework
Date Fri, 05 Jun 2009 19:50:07 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716737#action_12716737
] 

Konstantin Boudnik commented on HADOOP-5974:
--------------------------------------------

Here's an overall proposal for the framework layout:
- AspectJ 1.6 should be used as the base framework
- an additional set of classes needs to be developed to control and configure the injection
of faults at runtime. In the first version of the framework, I'd recommend going with
randomly injected faults (random in terms of when they happen, not their location in the
application code)
- the randomization level might be configured through system properties from the command line
or set in a separate configuration file
- to completely turn off fault injection for a class, its probability level has to be set
to 0% ('zero'); setting it to 100% will achieve the opposite effect
- build.xml has to be extended with a new target ('injectfaults') to weave the needed aspects
in place after the normal compilation of Java classes is done; JUnit targets will have to
be modified to pass the new probability configuration parameters into the spawned JVM
- aspects' source code will be placed under test/src/aop; the package structure will mimic the
original one of Hadoop. Say, an aspect for FSDataset has to belong to org.apache.hadoop.hdfs.server.datanode

Some examples of the new build/test execution interface:

- To weave (build in) the aspects in place:

% ant injectfaults

- To execute HDFS tests (turn everything off except BlockReceiver faults, which are set at the 10% level):

% ant run-test-hdfs -DallFaultProbability=0 -DBlockReceiverFaultProbability=10
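
To make the probability settings above concrete, here is a minimal sketch of how the control classes might resolve a fault level once the properties reach the test JVM. Nothing here is part of the proposal itself: the class name FaultProbability and its methods are hypothetical, and the sketch assumes that a per-class property such as BlockReceiverFaultProbability takes precedence over allFaultProbability, and that an unset level defaults to 0%.

```java
import java.util.Properties;
import java.util.Random;

/** Hypothetical helper: decides whether a fault should fire for a given class. */
public class FaultProbability {
    // Property names follow the command-line example; the lookup scheme is an assumption.
    static final String ALL_KEY = "allFaultProbability";
    static final String SUFFIX = "FaultProbability";

    private final Properties props;
    private final Random random;

    public FaultProbability(Properties props, Random random) {
        this.props = props;
        this.random = random;
    }

    /** Percentage (0-100) for a class: its own setting, else the global one, else 0. */
    public int level(String simpleClassName) {
        String value = props.getProperty(simpleClassName + SUFFIX,
                props.getProperty(ALL_KEY, "0"));
        int percent = Integer.parseInt(value.trim());
        // Clamp so that out-of-range input cannot produce a meaningless level.
        return Math.max(0, Math.min(100, percent));
    }

    /** Never true at 0%, always true at 100%, random in between. */
    public boolean shouldInject(String simpleClassName) {
        // nextInt(100) yields 0..99, so the comparison matches the percentage exactly.
        return random.nextInt(100) < level(simpleClassName);
    }

    public static void main(String[] args) {
        // Reads the JVM's system properties, e.g. those set with -D on the java command line.
        FaultProbability fp = new FaultProbability(System.getProperties(), new Random());
        System.out.println("BlockReceiver level: " + fp.level("BlockReceiver") + "%");
        System.out.println("FSDataset level: " + fp.level("FSDataset") + "%");
    }
}
```

An aspect woven into BlockReceiver could then guard its injected failure with a call such as shouldInject("BlockReceiver"). Taking the Random as a constructor argument is a deliberate choice in this sketch: a seeded instance would make an injected failure reproducible across test runs.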


> Add orthogonal fault injection mechanism/framework
> --------------------------------------------------
>
>                 Key: HADOOP-5974
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5974
>             Project: Hadoop Core
>          Issue Type: Test
>          Components: test
>            Reporter: Konstantin Boudnik
>            Assignee: Konstantin Boudnik
>
> It'd be great to have a fault injection mechanism for Hadoop.
> Having such a solution in place will make it possible to increase test coverage of error
> handling and recovery mechanisms, reduce reproduction time, and increase the reproduction
> rate of problems.
> Ideally, the system has to be orthogonal to the current code and test base. E.g. faults
> have to be injected at build time and would have to be configurable, e.g. all faults could
> be turned off, or only some of them would be allowed to happen. Also, fault injection has
> to be separated from the production build.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

