hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tianyin Xu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-7726) Parse and check the configuration settings of edit log to prevent runtime errors
Date Tue, 03 Feb 2015 06:31:34 GMT

     [ https://issues.apache.org/jira/browse/HDFS-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tianyin Xu updated HDFS-7726:
-----------------------------
    Attachment: check_config_EditLogTailer.patch

The refined patch

> Parse and check the configuration settings of edit log to prevent runtime errors
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-7726
>                 URL: https://issues.apache.org/jira/browse/HDFS-7726
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Tianyin Xu
>            Priority: Minor
>         Attachments: check_config_EditLogTailer.patch, check_config_val_EditLogTailer.patch.1
>
>
> ============================
> Problem
> -------------------------------------------------
> Similar as the following two issues addressed in 2.7.0,
> https://issues.apache.org/jira/browse/YARN-2165
> https://issues.apache.org/jira/browse/YARN-2166
> The edit log related configuration settings should be checked in the constructor rather
than being applied directly at runtime. This would cause runtime failures if the values are
wrong.
> Take "dfs.ha.tail-edits.period" as an example, currently in EditLogTailer.java, its value
is not checked but directly used in doWork(), as the following code snippets. Any negative
values would cause IllegalArgumentException (which is not caught) and impair the component.

> {code:title=EditLogTailer.java|borderStyle=solid}
> private void doWork() {
> {
>     .....
>     Thread.sleep(sleepTimeMs);
>     ....
> }
> {code}
> Another example is "dfs.ha.log-roll.rpc.timeout". Right now, we use getInt() to parse
the value at runtime in the getActiveNodeProxy() function which is called by doWork(), shown
as below. Any erroneous settings (e.g., ill-formatted integer) would cause exceptions.
> {code:title=EditLogTailer.java|borderStyle=solid}
> private NamenodeProtocol getActiveNodeProxy() throws IOException {
> {
>     .....
>     int rpcTimeout = conf.getInt(
>           DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_KEY,
>           DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_DEFAULT);
>     ....
> }
> {code}
> ============================
> Solution (the attached patch)
> -------------------------------------------------
> Basically, the idea of the attached patch is to move the parsing and checking logics
into the constructor to expose the error at initialization, so that the errors won't be latent
at the runtime (same as YARN-2165 and YARN-2166)
> I'm not aware of the implementation of 2.7.0. It seems there's checking utilities such
as the validatePositiveNonZero function in YARN-2165. If so, we can use that one to make the
checking more systematic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message