hadoop-common-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14512) WASB atomic rename should not throw exception if the file is neither in src nor in dst when doing the rename
Date Fri, 09 Jun 2017 18:50:18 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044849#comment-16044849 ]

Hadoop QA commented on HADOOP-14512:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 18s{color} | {color:green} hadoop-azure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 51s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14512 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12872306/HADOOP-14512.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux a486d27c94b9 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 99634d1 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
|  Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12501/testReport/ |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12501/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> WASB atomic rename should not throw exception if the file is neither in src nor in dst when doing the rename
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14512
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14512
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 2.8.0
>            Reporter: Duo Xu
>            Assignee: Duo Xu
>         Attachments: HADOOP-14512.001.patch, HADOOP-14512.002.patch
>
>
> During an atomic rename operation, WASB creates a rename-pending JSON file that records which files need to be renamed and their destination. WASB then reads this file and renames the files one by one.
> A recent customer incident in HBase exposed a potential bug in the atomic rename implementation.
> For example, below is a rename-pending JSON file:
> {code}
> {
>   FormatVersion: "1.0",
>   OperationUTCTime: "2017-04-29 06:08:57.465",
>   OldFolderName: "hbase\/data\/default\/abc",
>   NewFolderName: "hbase\/.tmp\/data\/default\/abc",
>   FileList: [
>     ".tabledesc",
>     ".tabledesc\/.tableinfo.0000000001",
>     ".tmp",
>     "08e698e0b7d4132c0456b16dcf3772af",
>     "08e698e0b7d4132c0456b16dcf3772af\/.regioninfo",
>     "08e698e0b7d4132c0456b16dcf3772af\/0\/617294e0737e4d37920e1609cf539a83",
>     "08e698e0b7d4132c0456b16dcf3772af\/recovered.edits\/185.seqid",
>     "08e698e0b7d4132c0456b16dcf3772af\/.regioninfo",
>     "08e698e0b7d4132c0456b16dcf3772af\/0",
>     "08e698e0b7d4132c0456b16dcf3772af\/0\/617294e0737e4d37920e1609cf539a83",
>     "08e698e0b7d4132c0456b16dcf3772af\/recovered.edits",
>     "08e698e0b7d4132c0456b16dcf3772af\/recovered.edits\/185.seqid"
>   ]
> }
> {code}  
> When the HBase regionserver process (which uses the WASB driver underneath) was renaming "08e698e0b7d4132c0456b16dcf3772af\/.regioninfo", the regionserver process crashed or the VM was rebooted for system maintenance. When the regionserver process started running again, it found the rename-pending JSON file and tried to redo the rename operation.
> However, when it read the first file in the file list, ".tabledesc", it could not find the file in either the src folder or the destination folder. The file was not in the src folder because it had already been renamed/moved to the destination folder. It was not in the destination folder because HBase, on startup, cleans up all the files under /hbase/.tmp.
> The current implementation throws an exception:
> {code}
> else {
>   throw new IOException(
>       "Attempting to complete rename of file " + srcKey + "/" + fileName
>       + " during folder rename redo, and file was not found in source "
>       + "or destination.");
> }
> {code}
> This causes HBase HMaster initialization to fail, and restarting HMaster does not help because the same exception is thrown again.
> My proposal is that if, during the redo, WASB finds a file that is in neither src nor dst, it should skip that file and process the next one rather than throw an error and leave the user to fix it manually. The reasons are:
> 1. Since the rename-pending JSON file lists file A, if file A is not in src, it must have been renamed.
> 2. If file A is in neither src nor dst, the upper-layer service must have removed it. Note that the folder is locked during the atomic rename, so the only situation in which the file can be deleted is when the VM reboots or the service process crashes. When the service process restarts, some operations may run before the atomic rename redo, as in the HBase example above.
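
The redo behavior proposed above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `RenameRedoSketch`, `redo`, and the `skipMissing` flag are hypothetical names, the blob store is modeled as a plain `Set<String>` of keys, and none of this is the actual WASB driver code or the attached patch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory model of a folder-rename redo; not the WASB driver. */
class RenameRedoSketch {

  /**
   * Replays a pending folder rename over a set of blob keys.
   * Returns the file names that were skipped because they were found in
   * neither the source nor the destination folder.
   */
  static List<String> redo(Set<String> blobs, String srcKey, String dstKey,
      List<String> fileList, boolean skipMissing) throws IOException {
    List<String> skipped = new ArrayList<>();
    for (String fileName : fileList) {
      String src = srcKey + "/" + fileName;
      String dst = dstKey + "/" + fileName;
      if (blobs.remove(src)) {
        // Still in the source folder: finish the rename.
        blobs.add(dst);
      } else if (blobs.contains(dst)) {
        // Already renamed before the crash: nothing left to do.
        continue;
      } else if (skipMissing) {
        // Proposed behavior: assume the file was renamed and then removed
        // by the upper layer, so record it and move on to the next file.
        skipped.add(fileName);
      } else {
        // Behavior described in the issue: fail the whole redo.
        throw new IOException("Attempting to complete rename of file " + src
            + " during folder rename redo, and file was not found in source "
            + "or destination.");
      }
    }
    return skipped;
  }

  public static void main(String[] args) throws IOException {
    String src = "hbase/data/default/abc";
    String dst = "hbase/.tmp/data/default/abc";
    List<String> fileList = Arrays.asList(
        ".tabledesc", "08e698e0b7d4132c0456b16dcf3772af/.regioninfo");

    // ".tabledesc" was renamed before the crash and later cleaned up from
    // the destination; ".regioninfo" is still waiting in the source folder.
    Set<String> blobs = new HashSet<>();
    blobs.add(src + "/08e698e0b7d4132c0456b16dcf3772af/.regioninfo");

    List<String> skipped = redo(blobs, src, dst, fileList, true);
    System.out.println("skipped: " + skipped);
    System.out.println("blobs after redo: " + blobs);
  }
}
```

With `skipMissing` set to `false`, the same input makes `redo` throw on ".tabledesc", which mirrors the repeated HMaster initialization failure described in the issue.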



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

