impala-reviews mailing list archives

From "Joe McDonnell (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5386: Fix ReopenCachedHdfsFileHandle failure case
Date Wed, 31 May 2017 00:49:13 GMT
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5386: Fix ReopenCachedHdfsFileHandle failure case

Patch Set 2:

> Did you look into the conditions which triggered the failure to
> begin with? Is there any way to trigger a similar error locally with
> a debug action or stress flag? It would be good to add a test for
> this case.

I know the sequence of events:
1. File is deleted using hdfs command line
2. Run a query over the table that has the deleted file
3. ScanRange::Open succeeds (!!)
4. ScanRange::Read calls hdfsRead, which fails. It destroys the file handle and tries to
reopen it, but the reopen also fails. The ScanRange's file handle reference is now invalid,
but it is still non-null.
5. Query is aborted, leading to ScanRange::Cancel
6. ScanRange::Cancel calls ScanRange::Close, which sees that the file handle reference is
non-null and tries to release it. The release fails, because the file handle reference is
invalid.
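
As a rough illustration of steps 4 through 6, here is a minimal sketch. Every name in it (FakeHandleCache, FakeScanRange, kNoHandle, RunScenario) is made up for the example and is not Impala's actual code. It only models the shape of the problem: a handle that has been destroyed but is still non-null gets released on Close. Clearing the reference when the reopen fails is shown as one possible defensive measure, not necessarily what the patch does.

```cpp
#include <cassert>
#include <set>

// Hypothetical cache that tracks which handle ids are currently valid.
class FakeHandleCache {
 public:
  int Open() {
    int id = next_id_++;
    live_.insert(id);
    return id;
  }
  void Destroy(int id) { live_.erase(id); }
  // Returns false when asked to release a handle the cache no longer knows.
  bool Release(int id) { return live_.erase(id) > 0; }

 private:
  int next_id_ = 1;  // ids start at 1 so that 0 can mean "no handle"
  std::set<int> live_;
};

constexpr int kNoHandle = 0;  // plays the role of a null handle reference

// Hypothetical stand-in for a scan range that holds one cached handle.
class FakeScanRange {
 public:
  explicit FakeScanRange(FakeHandleCache* cache) : cache_(cache) {}

  void Open() {
    assert(handle_ == kNoHandle);
    handle_ = cache_->Open();
  }

  // Simulates step 4: the read fails, the handle is destroyed, and the
  // reopen fails too. Always returns false because the read never succeeds.
  // clear_on_failure selects whether the stale reference is reset.
  bool ReadWithFailedReopen(bool clear_on_failure) {
    cache_->Destroy(handle_);
    if (clear_on_failure) handle_ = kNoHandle;
    return false;
  }

  // Simulates step 6: release the handle if the reference is non-null.
  // Returns true if Close completed without a failed release.
  bool Close() {
    if (handle_ == kNoHandle) return true;
    bool released = cache_->Release(handle_);
    handle_ = kNoHandle;
    return released;
  }

 private:
  FakeHandleCache* cache_;
  int handle_ = kNoHandle;
};

// Runs open, failed read, then close; reports whether Close was clean.
bool RunScenario(bool clear_on_failure) {
  FakeHandleCache cache;
  FakeScanRange range(&cache);
  range.Open();
  range.ReadWithFailedReopen(clear_on_failure);
  return range.Close();
}
```

In this model, RunScenario(false) leaves the stale reference in place, so Close attempts a release that fails. RunScenario(true) resets the reference first, so Close has nothing to release.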

The problem with reproducing this on normal HDFS is that when a file is deleted, the subsequent
Open (during the query in #2) fails, so the query never even gets a file handle. If the Open
succeeds, the handle refers to a local file, so POSIX guarantees that the file stays around.
I have tried modifying both the code and the test to produce this sequence, but this
particular combination is difficult to hit. I'm looking at what it would take.
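
The POSIX behavior mentioned above can be seen with a small standalone sketch. It is not part of Impala or its tests; the function name ReadableAfterUnlink and the /tmp path template are assumptions made for the example. It creates a temporary file, unlinks it while the descriptor is still open, and checks that the data is still readable through that descriptor.

```cpp
#include <cstdlib>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Returns true if data written before unlink() can still be read afterwards
// through the already-open file descriptor.
bool ReadableAfterUnlink() {
  char path[] = "/tmp/unlink_demo_XXXXXX";  // assumes /tmp exists and is writable
  int fd = mkstemp(path);
  if (fd < 0) return false;

  const std::string payload = "still here";
  bool ok = write(fd, payload.data(), payload.size()) ==
            static_cast<ssize_t>(payload.size());

  ok = ok && unlink(path) == 0;            // remove the directory entry
  ok = ok && access(path, F_OK) != 0;      // the name is gone
  ok = ok && lseek(fd, 0, SEEK_SET) == 0;  // rewind the open descriptor

  char buf[32] = {0};
  ok = ok && read(fd, buf, sizeof(buf) - 1) ==
                 static_cast<ssize_t>(payload.size());
  ok = ok && payload == buf;               // contents survived the unlink

  close(fd);
  return ok;
}
```

This is why deleting the file after a successful Open on a local file does not make later reads fail, which makes the sequence above hard to trigger in a local test.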


Gerrit-MessageType: comment
Gerrit-Change-Id: Iee982fa5e964f6c8969b2eb7e5f3eca89e793b3a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <>
Gerrit-Reviewer: Joe McDonnell <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-HasComments: No
