hadoop-hdfs-issues mailing list archives

From "Ruslan Dautkhanov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
Date Thu, 14 Sep 2017 21:01:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166959#comment-16166959 ]

Ruslan Dautkhanov commented on HDFS-11851:
------------------------------------------

After applying this patch, the program started core dumping. Here is the gdb backtrace:

{code}
(gdb) bt
#0  0x00007fe78a34b1d7 in raise () from /lib64/libc.so.6
#1  0x00007fe78a34c8c8 in abort () from /lib64/libc.so.6
#2  0x00007fe78b212185 in os::abort(bool) () from /usr/java/default/jre/lib/amd64/server/libjvm.so
#3  0x00007fe78b3b4593 in VMError::report_and_die() () from /usr/java/default/jre/lib/amd64/server/libjvm.so
#4  0x00007fe78b21768f in JVM_handle_linux_signal () from /usr/java/default/jre/lib/amd64/server/libjvm.so
#5  0x00007fe78b20dbe3 in signalHandler(int, siginfo*, void*) () from /usr/java/default/jre/lib/amd64/server/libjvm.so
#6  <signal handler called>
#7  0x00007fe78a6db8b0 in setTLSExceptionStrings () from /opt/cloudera/parcels/CDH/lib64/libhdfs.so.0.0.0
#8  0x00007fe78a6da52c in printExceptionAndFreeV () from /opt/cloudera/parcels/CDH/lib64/libhdfs.so.0.0.0
#9  0x00007fe78a6da6cd in printExceptionAndFree () from /opt/cloudera/parcels/CDH/lib64/libhdfs.so.0.0.0
#10 0x00007fe78a6db60b in getJNIEnv () from /opt/cloudera/parcels/CDH/lib64/libhdfs.so.0.0.0
#11 0x00007fe78a6dd034 in hdfsBuilderConnect () from /opt/cloudera/parcels/CDH/lib64/libhdfs.so.0.0.0
#12 0x0000000000400950 in main ()
{code}

As you can see, the crash happens in setTLSExceptionStrings(), so it is definitely related to this patch.

I can upload the hs_err*.log file if that would be helpful.


> getGlobalJNIEnv() may deadlock if exception is thrown
> -----------------------------------------------------
>
>                 Key: HDFS-11851
>                 URL: https://issues.apache.org/jira/browse/HDFS-11851
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: libhdfs
>    Affects Versions: 3.0.0-alpha4
>            Reporter: Henry Robinson
>            Assignee: Sailesh Mukil
>            Priority: Blocker
>             Fix For: 3.0.0-alpha4
>
>         Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, HDFS-11851.002.patch, HDFS-11851.003.patch, HDFS-11851.004.patch, HDFS-11851.005.patch
>
>
> HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but {{printExceptionAndFree()}} will eventually try to acquire that lock in {{setTLSExceptionStrings()}}.
> The exception might get caught from {{loadFileSystems}}:
> {code}
> jthr = invokeMethod(env, NULL, STATIC, NULL,
>                          "org/apache/hadoop/fs/FileSystem",
>                          "loadFileSystems", "()V");
>         if (jthr) {
>             printExceptionAndFree(env, jthr, PRINT_EXC_ALL, "loadFileSystems");
>         }
>     }
> {code}
> Here are the relevant parts of the stack trace from where I call this API in Impala, which uses {{libhdfs}}:
> {code}
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
> #1  0x00007ffff4a8d657 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2  0x00007ffff4a8d480 in __GI___pthread_mutex_lock (mutex=0x47ce960 <jvmMutex>) at ../nptl/pthread_mutex_lock.c:79
> #3  0x0000000002f06056 in mutexLock (m=<optimized out>) at /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28
> #4  0x0000000002efe817 in setTLSExceptionStrings (rootCause=0x0, stackTrace=0x0) at /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581
> #5  0x0000000002f065d7 in printExceptionAndFreeV (env=0x513c1e8, exc=0x508a8c0, noPrintFlags=<optimized out>, fmt=0x34349cf "loadFileSystems", ap=0x7fffffffb660)
>     at /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183
> #6  0x0000000002f0683d in printExceptionAndFree (env=<optimized out>, exc=<optimized out>, noPrintFlags=<optimized out>, fmt=<optimized out>)
>     at /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213
> #7  0x0000000002eff60f in getGlobalJNIEnv () at /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463
> {code}
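
The self-deadlock described in the issue above (one function holds a mutex, then calls an error-reporting helper that tries to take the same mutex) can be illustrated with a minimal sketch. This is not the libhdfs code: `demo_mutex`, `demo_init_mutex()`, `record_exception()` and `init_with_lock_held()` are hypothetical stand-ins for `jvmMutex`, `setTLSExceptionStrings()` and `getGlobalJNIEnv()`. The sketch also assumes an error-checking POSIX mutex, so that the second lock attempt returns `EDEADLK` instead of hanging the thread.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for jvmMutex. */
static pthread_mutex_t demo_mutex;

/* Initialize demo_mutex as an error-checking mutex, so that a relock by
 * the owning thread is reported rather than blocking. Returns 0 on success. */
static int demo_init_mutex(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) {
        rc = pthread_mutex_init(&demo_mutex, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical stand-in for setTLSExceptionStrings(): takes the lock
 * itself. Returns 0 on success, or the pthread error code. */
static int record_exception(void)
{
    int rc = pthread_mutex_lock(&demo_mutex);
    if (rc != 0) {
        return rc; /* EDEADLK if this thread already owns the mutex */
    }
    /* ... exception strings would be stored here ... */
    pthread_mutex_unlock(&demo_mutex);
    return 0;
}

/* Hypothetical stand-in for getGlobalJNIEnv(): reports an error while
 * still holding the lock, which is the problematic pattern. */
static int init_with_lock_held(void)
{
    int rc = pthread_mutex_lock(&demo_mutex);
    assert(rc == 0);
    rc = record_exception(); /* relock attempt by the owning thread */
    pthread_mutex_unlock(&demo_mutex);
    return rc;
}
```

After `demo_init_mutex()` succeeds, `init_with_lock_held()` returns `EDEADLK`, while calling `record_exception()` on its own returns 0. With a normal (non-recursive, non-checking) mutex the inner lock would instead block forever, which matches the `__lll_lock_wait` frame at the top of the trace above.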



