hadoop-hdfs-dev mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-12029) Data node process crashes after kernel upgrade
Date Fri, 23 Jun 2017 21:53:00 GMT
Anu Engineer created HDFS-12029:

             Summary:  Data node process crashes after kernel upgrade
                 Key: HDFS-12029
                 URL: https://issues.apache.org/jira/browse/HDFS-12029
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
            Reporter: Anu Engineer
            Priority: Critical

 We have seen that when the Linux kernel is upgraded to address a specific CVE
 ( https://access.redhat.com/security/vulnerabilities/stackguard ), it might cause the datanode
 process to crash.

We have observed this issue while upgrading the kernel from version 3.10.0-514.6.2 to
version 3.10.0-514.21.2.

The original kernel fix is here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1be7107fbe18eed3e319a6c3e83c78254b693acb

The datanode fails with the following stack trace:


# A fatal error has been detected by the Java Runtime Environment: 
# SIGBUS (0x7) at pc=0x00007f458d078b7c, pid=13214, tid=139936990349120 
# JRE version: (8.0_40-b25) (build ) 
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.40-b25 mixed mode linux-amd64 compressed
# Problematic frame: 
# j java.lang.Object.<clinit>()V+0 
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again 
# An error report file with more information is saved as: 
# /tmp/hs_err_pid13214.log 
# If you would like to submit a bug report, please visit: 
# http://bugreport.java.com/bugreport/crash.jsp 

The root cause is a failure in jsvc. This can be mitigated by passing a stack size
larger than 1MB as an argument. Something like:

exec "$JSVC" \
org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
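
The invocation above does not show where the stack size argument would go. The following is a minimal sketch, assuming the stack size is passed as a JVM -Xss option on the jsvc command line. The variable DATANODE_STACK_SIZE, the function name start_secure_datanode, and the value 2m are hypothetical and chosen only for illustration. JSVC defaults to echo here, so the sketch prints the command instead of launching a datanode, and exec is left out so the function returns.

```shell
#!/bin/sh
# Dry-run default: print the command rather than starting a real datanode.
JSVC="${JSVC:-echo}"

# Hypothetical variable and value: any stack size above 1MB is assumed to work.
DATANODE_STACK_SIZE="${DATANODE_STACK_SIZE:-2m}"

# Hypothetical wrapper around the jsvc invocation from the report,
# with an -Xss option added (assumed placement of the stack size argument).
start_secure_datanode() {
  "$JSVC" \
    "-Xss${DATANODE_STACK_SIZE}" \
    org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
}
```

Calling start_secure_datanode with the defaults prints the assembled command line. Pointing JSVC at a real jsvc binary would run it, and the other jsvc options a real start script passes are omitted here.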

This JIRA tracks potential fixes for this problem. We do not yet have data on how a larger
stack size affects other applications that run on the datanode, as it might increase the
datanode's memory usage.

