hadoop-hdfs-dev mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-6912) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage
Date Fri, 22 Aug 2014 02:31:10 GMT
Gopal V created HDFS-6912:
-----------------------------

             Summary: HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage
                 Key: HDFS-6912
                 URL: https://issues.apache.org/jira/browse/HDFS-6912
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: caching
    Affects Versions: 2.5.0
         Environment: HDFS DataNode, with 8 GB tmpfs in /dev/shm
            Reporter: Gopal V


The short-circuit reader raises SIGBUS from sun.misc.Unsafe code and crashes the JVM when the
tmpfs mounted at /dev/shm is depleted.

{code}
---------------  T H R E A D  ---------------

Current thread (0x00007eff387df800):  JavaThread "xxx" daemon [_thread_in_vm, id=5880, stack(0x00007eff28b93000,0x00007eff28c94000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x00007eff3e51d000
{code}

The full backtrace of the JVM crash is:

{code}
Stack: [0x00007eff28b93000,0x00007eff28c94000],  sp=0x00007eff28c90a10,  free space=1014k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x88232c]  Unsafe_GetLongVolatile+0x6c
j  sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0
j  org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8
j  org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4
j  org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70
j  org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38
j  org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100
j  org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102
j  org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18
j  org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151
j  org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46
j  org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230
j  org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175
j  org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87
j  org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291
j  org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83
j  org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15
{code}

This is easy to reproduce: start the DataNode, fill up the tmpfs (dd if=/dev/zero
bs=1M of=/dev/shm/dummy.zero), and run a simple task.
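
To confirm that the tmpfs is actually full when reproducing this, a small check along the
following lines may help. This is only a sketch: the class name ShmSpaceCheck, the helper
hasUsableSpace, and the 1 MiB threshold are illustrative assumptions and not part of HDFS.
It relies only on the standard java.io.File.getUsableSpace() call.

```java
import java.io.File;

public class ShmSpaceCheck {
    // Hypothetical helper: true if 'dir' reports at least 'minBytes' of usable space.
    // File.getUsableSpace() returns 0 when the path does not name a partition.
    public static boolean hasUsableSpace(File dir, long minBytes) {
        return dir.getUsableSpace() >= minBytes;
    }

    public static void main(String[] args) {
        // Defaults to /dev/shm, the tmpfs mount named in the report.
        File shm = new File(args.length > 0 ? args[0] : "/dev/shm");
        System.out.println(shm + " usable bytes: " + shm.getUsableSpace());

        // 1 MiB is an arbitrary threshold chosen for illustration.
        if (!hasUsableSpace(shm, 1L << 20)) {
            System.out.println("WARNING: " + shm + " is nearly or completely full");
        }
    }
}
```

Running it before and after the dd step shows whether the mount was depleted at the time of
the crash.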



