Date: Mon, 9 Oct 2017 17:34:00 +0000 (UTC)
From: "Hudson (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-12606) When using native decoder, DFSStripedStream#close crashes JVM after being called multiple times.
archived-at: Mon, 09 Oct 2017 17:34:12 -0000

    [ https://issues.apache.org/jira/browse/HDFS-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197360#comment-16197360 ]

Hudson commented on HDFS-12606:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13052 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13052/])
HDFS-12606. When using native decoder, DFSStripedStream.close crashes (lei: rev 46644319e1b3295ddbc7597c060956bf46487d11)
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedInputStream.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedInputStream.java


> When using native decoder, DFSStripedStream#close crashes JVM after being called multiple times.
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12606
>                 URL: https://issues.apache.org/jira/browse/HDFS-12606
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>    Affects Versions: 3.0.0-beta1
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: HDFS-12606.00.patch
>
>
> When running NNBench on an RS(6,3) directory, the JVM crashes with a double free or corruption error:
> {code}
> 08:16:29 Running NNBENCH.
> 08:16:29 WARNING: Use "yarn jar" to launch YARN applications.
> 08:16:31 NameNode Benchmark 0.4
> 08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Test Inputs:
> 08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Test Operation: create_write
> 08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Start time: 2017-10-04 08:18:31,16
> :
> :
> 08:18:54 *** Error in `/usr/java/jdk1.8.0_144/bin/java': double free or corruption (out): 0x00007ffb55dbfab0 ***
> 08:18:54 ======= Backtrace: =========
> 08:18:54 /lib64/libc.so.6(+0x7c619)[0x7ffb5b85f619]
> 08:18:54 [0x7ffb45017774]
> 08:18:54 ======= Memory map: ========
> 08:18:54 00400000-00401000 r-xp 00000000 ca:01 276832134 /usr/java/jdk1.8.0_144/bin/java
> 08:18:54 00600000-00601000 rw-p 00000000 ca:01 276832134 /usr/java/jdk1.8.0_144/bin/java
> 08:18:54 0173e000-01f91000 rw-p 00000000 00:00 0 [heap]
> 08:18:54 603600000-614700000 rw-p 00000000 00:00 0
> 08:18:54 614700000-72bd00000 ---p 00000000 00:00 0
> 08:18:54 72bd00000-73a500000 rw-p 00000000 00:00 0
> 08:18:54 73a500000-7c0000000 ---p 00000000 00:00 0
> 08:18:54 7c0000000-7c0400000 rw-p 00000000 00:00 0
> 08:18:54 7c0400000-800000000 ---p 00000000 00:00 0
> 08:18:54 7ffb20174000-7ffb208ab000 rw-p 00000000 00:00 0
> 08:18:54 7ffb208ab000-7ffb20975000 ---p 00000000 00:00 0
> 08:18:54 7ffb20975000-7ffb20b75000 rw-p 00000000 00:00 0
> 08:18:54 7ffb20b75000-7ffb20d75000 rw-p 00000000 00:00 0
> 08:18:54 7ffb20d75000-7ffb20d8a000 r-xp 00000000 ca:01 209866 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
> 08:18:54 7ffb20d8a000-7ffb20f89000 ---p 00015000 ca:01 209866 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
> 08:18:54 7ffb20f89000-7ffb20f8a000 r--p 00014000 ca:01 209866 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
> 08:18:54 7ffb20f8a000-7ffb20f8b000 rw-p 00015000 ca:01 209866 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
> 08:18:54 7ffb20f8b000-7ffb20fbd000 r-xp 00000000 ca:01 553654092 /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
> 08:18:54 7ffb20fbd000-7ffb211bc000 ---p 00032000 ca:01 553654092 /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
> 08:18:54 7ffb211bc000-7ffb211c2000 rw-p 00031000 ca:01 553654092 /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
> :
> :
> 08:18:54 7ffb5c3fb000-7ffb5c3fc000 r--p 00000000 00:00 0
> 08:18:54 7ffb5c3fc000-7ffb5c3fd000 rw-p 00000000 00:00 0
> 08:18:54 7ffb5c3fd000-7ffb5c3fe000 r--p 00021000 ca:01 637266 /usr/lib64/ld-2.17.so
> 08:18:54 7ffb5c3fe000-7ffb5c3ff000 rw-p 00022000 ca:01 637266 /usr/lib64/ld-2.17.so
> 08:18:54 7ffb5c3ff000-7ffb5c400000 rw-p 00000000 00:00 0
> 08:18:54 7ffdf8767000-7ffdf8788000 rw-p 00000000 00:00 0 [stack]
> 08:18:54 7ffdf878b000-7ffdf878d000 r-xp 00000000 00:00 0 [vdso]
> 08:18:54 ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
> {code}
> It happens on both {{jdk1.8.0_144}} and {{jdk1.8.0_121}} in our environments.
> We strongly suspect the native code used in erasure coding, i.e., ISA-L is not thread safe [https://01.org/sites/default/files/documentation/isa-l_open_src_2.10.pdf]
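
The issue title ties the crash to close() being called more than once while a native decoder is in use. Assuming the double free comes from a second close() releasing the same native state again, a minimal sketch of an idempotent close guard could look like the following. This is not the HDFS-12606 patch: FakeNativeDecoder and GuardedStream are hypothetical names, and release() only counts calls instead of freeing real memory.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a decoder that owns native (off-heap) state. */
final class FakeNativeDecoder {
    private final AtomicInteger releaseCalls = new AtomicInteger();

    /** In real native code, a second call here would be a double free. */
    void release() {
        if (releaseCalls.incrementAndGet() > 1) {
            throw new IllegalStateException("native decoder released twice");
        }
    }

    int releaseCalls() {
        return releaseCalls.get();
    }
}

/** Hypothetical stream whose close() may be invoked more than once. */
final class GuardedStream implements AutoCloseable {
    private final FakeNativeDecoder decoder;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    GuardedStream(FakeNativeDecoder decoder) {
        this.decoder = decoder;
    }

    @Override
    public void close() {
        // compareAndSet lets exactly one caller through, even if several
        // threads call close() at the same time; later calls are no-ops.
        if (closed.compareAndSet(false, true)) {
            decoder.release();
        }
    }
}
```

Calling close() twice on a GuardedStream releases the decoder once; without the guard, the second call would reach release() again, which is the pattern that maps to a double free in native code.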