Date: Wed, 20 Sep 2017 07:09:01 +0000 (UTC)
From: "liumi (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-12487) FsDatasetSpi.isValidBlock() lacks null pointer check inside and neither do the callers

     [ https://issues.apache.org/jira/browse/HDFS-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liumi updated HDFS-12487:
-------------------------
    Status: Open  (was: Patch Available)

> FsDatasetSpi.isValidBlock() lacks null pointer check inside and neither do the callers
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-12487
>                 URL: https://issues.apache.org/jira/browse/HDFS-12487
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover, diskbalancer
>    Affects Versions: 3.0.0
>         Environment: CentOS 6.8 x64
>                      CPU: 4 cores
>                      Memory: 16GB
>                      Hadoop: Release 3.0.0-alpha4
>            Reporter: liumi
>            Assignee: liumi
>             Fix For: 3.1.0
>
>         Attachments: HDFS-12487.001.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> BlockIteratorImpl.nextBlock() looks for blocks in the source volume. If there are no more blocks, it returns null up to DiskBalancer.getBlockToCopy(). However, DiskBalancer.getBlockToCopy() then checks whether the returned block is valid.
> When I look into FsDatasetSpi.isValidBlock(), I find that it doesn't check for a null pointer. We first need to check whether the block is null, or an exception will occur.
> This bug is hard to find, because the DiskBalancer rarely copies all the data of one volume to others. Even when we do copy all the data of one volume to other volumes, the copy process has already finished by the time the bug occurs.
> However, when we try to copy all the data of two or more volumes to other volumes in more than one step, the thread will be shut down because of the bug above.
> The bug can be fixed in two ways:
> 1) Check for null before calling FsDatasetSpi.isValidBlock()
> 2) Check for null inside the implementation of FsDatasetSpi.isValidBlock()
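
The two fixes listed in the description can be sketched roughly as follows. This is a simplified, self-contained model and not the Hadoop source or the attached patch: Block, BlockSource, Dataset and isValidBlockUnchecked() are hypothetical stand-ins. The only behaviour taken from the report is that nextBlock() returns null once the source volume has no more blocks, and that the validity check dereferences its argument.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class NullSafeBlockCheck {

    /** Hypothetical stand-in for a block; only carries an id. */
    static final class Block {
        final long id;
        Block(long id) { this.id = id; }
    }

    /** Hypothetical block iterator: nextBlock() returns null when exhausted. */
    static final class BlockSource {
        private final Iterator<Block> it;
        BlockSource(List<Block> blocks) { this.it = blocks.iterator(); }
        Block nextBlock() { return it.hasNext() ? it.next() : null; }
    }

    /** Hypothetical dataset that knows which block ids are valid (finalized). */
    static final class Dataset {
        private final Set<Long> finalized;
        Dataset(List<Long> ids) { this.finalized = new HashSet<>(ids); }

        /** No null check: dereferencing a null block throws NullPointerException. */
        boolean isValidBlockUnchecked(Block b) {
            return finalized.contains(b.id);
        }

        /** Fix 2: the null check lives inside the validity check itself. */
        boolean isValidBlock(Block b) {
            return b != null && finalized.contains(b.id);
        }
    }

    /**
     * Fix 1: the caller checks for null before asking whether the block is valid.
     * Returns the next valid block, or null once the source has no more blocks.
     */
    static Block getBlockToCopy(BlockSource source, Dataset dataset) {
        Block b;
        while ((b = source.nextBlock()) != null) {
            if (dataset.isValidBlockUnchecked(b)) {
                return b;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Dataset dataset = new Dataset(Arrays.asList(2L));
        BlockSource source = new BlockSource(Arrays.asList(new Block(1), new Block(2)));

        Block first = getBlockToCopy(source, dataset);
        Block second = getBlockToCopy(source, dataset);
        System.out.println("first valid block id: " + first.id);
        System.out.println("source exhausted: " + (second == null));
        System.out.println("isValidBlock(null): " + dataset.isValidBlock(null));
    }
}
```

Either guard alone prevents the exception in this model. Putting the check inside the validity method (fix 2) also protects any caller that forgets the guard, which may matter given that the issue title notes the callers do not check either.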