Date: Mon, 4 Jan 2016 23:23:40 +0000 (UTC)
From: "Chris Trezzo (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel

    [ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081991#comment-15081991 ]

Chris Trezzo commented on HDFS-8578:
------------------------------------

Hi [~szetszwo], thanks for the updated patch! One comment so far: I think we can get rid of the SubmissionService class and use an ExecutorService directly instead. You can use the shutdown and awaitTermination methods to wait for all of the doUpgrade tasks to complete. That way we do not need an extra class, and we do not need to keep track of the number of tasks submitted. You might need to pass one more list of futures into the methods so that, when the callables are submitted, we can keep track of the futures and fill the success list at the end. I am not totally convinced that we need this yet, though. (A rough sketch of this approach is at the end of this message.)

> On upgrade, Datanode should process all storage/data dirs in parallel
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8578
>                 URL: https://issues.apache.org/jira/browse/HDFS-8578
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Raju Bairishetti
>            Assignee: Vinayakumar B
>            Priority: Critical
>         Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, HDFS-8578-03.patch,
> HDFS-8578-04.patch, HDFS-8578-05.patch, HDFS-8578-06.patch, HDFS-8578-07.patch,
> HDFS-8578-08.patch, HDFS-8578-09.patch, HDFS-8578-10.patch, HDFS-8578-11.patch,
> HDFS-8578-12.patch, HDFS-8578-13.patch, HDFS-8578-14.patch, HDFS-8578-15.patch,
> HDFS-8578-16.patch, HDFS-8578-17.patch, HDFS-8578-branch-2.6.0.patch,
> HDFS-8578-branch-2.7-001.patch, HDFS-8578-branch-2.7-002.patch,
> HDFS-8578-branch-2.7-003.patch, h8578_20151210.patch, h8578_20151211.patch,
> h8578_20151211b.patch, h8578_20151212.patch, h8578_20151213.patch
>
>
> Right now, during upgrades the datanode processes all the storage dirs sequentially. If it takes ~20 minutes to process a single storage dir, then a datanode with ~10 disks will take around 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
> for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>   doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>   assert getCTime() == nsInfo.getCTime()
>       : "Data-node and name-node CTimes must be the same.";
> }
> {code}
> It would save a lot of time during major upgrades if the datanode processed all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
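
For illustration, here is a minimal, self-contained sketch of the approach suggested in the comment above: submit one Callable per storage directory to an ExecutorService, call shutdown() and awaitTermination() to wait for all of the tasks, and keep a list of Futures so the success list can be filled at the end. The StorageDir class, upgradeOneDir method, and ParallelUpgradeSketch class are hypothetical stand-ins for the real BlockPoolSliceStorage/doUpgrade code, not code from the patch.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a real storage directory.
class StorageDir {
  final String path;
  StorageDir(String path) { this.path = path; }
}

public class ParallelUpgradeSketch {

  // Stand-in for the per-directory doUpgrade work.
  static void upgradeOneDir(StorageDir dir) {
    System.out.println("upgrading " + dir.path);
  }

  // Upgrade all dirs in parallel and return the ones that succeeded.
  static List<StorageDir> upgradeAll(List<StorageDir> dirs)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(dirs.size());
    List<Future<StorageDir>> futures = new ArrayList<>();

    // Submit one callable per storage dir; each returns its dir on success.
    for (StorageDir dir : dirs) {
      Callable<StorageDir> task = () -> {
        upgradeOneDir(dir);
        return dir;
      };
      futures.add(pool.submit(task));
    }

    // Stop accepting new tasks and wait for the submitted ones to finish.
    pool.shutdown();
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);

    // Fill the success list from the futures; a failed dir is skipped.
    List<StorageDir> succeeded = new ArrayList<>();
    for (Future<StorageDir> f : futures) {
      try {
        succeeded.add(f.get());
      } catch (ExecutionException e) {
        // The upgrade of this dir threw; the real code would log and continue.
      }
    }
    return succeeded;
  }

  public static void main(String[] args) throws InterruptedException {
    List<StorageDir> dirs = new ArrayList<>();
    dirs.add(new StorageDir("/data1/dfs"));
    dirs.add(new StorageDir("/data2/dfs"));
    System.out.println("succeeded: " + upgradeAll(dirs).size());
  }
}
{code}

Sizing the pool to the number of directories is just one choice for the sketch; the real patch might cap the pool size or make it configurable.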