Date: Sat, 30 Jul 2016 01:08:20 +0000 (UTC)
From: "Fenghua Hu (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java

[ https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400336#comment-15400336 ]

Fenghua Hu commented on HDFS-10690:
-----------------------------------

Xiaoyu, thanks for the reply.
Regarding bullet 2, it looks like I didn't explain the design very well. Sorry for the confusion; I'd like to clarify here. This design doesn't need a lookup in the linked list, because each ShortCircuitReplica object holds two references. To remove a ShortCircuitReplica object from the list, we access its references directly and unlink it. That's why it can improve performance.

> Optimize insertion/removal of replica in ShortCircuitCache.java
> ---------------------------------------------------------------
>
>                 Key: HDFS-10690
>                 URL: https://issues.apache.org/jira/browse/HDFS-10690
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Fenghua Hu
>            Assignee: Fenghua Hu
>         Attachments: HDFS-10690.001.patch, HDFS-10690.002.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Currently in ShortCircuitCache, two TreeMap objects are used to track the cached replicas.
>   private final TreeMap evictable = new TreeMap<>();
>   private final TreeMap evictableMmapped = new TreeMap<>();
> TreeMap uses a red-black tree for sorting. This isn't an issue with traditional HDDs, but with high-performance SSD/PCIe flash, the cost of inserting/removing an entry becomes considerable.
> To mitigate it, we designed a new list-based structure for replica tracking.
> The list is a doubly linked FIFO. Because a FIFO is ordered by insertion time, insertion is a very low-cost operation. On the other hand, a list is not lookup-friendly. To address this issue, we introduce two references into the ShortCircuitReplica object.
>   ShortCircuitReplica next = null;
>   ShortCircuitReplica prev = null;
> In this way, no lookup is needed when removing a replica from the list. We only need to modify its predecessor's and successor's references in the list.
> Our tests showed a 15-50% performance improvement when using PCIe flash as the storage medium.
> The original patch is against 2.6.4; I am now porting it to Hadoop trunk, and a patch will be posted soon.
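
For readers following the thread, here is a minimal sketch of the idea described above: a doubly linked FIFO whose elements carry their own prev/next references, so removal needs no search. This is an illustration only, not the code from the attached patches. The class names Replica, EvictionList, and IntrusiveFifoDemo are hypothetical; Replica stands in for ShortCircuitReplica and models only the two link fields.

```java
// Hypothetical stand-in for ShortCircuitReplica; only the two list links
// described in the issue are modeled here.
final class Replica {
    final String name;
    Replica prev = null;
    Replica next = null;

    Replica(String name) {
        this.name = name;
    }
}

// Illustrative intrusive doubly linked FIFO (an assumption about the shape
// of the design, not the patch itself).
final class EvictionList {
    private Replica head; // oldest entry, first candidate for eviction
    private Replica tail; // newest entry
    private int size;

    // O(1): append at the tail, so arrival order is eviction order.
    void addLast(Replica r) {
        r.prev = tail;
        r.next = null;
        if (tail == null) {
            head = r;
        } else {
            tail.next = r;
        }
        tail = r;
        size++;
    }

    // O(1): unlink r through its own references, with no search.
    // Assumes r is currently a member of this list.
    void remove(Replica r) {
        if (r.prev == null) {
            head = r.next;
        } else {
            r.prev.next = r.next;
        }
        if (r.next == null) {
            tail = r.prev;
        } else {
            r.next.prev = r.prev;
        }
        r.prev = null;
        r.next = null;
        size--;
    }

    // O(1): remove and return the oldest entry, or null if the list is empty.
    Replica pollFirst() {
        Replica r = head;
        if (r != null) {
            remove(r);
        }
        return r;
    }

    int size() {
        return size;
    }
}

final class IntrusiveFifoDemo {
    public static void main(String[] args) {
        EvictionList list = new EvictionList();
        Replica a = new Replica("a");
        Replica b = new Replica("b");
        Replica c = new Replica("c");
        list.addLast(a);
        list.addLast(b);
        list.addLast(c);
        list.remove(b); // unlink from the middle without traversing the list
        System.out.println(list.pollFirst().name); // prints "a"
        System.out.println(list.pollFirst().name); // prints "c"
        System.out.println(list.size());           // prints "0"
    }
}
```

By comparison, inserting into or removing from a TreeMap costs O(log n) per operation plus possible rebalancing, which is the overhead the issue is trying to avoid. The trade-off in this sketch is that a replica can belong to only one such list at a time, since the links live in the object itself.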