hadoop-hdfs-issues mailing list archives

From "Fenghua Hu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java
Date Wed, 10 Aug 2016 01:30:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407012#comment-15407012 ]

Fenghua Hu edited comment on HDFS-10690 at 8/10/16 1:30 AM:
------------------------------------------------------------

Xiaoyu,

[~xyao] I tried to replace TreeMap with LinkedHashMap, but found that LinkedHashMap lacks a
"ceilingEntry" function or a similar alternative, which is key to implementing the LRU-based
replacement algorithm. LinkedHashMap also doesn't provide getYoungest, getEldest, or similar
functions. That is to say, if we want to use LinkedHashMap, we would actually need to rewrite
it. Any comments? Thanks.

Finally I found the correct email for you :-)
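
For context, the lookup discussed above can be sketched as follows. This is only an illustration, not code from ShortCircuitCache: the class name CeilingEntryDemo, the String values, and the timestamp keys are hypothetical. It shows what TreeMap.ceilingEntry returns, and that a LinkedHashMap offers no key-ordered lookup; its eldest entry is reached here through iteration order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; keys stand in for eviction timestamps.
final class CeilingEntryDemo {

    /** Returns the value with the smallest key >= timestamp, or null if there is none. */
    static String oldestAtOrAfter(TreeMap<Long, String> evictable, long timestamp) {
        Map.Entry<Long, String> e = evictable.ceilingEntry(timestamp);
        return e == null ? null : e.getValue();
    }

    /** Returns the first entry in iteration (insertion) order, or null if the map is empty. */
    static <K, V> Map.Entry<K, V> eldest(LinkedHashMap<K, V> map) {
        return map.isEmpty() ? null : map.entrySet().iterator().next();
    }

    public static void main(String[] args) {
        TreeMap<Long, String> evictable = new TreeMap<>();
        evictable.put(100L, "replicaA");
        evictable.put(200L, "replicaB");
        evictable.put(300L, "replicaC");
        // Key-ordered lookup: smallest key that is >= 150.
        System.out.println(oldestAtOrAfter(evictable, 150L));

        LinkedHashMap<Long, String> linked = new LinkedHashMap<>();
        linked.put(300L, "replicaC");
        linked.put(100L, "replicaA");
        // Insertion-ordered: the eldest entry is the first one inserted,
        // regardless of key order.
        System.out.println(eldest(linked).getValue());
    }
}
```

The TreeMap lookup is key-ordered and costs O(log n); the LinkedHashMap keeps only insertion (or access) order, so a "first entry at or after key X" query has no direct equivalent there.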





was (Author: fenghua_hu):
Xiaoyu,

[~xiaoyuyao] I tried to replace TreeMap with LinkedHashMap, but found that LinkedHashMap lacks
a "ceilingEntry" function or a similar alternative, which is key to implementing the LRU-based
replacement algorithm. LinkedHashMap also doesn't provide getYoungest, getEldest, or similar
functions. That is to say, if we want to use LinkedHashMap, we would actually need to rewrite
it. Any comments? Thanks.




> Optimize insertion/removal of replica in ShortCircuitCache.java
> ---------------------------------------------------------------
>
>                 Key: HDFS-10690
>                 URL: https://issues.apache.org/jira/browse/HDFS-10690
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Fenghua Hu
>            Assignee: Fenghua Hu
>         Attachments: HDFS-10690.001.patch, HDFS-10690.002.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Currently in ShortCircuitCache, two TreeMap objects are used to track the cached replicas:
>     private final TreeMap<Long, ShortCircuitReplica> evictable = new TreeMap<>();
>     private final TreeMap<Long, ShortCircuitReplica> evictableMmapped = new TreeMap<>();
> TreeMap uses a red-black tree for sorting. This isn't an issue with traditional HDDs,
> but with high-performance SSD/PCIe flash, the cost of inserting/removing an entry
> becomes considerable.
> To mitigate this, we designed a new list-based structure for replica tracking.
> The list is a doubly linked FIFO. The FIFO is time-ordered, so insertion is a very
> low-cost operation. On the other hand, a list is not lookup-friendly. To address this
> issue, we introduce two references into the ShortCircuitReplica object:
>     ShortCircuitReplica next = null;
>     ShortCircuitReplica prev = null;
> In this way, no lookup is needed when removing a replica from the list. We only need
> to modify its predecessor's and successor's references in the list.
> Our tests showed a performance improvement of 15-50% when using PCIe flash as the
> storage medium.
> The original patch is against 2.6.4; I am now porting it to Hadoop trunk, and the patch
> will be posted soon.
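
The list-based design described in the quoted issue can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the attached patch: the classes Replica and ReplicaFifo and their methods are hypothetical stand-ins, and only the field names next and prev come from the description. Because each element carries its own links, removal needs no lookup and runs in constant time.

```java
// Hypothetical stand-in for ShortCircuitReplica; only next/prev come from the description.
final class Replica {
    final String name;
    Replica next = null;
    Replica prev = null;

    Replica(String name) {
        this.name = name;
    }
}

// Hypothetical doubly linked FIFO: head is the eldest entry, tail the youngest.
final class ReplicaFifo {
    private Replica head;
    private Replica tail;
    private int size;

    /** Appends a replica at the tail (youngest position) in O(1). */
    void append(Replica r) {
        r.prev = tail;
        r.next = null;
        if (tail == null) {
            head = r;
        } else {
            tail.next = r;
        }
        tail = r;
        size++;
    }

    /** Unlinks a replica in O(1) by rewiring its neighbours; no search is needed. */
    void remove(Replica r) {
        if (r.prev == null && r.next == null && head != r) {
            return; // not currently in this list
        }
        if (r.prev == null) {
            head = r.next;
        } else {
            r.prev.next = r.next;
        }
        if (r.next == null) {
            tail = r.prev;
        } else {
            r.next.prev = r.prev;
        }
        r.prev = null;
        r.next = null;
        size--;
    }

    /** Returns the eldest replica without removing it, or null if the list is empty. */
    Replica eldest() {
        return head;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ReplicaFifo fifo = new ReplicaFifo();
        Replica a = new Replica("a");
        Replica b = new Replica("b");
        Replica c = new Replica("c");
        fifo.append(a);
        fifo.append(b);
        fifo.append(c);
        fifo.remove(b); // constant-time unlink from the middle
        System.out.println(fifo.eldest().name + " " + fifo.size());
    }
}
```

The trade-off against TreeMap is that the list keeps only insertion order: it gives cheap append and unlink, but no key-ordered queries such as ceilingEntry.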



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

