hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5957) Provide support for different mmap cache retention policies in ShortCircuitCache.
Date Wed, 19 Feb 2014 01:14:19 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904932#comment-13904932 ]

Colin Patrick McCabe commented on HDFS-5957:

bq. This usage pattern in combination with zero-copy read causes retention of a large number
of memory-mapped regions in the ShortCircuitCache. Eventually, YARN's resource check kills
the container process for exceeding the enforced physical memory bounds.

mmap regions don't consume physical memory.  They do consume virtual memory.

I don't think limiting virtual memory usage is a particularly helpful policy, and YARN should
stop doing that if that is in fact what it is doing.

bq. As a workaround, I advised Gopal to downtune dfs.client.mmap.cache.timeout.ms to make
the munmap happen more quickly. A better solution would be to provide support in the HDFS
client for a caching policy that fits this usage pattern.

In our tests, mmap provided no performance advantage unless it was reused.  If Gopal needs
to purge mmaps immediately after using them, the correct thing is simply not to use zero-copy

> Provide support for different mmap cache retention policies in ShortCircuitCache.
> ---------------------------------------------------------------------------------
>                 Key: HDFS-5957
>                 URL: https://issues.apache.org/jira/browse/HDFS-5957
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Chris Nauroth
> Currently, the {{ShortCircuitCache}} retains {{mmap}} regions for reuse by multiple reads
> of the same block or by multiple threads.  The eventual {{munmap}} executes on a background
> thread after an expiration period.  Some client usage patterns would prefer strict bounds
> on this cache and deterministic cleanup by calling {{munmap}}.  This issue proposes additional
> support for different caching policies that better fit these usage patterns.
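
To make the two retention behaviours in the description concrete, here is a minimal sketch in plain Java. It is not the real ShortCircuitCache: the class MmapRetentionSketch, its Region type, and every method name are hypothetical, and "unmapping" is modelled as setting a flag rather than calling munmap. A retention period of 0 stands in for the "deterministic cleanup" policy; any positive value stands in for the existing "expire later on a background thread" behaviour, with the clock passed in explicitly so the example needs no real timer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the HDFS client's ShortCircuitCache. */
class MmapRetentionSketch {
    /** Stand-in for a mapped block region; a real client would munmap it on release. */
    static final class Region {
        final String blockId;
        int refCount = 0;
        long idleSinceMs = 0;
        boolean unmapped = false;
        Region(String blockId) { this.blockId = blockId; }
    }

    // 0 means "unmap as soon as the last reader releases the region".
    private final long retentionMs;
    private final Map<String, Region> regions = new LinkedHashMap<>();

    MmapRetentionSketch(long retentionMs) { this.retentionMs = retentionMs; }

    /** Returns the cached region for a block, creating it on a miss, and takes a reference. */
    Region acquire(String blockId) {
        Region r = regions.computeIfAbsent(blockId, Region::new);
        r.refCount++;
        return r;
    }

    /** Drops one reference; under the strict policy the region is unmapped right here. */
    void release(Region r, long nowMs) {
        r.refCount--;
        if (r.refCount == 0) {
            r.idleSinceMs = nowMs;
            if (retentionMs == 0) {
                unmap(r);
                regions.remove(r.blockId);
            }
        }
    }

    /** What a background cleaner would do: unmap regions idle for at least retentionMs. */
    List<String> evictExpired(long nowMs) {
        List<String> evicted = new ArrayList<>();
        Iterator<Region> it = regions.values().iterator();
        while (it.hasNext()) {
            Region r = it.next();
            if (r.refCount == 0 && nowMs - r.idleSinceMs >= retentionMs) {
                unmap(r);
                evicted.add(r.blockId);
                it.remove();
            }
        }
        return evicted;
    }

    int size() { return regions.size(); }

    private static void unmap(Region r) { r.unmapped = true; }

    public static void main(String[] args) {
        // Timed retention: the region stays cached until the cleaner runs after the timeout.
        MmapRetentionSketch timed = new MmapRetentionSketch(1000);
        Region a = timed.acquire("blk_1");
        timed.release(a, 0);
        System.out.println("cached after release: " + timed.size());
        System.out.println("evicted at t=1000: " + timed.evictExpired(1000));

        // Strict policy: the region is gone as soon as the last reader releases it.
        MmapRetentionSketch strict = new MmapRetentionSketch(0);
        Region b = strict.acquire("blk_2");
        strict.release(b, 0);
        System.out.println("unmapped immediately: " + b.unmapped);
    }
}
```

With a positive retention period, a second read of the same block before the timeout reuses the cached region, which is the case the existing behaviour is designed for. With a period of 0, nothing outlives its last reader, at the cost of remapping on every read.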
