hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5810) Unify mmap cache and short-circuit file descriptor cache
Date Fri, 07 Feb 2014 23:15:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895233#comment-13895233
] 

Colin Patrick McCabe commented on HDFS-5810:
--------------------------------------------

bq. If the mmap and normal caches are configured to a small and a big value respectively...Can
we unlink the two caches completely?

I guess it seemed elegant to me at the time: {{dfs.client.mmap.cache.timeout.ms}} controls
how long something is mmapped, but even after the mmap is gone, we honor {{dfs.client.read.shortcircuit.streams.cache.expiry.ms}}.
  Nothing is wasted... we reuse stuff.  Maybe we just make this clearer in the docs?  What
do you think.

bq. For the tracing, my goal is to avoid runtime overhead if tracing is off. I think us savvy
HDFS developers could figure out how to read a compressed stack trace, e.g. just showing the
names of methods. Another alternative is putting the trace tag in a thread-local, maybe with
a unique ID.

I don't think the strings will create that much runtime overhead.  They're constant, which
means they will probably be stored in some permanent memory section that never gets GC'ed.
 I agree that it would look a little "cleaner" without them, but the knowledge of who called
what really saved my bacon during some debugging sessions, so I'm pretty resistant to removing
them completely.  Thread-locals seem messy for this.  I'll think about what the options are
here... maybe a stack trace wouldn't be that bad?  Using stack traces would remove that "caller"
parameter, but they might make things a bit harder to read when using TRACE.  I didn't change
it in the latest patch...

bq. On the unref/purge, I missed that the refcount starts off at 2, which is unusual. I'd
like to see the SCCache ref done explicitly in SCCache, or at least documented in SCCache
too.

I added a comment in {{ShortCircuitCache#create}} explaining why we don't need to ref a new
replica.

bq. On timeouts, we might want to be even more aggressive at increasing them. A recent FB
HBase/HDFS paper published at FAST 14 shows that caching on the order of hours to provide
continued benefit. The inflection point is something like 1 or 2 hours, so that'd be my target.
IIUC, allocMmap has pressure-based eviction, so we can tolerate it.

OK, let me increase the mmap cache timeout to an hour.

bq. Need timeouts for new tests (I just checked via grep in all changed/new test files)

Added.

bq. ...

bq. rename "SCReplica#checkStale" to "isStale" per java/hadoop convention

OK

bq. What's the rationale for the extremely polymorphic Object pointer in SCReplica? All the
casting is messy

I guess the rationale is that we only use one of {condition variable, last failure time, clientmmap
object} at a time.  I can take a look at whether this could be improved.  I kind of like the
way the code flows in getOrCreateClientMmap now, though.  Maybe it could be improved, though...

bq. Do we really need to release and retake the lock for making an mmap? I didn't think this
was expensive enough to warrant this, and I noticed you didn't do this when unmapping.

mmap is an expensive operation, and this is global lock we're dealing with here.  The existing
code does this I/O after releasing the lock, and I wouldn't want to move backwards here.

bq. allocMmap, we decrement outstandingMmapCount, but isn't that already handled by the munmap
in the line before? If this is legit, a verifying test case would be good.

Yes, this was a bug.  I didn't originally have a test for verifying the 'mmap scavenging'
code path where we have to unmap and existing mmap.  Let me add one now.

> Unify mmap cache and short-circuit file descriptor cache
> --------------------------------------------------------
>
>                 Key: HDFS-5810
>                 URL: https://issues.apache.org/jira/browse/HDFS-5810
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5810.001.patch, HDFS-5810.004.patch, HDFS-5810.006.patch, HDFS-5810.008.patch,
HDFS-5810.015.patch, HDFS-5810.016.patch
>
>
> We should unify the client mmap cache and the client file descriptor cache.  Since mmaps
are granted corresponding to file descriptors in the cache (currently FileInputStreamCache),
they have to be tracked together to do "smarter" things like HDFS-5182.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message