impala-reviews mailing list archives

From "John Russell (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4623: [DOCS] Document file handle caching
Date Thu, 05 Oct 2017 20:48:03 GMT
John Russell has posted comments on this change. ( )

Change subject: IMPALA-4623: [DOCS] Document file handle caching

Patch Set 1:

File docs/topics/impala_scalability.xml:
PS1, Line 967: although the encryption layer
             :         adds overhead that might lessen the benefit of the caching.
> I'm not familiar with this overhead. What is this referring to?
I had written in the notes from our conversation "HDFS encryption adds overhead". That was from
when we were thinking about all the other complicating factors, like Sentry GRANT/REVOKE.
PS1, Line 973: 20 thousand
> Just curious: How do you decide to use "20 thousand" vs "20,000"?
For big numbers, I try to stick with either spelled-out forms or obvious powers of 2. (Like
I would say 65536 with no comma.) There are so many other separator conventions internationally
( I don't want to be too
PS1, Line 991: evict any stale file handles from the cache
> The file handles won't actually be evicted directly. The new metadata will 
PS1, Line 995: To evaluate the effectiveness of file handle caching for a particular workload, issue the
             :         <codeph>PROFILE</codeph> statement in <cmdname>impala-shell</cmdname> or examine query
             :         profiles in the Impala web UI. Look for the ratio of <codeph>CachedFileHandlesHitCount</codeph>
             :         (ideally, should be high) to <codeph>CachedFileHandlesMissCount</codeph> (ideally, should be low).
             :         Before starting any evaluation, run some representative queries to <q>warm up</q> the cache,
             :         because the first time each data file is accessed is always recorded as a cache miss.
> I'm not sure this belongs here, but information about the cache across the 
Let's be inclusive for this first iteration and then fine-tune later if needed. We tend to
be skimpy with such information, which is a weakness IMO.
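
For anyone who wants to try the hit/miss evaluation described in the Line 995 text, here is a rough sketch of how the two counters could be totaled from a saved query profile. It is only an illustration: the function name file_handle_cache_stats is hypothetical, and the assumed counter line shape ("- CachedFileHandlesHitCount: 30 (30)", with the raw value in parentheses) should be checked against a real profile before relying on it.

```python
import re

# Assumption: each counter appears on its own line, e.g.
#   - CachedFileHandlesHitCount: 1.00K (1000)
# with the raw value in parentheses, or as a bare integer with no parentheses.
_COUNTER_RE = re.compile(
    r"(CachedFileHandles(?:Hit|Miss)Count):[ \t]*(?:[^\n(]*\((\d+)\)|(\d+))"
)


def file_handle_cache_stats(profile_text):
    """Sum hit and miss counters over the whole profile text.

    Returns (hits, misses, hit_ratio); hit_ratio is None when no
    counters were found or both totals are zero.
    """
    totals = {"CachedFileHandlesHitCount": 0, "CachedFileHandlesMissCount": 0}
    for match in _COUNTER_RE.finditer(profile_text):
        name, raw_in_parens, bare = match.groups()
        # Prefer the exact value in parentheses over the abbreviated form.
        totals[name] += int(raw_in_parens if raw_in_parens is not None else bare)
    hits = totals["CachedFileHandlesHitCount"]
    misses = totals["CachedFileHandlesMissCount"]
    total = hits + misses
    hit_ratio = hits / total if total else None
    return hits, misses, hit_ratio
```

Run it on a profile captured after the warm-up queries, since the first access to each data file is always counted as a miss. A hit ratio close to 1.0 would suggest the cache is effective for that workload.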


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I261c29eff80dc376528bba29ffb7d8e0f895e25f
Gerrit-Change-Number: 8200
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Joe McDonnell <>
Gerrit-Reviewer: John Russell <>
Gerrit-Reviewer: Mostafa Mokhtar <>
Gerrit-Comment-Date: Thu, 05 Oct 2017 20:48:03 +0000
Gerrit-HasComments: Yes
