hbase-issues mailing list archives

From "Liang Xie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10052) use HDFS advisory caching to avoid caching HFiles that are not going to be read again (because they are being compacted)
Date Sat, 10 May 2014 22:15:09 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993417#comment-13993417 ]

Liang Xie commented on HBASE-10052:
-----------------------------------

bq.  One thing to be wary of: during the compaction, readers are still accessing the old files,
so if you're compacting large files, this could really hurt read latency during compactions
(assuming that people are relying on linux LRU in addition to hbase-internal LRU for performance).

Since by default we have 3 replicas at the HDFS layer, the current InputStream drops caching only
on the one replica that was picked for reading. That seems less than ideal, considering the possible
redundant caching on multiple nodes if a failover or something similar happened. How about providing
an async function in the InputStream layer, say dropFileCaches, that gets all LocatedBlocks, and
exposing a similar interface in the DataNode (DN) layer as well, so that all caching for those blocks
is cleared on all DataNodes?
We could request this async dropFileCaches just before closing the original store files being compacted.
 Just a raw idea, crazy? :)
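
A minimal sketch of the fan-out idea above, to make the shape of the proposal concrete. Everything in it is hypothetical: Block is a simplified stand-in for an HDFS LocatedBlock, DataNodeCacheClient stands in for a DN-side interface that does not exist today, and dropFileCaches is only the name suggested in this comment, not an existing HDFS or HBase method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

final class DropFileCachesSketch {

    /** Hypothetical stand-in for a LocatedBlock: a block id plus the hosts holding its replicas. */
    static final class Block {
        final long id;
        final List<String> replicaHosts;

        Block(long id, List<String> replicaHosts) {
            this.id = id;
            this.replicaHosts = replicaHosts;
        }
    }

    /** Hypothetical DN-side hook; assumed for illustration only. */
    interface DataNodeCacheClient {
        void dropCache(String host, long blockId);
    }

    /**
     * Asks every replica of every block to drop its cached pages without blocking the caller.
     * The request is advisory, so a failure on one replica is ignored rather than propagated.
     */
    static CompletableFuture<Void> dropFileCaches(List<Block> blocks, DataNodeCacheClient client) {
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (Block block : blocks) {
            for (String host : block.replicaHosts) {
                pending.add(CompletableFuture
                        .runAsync(() -> client.dropCache(host, block.id))
                        .exceptionally(err -> null)); // ignore per-replica failures
            }
        }
        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) {
        // Record which (host, block) pairs were asked to drop their cache.
        Set<String> dropped = ConcurrentHashMap.newKeySet();
        List<Block> blocks = List.of(
                new Block(1L, List.of("dn1", "dn2", "dn3")),
                new Block(2L, List.of("dn2", "dn3", "dn4")));
        // A caller that needs to wait (for example in a test) can join the returned future.
        dropFileCaches(blocks, (host, id) -> dropped.add(host + ":" + id)).join();
        System.out.println(dropped.size() + " replica caches asked to drop");
    }
}
```

In this sketch the compaction path would call dropFileCaches just before closing the old store files and would not wait on the returned future, so a slow or dead DataNode could not delay the compaction itself.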


> use HDFS advisory caching to avoid caching HFiles that are not going to be read again (because they are being compacted)
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10052
>                 URL: https://issues.apache.org/jira/browse/HBASE-10052
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Colin Patrick McCabe
>            Priority: Minor
>             Fix For: 0.99.0, 0.98.3
>
>
> HBase can benefit from doing dropbehind during compaction since compacted files are not
> read again.  HDFS advisory caching, introduced in HDFS-4817, can help here.  The right API
> here is {{FSDataInputStream#setDropBehind}}.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
