hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-18203) Intelligently manage a pool of open references to store files
Date Fri, 09 Jun 2017 17:40:18 GMT
Andrew Purtell created HBASE-18203:

             Summary: Intelligently manage a pool of open references to store files
                 Key: HBASE-18203
                 URL: https://issues.apache.org/jira/browse/HBASE-18203
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
    Affects Versions: 2.0.0
            Reporter: Andrew Purtell

When bringing a region online we open every store file and keep it open, to avoid further
round trips to the HDFS namenode during reads. Naively keeping open every store file we encounter
is a bad idea. There should be an upper bound on the number of open store files. Once we are above
that bound, we should close files, choosing candidates on an LRU basis, and reopen them as needed.
Otherwise we can overrun even high (~64k) open file handle limits on the server if the aggregate
number of store files is too large, and some users have hit this in production.
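
A minimal sketch of the bounded, LRU-managed pool described above, in plain Java. This is not
HBase code: the class LruOpenFilePool, its methods, and the opener callback are hypothetical names
chosen for illustration, and the handle type is just java.io.Closeable standing in for an open
store file reader. It assumes an evicted handle can be closed immediately; a real implementation
would likely need reference counting so a file is not closed while a read is still in flight.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical bounded pool of open file references with LRU eviction. */
final class LruOpenFilePool<H extends Closeable> {
    private final int maxOpen;
    private final Function<String, H> opener;
    private final LinkedHashMap<String, H> open;

    LruOpenFilePool(int maxOpen, Function<String, H> opener) {
        if (maxOpen < 1) {
            throw new IllegalArgumentException("maxOpen must be at least 1");
        }
        this.maxOpen = maxOpen;
        this.opener = opener;
        // accessOrder=true makes iteration order least-recently-used first,
        // so the "eldest" entry is always the LRU candidate.
        this.open = new LinkedHashMap<String, H>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, H> eldest) {
                if (size() <= LruOpenFilePool.this.maxOpen) {
                    return false;
                }
                try {
                    // Close the least recently used handle before dropping it.
                    eldest.getValue().close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return true;
            }
        };
    }

    /** Returns an open handle for the path, reopening the file if it was evicted. */
    synchronized H acquire(String path) {
        H handle = open.get(path); // get() also marks the entry as recently used
        if (handle == null) {
            handle = opener.apply(path);
            open.put(path, handle); // may evict and close the LRU entry
        }
        return handle;
    }

    /** Reports whether the path currently has an open handle (does not affect LRU order). */
    synchronized boolean isOpen(String path) {
        return open.containsKey(path);
    }

    /** Number of handles currently held open by the pool. */
    synchronized int openCount() {
        return open.size();
    }
}
```

With maxOpen set to 2, acquiring "a", "b", "a" and then "c" would close and evict "b", since "a"
was touched more recently. The cost of the bound is an extra open (and namenode round trip) each
time an evicted file is read again.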

Note that 'open files' here refers to open/active references to files at the HDFS level. How
this maps to active file descriptors at the OS level depends on concurrency of access (block
transfers, short circuit reads). The more files we have open at the HDFS level, the more
OS level file handles we can expect to consume.
