hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
Date Sat, 18 Feb 2012 01:48:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210742#comment-13210742
] 

Phabricator commented on HBASE-5347:
------------------------------------

mbautin has commented on the revision "[HBASE-5347] [jira] GC free memory management in Level-1
Block Cache".

  Looks pretty good (provided that it works as intended). Some comments inline.

  Could you please add some description of the pinning/unpinning and ref/deref concept (and
the difference between the two) somewhere in javadoc? Otherwise a lot of this code is very
difficult to read. The added complexity is justified by the performance benefit, but we will
have to maintain this code for a long time, so it should be as self-documenting as possible
(think about new developers joining the project).

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2180 Remove this line
  src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListElement.java:1 License
  src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListHead.java:1 License
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:1222 It would be useful to mention in
the javadoc that the KV is being dereferenced in this method because it does not refer to
the backing array of an HFile block anymore.
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2091 Could you put the code specific
to reference counting code into some other class? KeyValue is already huge.
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2141-2186 Make these counters final.
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2192 Remove this line and other commented-out
code in this function.
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2222 Is this increment thread-safe?
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2237 Naming conventions: listNum.

  A more serious comment: is it possible that one thread will call ref() much more often?
If so, wouldn't it be better to load-balance between in-use lists randomly? Note that with
the current solution we still have to lock at an individual list level, because multiple threads
can still map to the same in-use list. But I agree that hashing on thread id decreases the
amount of contention.
  src/main/java/org/apache/hadoop/hbase/client/HTable.java:623-658 • It might not be a good
idea to add test-only methods to HTable, because that class is used directly by a lot of clients.
In other words, clients pass around HTable instances and don't cast them to HTableInterface.

  I think we should move these methods to a test-only class and/or make them package-private.

  • Replace "anyrandomrow" with a constant.
  src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java:112 This method name change
is not terribly useful without an explanation of what "pinned" is. Would it be better to explain
this in javadoc and keep the old name? This also applies to all other changes of method names
including the word "Pinned" or "AndPin".
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java:39 Are these name changes
necessary? Maybe it is better to just add javadoc explaining the "pinning" part.
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java:322 Nice refactoring
  src/main/java/org/apache/hadoop/hbase/io/hfile/Cacheable.java:39-43 We definitely need javadoc
describing what pin/unpin is and how to use it.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:373 What exactly is pinned here,
and how would the caller unpin it, if that's the intent of the method name change? The return
value is an output stream.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:119 Make this final?
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:168 initialize2 is not a
very descriptive counter name
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1092 Is it possible to separate
"pool allocator" interface from implementation, rather than just calling a static method?
This way we could swap in alternative allocators in tests.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1499 Please move the pool
code to a separate class. Also, it would be nice to separate pool allocator interface from
implementation, as I mentioned.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java:196 Should these be
IOExceptions, since the method already throws them?
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:412 "Inx" is a very rare
abbreviation for "index". Better to spell out "Index".
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:694-695 Make final
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1628-1669 Make these final.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:408 Would the additional
copy cause a performance regression based on this method's use cases?
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:417 Would the additional
copy cause a performance regression based on this method's use cases?
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java:128 add a "}" at the end
of the "{@link KeyValue"
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java:127 in any HFileBlock
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java:390-391 Make final
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:2874-2876 Make final
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3071 Fix formatting ("i
=0").
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:2812-2878 This has
to be in a separate class.
  src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java:410-411 Make final
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java:1404 Do we unpin on close?
As far as I remember, this used to be a wrapper over a ByteArrayInputStream.
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java:399-400 Make final
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:547-548 Make final
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java:37 Why
would we need to log into this class's logger from a subclass? Please define another logger
in the subclass directly instead.
  src/main/java/org/apache/hadoop/hbase/util/Counters.java:1 License
  src/main/java/org/apache/hadoop/hbase/util/Counters.java:25 Naming convention: rename to
COUNTER_CLASSES
  src/main/java/org/apache/hadoop/hbase/util/Counters.java:23 This class probably needs a
better name, since it is very specific kind of counters accessed through reflection.
  src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListElement.java:1 License
  src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListHead.java:1 License
  src/main/java/org/apache/hadoop/hbase/util/DoublyLinkedListHead.java:4 Make private?
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java:83 Most changes in
this class and a lot of other changes in this diff are simply name changes. Not sure if they
are necessary. One advantage is definitely in reminding the caller to unpin blocks.

REVISION DETAIL
  https://reviews.facebook.net/D1635

                
> GC free memory management in Level-1 Block Cache
> ------------------------------------------------
>
>                 Key: HBASE-5347
>                 URL: https://issues.apache.org/jira/browse/HBASE-5347
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Prakash Khemani
>            Assignee: Prakash Khemani
>         Attachments: D1635.5.patch
>
>
> On eviction of a block from the block-cache, instead of waiting for the garbage collecter
to reuse its memory, reuse the block right away.
> This will require us to keep reference counts on the HFile blocks. Once we have the reference
counts in place we can do our own simple blocks-out-of-slab allocation for the block-cache.
> This will help us with
> * reducing gc pressure, especially in the old generation
> * making it possible to have non-java-heap memory backing the HFile blocks

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

Mime
View raw message