hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9969) Improve KeyValueHeap using loser tree
Date Thu, 14 Nov 2013 16:55:22 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822603#comment-13822603

Ted Yu commented on HBASE-9969:

Putting patch on review board would make reviewing easier.

For benchmark B, can you try ColumnPaginationFilter with smaller offset ?
Giving percentage improvement in the tables is desirable.
+ */
+public class LoserTree<T> {
Please add annotation for audience.
+   * {@code tree[i]} where i > 0 stores the index to greater value between {@code value[tree[2*i]]}
+   * and {@code value[tree[2*i + 1]]}.
'value[' should be 'values[', right ?
+   * @return the index to the minimal elements.
elements -> element
+   * Pushes next value from the stream that we previously taken the minimal element from.
taken -> took
+   * Passes {@code NULL} to value if the stream has been reached EOF.
Remove 'been'
+    if (index != topIndex()) {
+      throw new IllegalArgumentException("Only the top index can be updated");
Consider including index and topIndex in the exception message.
+    if (value == null && values.get(index) != null) {
+      numOpenStreams--;
+      if (numOpenStreams < 0) {
In what condition would numOpenStreams become negative ?
+        throw new AssertionError("numOpenStreams is negative: " + numOpenStreams);
Throw IllegalStateException.
+  public List<Integer> getOpenStreamsForTesting() {
The above is used in KeyValueHeap, consider renaming.
+   * from bottom up to the root. Once it "loses", it stops there and the winner continues
to fight to up.
+   */
+  private void update(int i) {
'fight to up' -> 'fight to top'
Please add @param for i.

KeyValueHeapBenchmark.java and TestLoserTree.java need license.

> Improve KeyValueHeap using loser tree
> -------------------------------------
>                 Key: HBASE-9969
>                 URL: https://issues.apache.org/jira/browse/HBASE-9969
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Chao Shi
>         Attachments: hbase-9969.patch, hbase-9969.patch, kvheap-benchmark.png, kvheap-benchmark.txt
> LoserTree is the better data structure than binary heap. It saves half of the comparisons
on each next(), though the time complexity is on O(logN).
> Currently A scan or get will go through two KeyValueHeaps, one is merging KVs read from
multiple HFiles in a single store, the other is merging results from multiple stores. This
patch should improve the both cases whenever CPU is the bottleneck (e.g. scan with filter
over cached blocks, HBASE-9811).
> All of the optimization work is done in KeyValueHeap and does not change its public interfaces.
The new code looks more cleaner and simpler to understand.

This message was sent by Atlassian JIRA

View raw message