hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9969) Improve KeyValueHeap using loser tree
Date Thu, 14 Nov 2013 16:55:22 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822603#comment-13822603
] 

Ted Yu commented on HBASE-9969:
-------------------------------

Putting patch on review board would make reviewing easier.

For benchmark B, can you try ColumnPaginationFilter with smaller offset ?
Giving percentage improvement in the tables is desirable.
{code}
+ */
+public class LoserTree<T> {
{code}
Please add annotation for audience.
{code}
+   * {@code tree[i]} where i > 0 stores the index to greater value between {@code value[tree[2*i]]}
+   * and {@code value[tree[2*i + 1]]}.
{code}
'value[' should be 'values[', right ?
{code}
+   * @return the index to the minimal elements.
{code}
elements -> element
{code}
+   * Pushes next value from the stream that we previously taken the minimal element from.
{code}
taken -> took
{code}
+   * Passes {@code NULL} to value if the stream has been reached EOF.
{code}
Remove 'been'
{code}
+    if (index != topIndex()) {
+      throw new IllegalArgumentException("Only the top index can be updated");
{code}
Consider including index and topIndex in the exception message.
{code}
+    if (value == null && values.get(index) != null) {
+      numOpenStreams--;
+      if (numOpenStreams < 0) {
{code}
In what condition would numOpenStreams become negative ?
{code}
+        throw new AssertionError("numOpenStreams is negative: " + numOpenStreams);
{code}
Throw IllegalStateException.
{code}
+  public List<Integer> getOpenStreamsForTesting() {
{code}
The above is used in KeyValueHeap, consider renaming.
{code}
+   * from bottom up to the root. Once it "loses", it stops there and the winner continues
to fight to up.
+   */
+  private void update(int i) {
{code}
'fight to up' -> 'fight to top'
Please add @param for i.

KeyValueHeapBenchmark.java and TestLoserTree.java need license.


> Improve KeyValueHeap using loser tree
> -------------------------------------
>
>                 Key: HBASE-9969
>                 URL: https://issues.apache.org/jira/browse/HBASE-9969
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Chao Shi
>         Attachments: hbase-9969.patch, hbase-9969.patch, kvheap-benchmark.png, kvheap-benchmark.txt
>
>
> LoserTree is the better data structure than binary heap. It saves half of the comparisons
on each next(), though the time complexity is on O(logN).
> Currently A scan or get will go through two KeyValueHeaps, one is merging KVs read from
multiple HFiles in a single store, the other is merging results from multiple stores. This
patch should improve the both cases whenever CPU is the bottleneck (e.g. scan with filter
over cached blocks, HBASE-9811).
> All of the optimization work is done in KeyValueHeap and does not change its public interfaces.
The new code looks more cleaner and simpler to understand.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Mime
View raw message