hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2517) During reads when passed the specified time range, seek to next column
Date Thu, 15 Jul 2010 08:49:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888750#action_12888750 ]

HBase Review Board commented on HBASE-2517:
-------------------------------------------

Message from: "Pranav Khaitan" <pranavkhaitan@facebook.com>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/323/
-----------------------------------------------------------

Review request for hbase, Jonathan Gray, Karthik Ranganathan, and Kannan Muthukkaruppan.


Summary
-------

This patch addresses the following issues:

1. Once it has read the required timestamps, the QueryMatcher should return NEXT_COL so
that it does not keep reading every KV until the end of the column.

2. Before returning NEXT_COL, it also checks if any further columns are required. If no columns
are required, then it returns NEXT_ROW instead of returning NEXT_COL. This saves significant
time and another round of iteration.

3. Before seeking to NEXT_ROW, we check if we are already on the last row. If we are on the
last row, then we can return false. This avoids one more call to next() and saves time.

4. Provides useful input for HBASE-2450 and HBASE-1517, which can take advantage of these
return codes.

5. Optimizes Get queries with only one column.

6. Fixes a bug that occurred when versions were processed before filters were applied.

7. If we know (using filters/timestamps) that we don't need any more keys for a particular
column, then there should be a mechanism to send this information to ExplicitColumnTracker.
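
Items 1, 2 and 5 above can be illustrated with a minimal sketch. This is not the code
from the patch: the class ColumnDoneSketch, the method codeAfterColumnDone and the
simplified MatchCode enum are hypothetical names used only to show the decision, assuming
the requested columns are visited in order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the actual ScanQueryMatcher or
// ExplicitColumnTracker code from the patch.
final class ColumnDoneSketch {
    // Simplified stand-in for the matcher's return codes.
    enum MatchCode { INCLUDE, NEXT_COL, NEXT_ROW }

    /**
     * Called once the column at columnIndex needs no more KeyValues
     * (required versions read, or ruled out by filters/timestamps).
     * Returns NEXT_COL if further requested columns remain, else NEXT_ROW.
     */
    static MatchCode codeAfterColumnDone(List<String> requestedColumns, int columnIndex) {
        boolean moreColumns = columnIndex + 1 < requestedColumns.size();
        return moreColumns ? MatchCode.NEXT_COL : MatchCode.NEXT_ROW;
    }

    public static void main(String[] args) {
        List<String> twoColumns = Arrays.asList("a", "b");
        System.out.println(codeAfterColumnDone(twoColumns, 0)); // more columns remain
        System.out.println(codeAfterColumnDone(twoColumns, 1)); // last requested column
        // A query with a single column can move to the next row right away.
        System.out.println(codeAfterColumnDone(Arrays.asList("a"), 0));
    }
}
```

In this sketch a single-column query never returns NEXT_COL, which is one way to read
the optimization mentioned in item 5.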


This addresses bug HBASE-2517.
    http://issues.apache.org/jira/browse/HBASE-2517


Diffs
-----

  trunk/src/main/java/org/apache/hadoop/hbase/io/TimeRange.java 963961 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 963961 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java 963961 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 963961 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 963961 
  trunk/src/test/java/org/apache/hadoop/hbase/client/TestMultipleTimestamps.java 963961 

Diff: http://review.hbase.org/r/323/diff


Testing
-------

Existing tests run successfully with some of them going through the modified code path. Added
specialized unit tests for this purpose. Did manual debugging to see if the optimization is
being done and correct match codes are being returned. 


Thanks,

Pranav




> During reads when passed the specified time range, seek to next column
> ----------------------------------------------------------------------
>
>                 Key: HBASE-2517
>                 URL: https://issues.apache.org/jira/browse/HBASE-2517
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Jonathan Gray
>            Assignee: Pranav Khaitan
>             Fix For: 0.90.0
>
>
> When we are processing the stream of KeyValues in the ScanQueryMatcher, we will check
> the timestamp of the current KV against the specific TimeRange.  Currently we only check if
> it is in the range or not, returning SKIP if outside the range or continuing to other checks
> if within the range.
> The check should actually return SKIP if the stamp is greater than the TimeRange and
> NEXT_COL if the stamp is less than the TimeRange (we know we won't take any more KeyValues
> from the current column once we are below the TimeRange).
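
The check described in the issue can be sketched as follows. This is an illustration
under stated assumptions, not the patched ScanQueryMatcher: the class TimeRangeSketch and
the method checkTimestamp are hypothetical, the range is assumed to be half-open
[minStamp, maxStamp), and KeyValues within a column are assumed to arrive newest first.

```java
// Hypothetical illustration only; names and the half-open range are assumptions.
final class TimeRangeSketch {
    // Simplified stand-in for the matcher's return codes.
    enum MatchCode { INCLUDE, SKIP, NEXT_COL }

    /**
     * Compares a KeyValue timestamp against [minStamp, maxStamp).
     * Above the range: SKIP, because older versions of this column may still match.
     * Below the range: NEXT_COL, because every remaining version is older still.
     * Inside the range: INCLUDE here stands for "continue to the other checks".
     */
    static MatchCode checkTimestamp(long timestamp, long minStamp, long maxStamp) {
        if (timestamp >= maxStamp) {
            return MatchCode.SKIP;
        }
        if (timestamp < minStamp) {
            return MatchCode.NEXT_COL;
        }
        return MatchCode.INCLUDE;
    }

    public static void main(String[] args) {
        // Versions of one column, newest first, checked against the range [10, 20).
        long[] versions = {25L, 15L, 5L, 1L};
        for (long ts : versions) {
            MatchCode code = checkTimestamp(ts, 10L, 20L);
            System.out.println(ts + " -> " + code);
            if (code == MatchCode.NEXT_COL) {
                break; // the remaining, older versions are never read
            }
        }
    }
}
```

With these inputs the loop stops at timestamp 5 and never examines timestamp 1, which is
the saving the issue asks for.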

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

