hbase-dev mailing list archives

From "Lars George (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1829) Make use of start/stop row in TableInputFormat
Date Thu, 17 Sep 2009 06:43:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756385#action_12756385 ]

Lars George commented on HBASE-1829:

You are right Michael, it cleans up some remnants from when we could have different numbers
of splits. It also attempts to reduce the split count to the number of regions that include
start and stop row. The idea with the comparison is to find the start key of the region just
below the start row and the end key of the region just after the stop row. 
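
To make the idea concrete, here is a minimal, untested-against-HBase sketch of the region
selection. It is not the code from the attached patch. The names SplitPruning, Region and
regionsForScan are hypothetical, plain Strings stand in for byte[] row keys, an empty key
means "unbounded", and the stop row is treated as exclusive, as it is for a Scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitPruning {

    /** Hypothetical stand-in for a region: covers rows in [startKey, endKey). */
    static final class Region {
        final String startKey; // empty for the first region
        final String endKey;   // empty for the last region

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /**
     * Keeps only the regions that overlap the scan range [startRow, stopRow).
     * An empty startRow or stopRow means the scan is unbounded on that side.
     */
    static List<Region> regionsForScan(List<Region> regions, String startRow, String stopRow) {
        List<Region> selected = new ArrayList<>();
        for (Region r : regions) {
            // The scan begins before this region ends.
            boolean startsBeforeRegionEnd =
                startRow.isEmpty() || r.endKey.isEmpty() || startRow.compareTo(r.endKey) < 0;
            // The scan ends after this region begins (stop row is exclusive).
            boolean stopsAfterRegionStart =
                stopRow.isEmpty() || stopRow.compareTo(r.startKey) > 0;
            if (startsBeforeRegionEnd && stopsAfterRegionStart) {
                selected.add(r);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Four regions split at "g", "n" and "t".
        List<Region> regions = Arrays.asList(
            new Region("", "g"), new Region("g", "n"),
            new Region("n", "t"), new Region("t", ""));
        // A scan from "h" to "p" touches only the two middle regions.
        System.out.println(regionsForScan(regions, "h", "p").size()); // prints 2
    }
}
```

With an empty start and stop row every region is kept, so the default full-table scan still
produces one split per region.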

I am not sure about the default empty end row, nor about the comparisons, in terms of equal
versus equal-or-greater etc. I just thought I would put the patch up as an idea I had, but it is
not yet tested. I will test it early next week and sort out the issues.

Question: is there a testbed that lets me set up, say, 3-4 regions so that I can construct various
test cases (like start/stop row both in the first/last region, spanning all regions, crossing
only two regions, etc.)? I am not too familiar with the test classes, and I know you guys are
changing things around. What would be a good sample to start with?

Otherwise I will test it on my live cluster that has more than enough to test with. But a
unit test seems like a good idea.

> Make use of start/stop row in TableInputFormat
> ----------------------------------------------
>                 Key: HBASE-1829
>                 URL: https://issues.apache.org/jira/browse/HBASE-1829
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Lars George
>            Assignee: Lars George
>            Priority: Minor
>             Fix For: 0.20.1
>         Attachments: HBASE-1829.patch
> Since we can now specify a start and stop row with the Scan that is handed to the TIF,
> we can reduce the splits to the regions that contain these rows. That allows testing large
> MR jobs on a single region, for example.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
