hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range
Date Thu, 01 Mar 2012 20:28:03 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220327#comment-13220327 ]

jiraposter@reviews.apache.org commented on HBASE-5489:

bq.  On 2012-03-01 06:44:36, Lars Hofhansl wrote:
bq.  > Looks good to me.
bq.  > Curious: Do you have a specific use case in mind for this API?
bq.  David Wang wrote:
bq.      Yes, I would like to avoid being forced to scan .META. every time my client just
wants the regions for a particular range when that information is already cached in the client.
 This is also more convenient for the caller than having to parse through all of the start/end
keys in the table every time.
bq.  Lars Hofhansl wrote:
bq.      Wait. TableInputFormat is already configured with a Scan object, which does exactly
the same thing (via a scanner).
bq.      You don't need a special InputFormat for this.

Sorry, that last comment was in response to your email, where you say that you want to "make a TableInputFormat
equivalent that only scans a sub-range of the table".

- Lars

This is an automatically generated e-mail. To reply, visit:

On 2012-03-01 18:24:18, David Wang wrote:
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4117/
bq.  -----------------------------------------------------------
bq.  (Updated 2012-03-01 18:24:18)
bq.  Review request for hbase.
bq.  Summary
bq.  -------
bq.  getRegionsInRange() will retrieve the HRegionLocations for the regions associated with
the specified key range, using client-side cache if possible.
bq.  I have one question: right now the endKey specified to getRegionsInRange() is treated
as inclusive.  I followed the behavior that I saw in HRegionInfo.containsRange().  However,
other HBase code such as Scan treats the endKey as exclusive.  So I am not clear as to which
way we should go here.  I can easily change the patch if we want the endKey to be exclusive;
please let me know.  Thanks in advance.
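To make the inclusive/exclusive question concrete, here is a minimal sketch of an overlap test under both readings of endKey. It is not HBase code: the class and method names are hypothetical, row keys are modeled as Strings rather than byte arrays, and a region is assumed to cover [regionStart, regionEnd) with an empty regionEnd meaning "to the end of the table".

```java
// Hypothetical illustration only; not part of the HBase client API.
class EndKeySemanticsSketch {
    // Returns true if the region [regionStart, regionEnd) overlaps the requested
    // key range. endInclusive selects whether endKey itself is part of the range.
    static boolean overlaps(String regionStart, String regionEnd,
                            String startKey, String endKey,
                            boolean endInclusive) {
        // An empty regionEnd is assumed to mean the region is unbounded above.
        boolean startsBeforeRegionEnds =
            regionEnd.isEmpty() || startKey.compareTo(regionEnd) < 0;
        int cmp = endKey.compareTo(regionStart);
        // Inclusive: a range ending exactly on regionStart still touches the region.
        // Exclusive: it does not.
        boolean endsAfterRegionStarts = endInclusive ? cmp >= 0 : cmp > 0;
        return startsBeforeRegionEnds && endsAfterRegionStarts;
    }
}
```

The two readings differ only when endKey falls exactly on a region boundary: for a region starting at "n", the range "h" to "n" includes that region when endKey is inclusive and skips it when endKey is exclusive.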
bq.  This addresses bug HBASE-5489.
bq.      https://issues.apache.org/jira/browse/HBASE-5489
bq.  Diffs
bq.  -----
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java 29b8004 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java bdeaefe 
bq.  Diff: https://reviews.apache.org/r/4117/diff
bq.  Testing
bq.  -------
bq.  Ran the TestFromClientSide unit tests and passed repeatedly.
bq.  Ran test-patch.sh with the following results:
bq.  -1 overall.  
bq.      +1 @author.  The patch does not contain any @author tags.
bq.      +1 tests included.  The patch appears to include 3 new or modified tests.
bq.      -1 javadoc.  The javadoc tool appears to have generated -129 warning messages.
bq.      +1 javac.  The applied patch does not increase the total number of javac compiler
warnings.
bq.      +1 findbugs.  The patch does not introduce any new Findbugs (version ) warnings.
bq.      +1 release audit.  The applied patch does not increase the total number of release
audit warnings.
bq.  Thanks,
bq.  David

> Add HTable accessor to get regions for a key range
> --------------------------------------------------
>                 Key: HBASE-5489
>                 URL: https://issues.apache.org/jira/browse/HBASE-5489
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: David S. Wang
>            Assignee: David S. Wang
>            Priority: Minor
>             Fix For: 0.92.1, 0.94.0
>         Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, HBASE-5489-3.patch
> It would be nice to have an accessor to find all regions that overlap with a particular
range of keys. Right now, the only way to accomplish that is to call HTable.getStartEndKeys(),
then follow that with calls to getRegionLocation() for the range of keys you are interested
in.  This algorithm has 2 drawbacks:
> * It returns more keys than are necessary most of the time.  This is especially evident
if there are a lot of regions comprising the table and the range of keys is small.
> * It always does a scan of .META. via MetaScannerVisitor for at least HTable.getStartEndKeys(),
and perhaps for HRegionLocations that are not already cached by the client.
> An accessor that limited its scans to a specified range could avoid scanning .META. at
all if the HRegionLocations being fetched were already cached by the client, thereby potentially
making this operation faster in common cases.
> Here's a proposal for the accessor:
>   /**
>    * Get the corresponding regions for an arbitrary range of keys.
>    * <p>
>    * @param startKey Starting row in range, inclusive
>    * @param endKey Ending row in range, inclusive
>    * @return A list of HRegionLocations corresponding to the regions that
>    * contain the specified range
>    * @throws IOException if a remote or network exception occurs
>    */
>   public List<HRegionLocation> getRegionsInRange(final byte [] startKey,
>     final byte [] endKey) throws IOException
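
As a rough illustration of how such an accessor could walk a sorted set of region start keys, here is a minimal sketch. It is not the patch and not the HBase API: the class is a hypothetical stand-in for a client-side region cache, row keys are modeled as Strings rather than byte arrays, locations are plain names, endKey is treated as inclusive as in the proposal, and an empty endKey gets no special "end of table" handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a client-side region cache; not the HBase API.
class RegionRangeSketch {
    // Maps each region's start key to a location name.
    // The first region of a table is assumed to start at the empty key "".
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String location) {
        regionsByStartKey.put(startKey, location);
    }

    // Returns the locations of all regions overlapping [startKey, endKey],
    // with both ends inclusive, in key order.
    List<String> getRegionsInRange(String startKey, String endKey) {
        if (startKey.compareTo(endKey) > 0) {
            throw new IllegalArgumentException("startKey must not sort after endKey");
        }
        List<String> result = new ArrayList<>();
        // The region containing startKey has the greatest start key <= startKey.
        String first = regionsByStartKey.floorKey(startKey);
        if (first == null) {
            return result; // no known region covers startKey
        }
        // Every region whose start key lies in [first, endKey] overlaps the range.
        result.addAll(regionsByStartKey.subMap(first, true, endKey, true).values());
        return result;
    }
}
```

Only the regions between the one containing startKey and the one containing endKey are visited, which is the point of the proposal: a narrow key range touches a few entries instead of every start/end key in the table.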

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

