hbase-issues mailing list archives

From "chunhui shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9488) Improve performance for small scan
Date Tue, 10 Sep 2013 06:56:52 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762800#comment-13762800 ]

chunhui shen commented on HBASE-9488:

bq.we instead pass one arg 'boolean shortScan'. 
In the method HStore#getScanners,
{format}storeFilesToScan = this.storeEngine.getStoreFileManager().getFilesForScanOrGet(isGet,
startRow, stopRow);{format}
the 'isGet' argument is already used to select the store files, so a new argument is needed
to specify whether to use pread.
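
For readers unfamiliar with the pread versus seek+read distinction, here is a minimal sketch on a local file using only the Java standard library. It is an analogy, not HBase's or HDFS's actual read path: the class name PreadSketch, its two helper methods, and the sample data are hypothetical and exist only for illustration. The point it shows is that a positional read takes the offset as an argument and leaves the stream position untouched, whereas seek+read first moves a shared file pointer.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of HBase.
class PreadSketch {

    // seek+read: move the file pointer to the offset, then read from it.
    static String seekThenRead(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            byte[] buf = new byte[length];
            raf.readFully(buf);
            return new String(buf, StandardCharsets.US_ASCII);
        }
    }

    // pread-style: the offset is passed with the read call, so the
    // channel's own position is never modified.
    static String positionalRead(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = offset;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                throw new EOFException("file ended before " + length + " bytes were read");
            }
            pos += n;
        }
        return new String(buf.array(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("pread-sketch", ".dat");
        try {
            Files.write(file, "row1row2row3row4".getBytes(StandardCharsets.US_ASCII));
            System.out.println(seekThenRead(file, 4, 4));
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                System.out.println(positionalRead(channel, 8, 4));
                // The channel position is still 0 after the positional read.
                System.out.println(channel.position());
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Because a positional read does not depend on a shared file pointer, several readers can use the same open file without coordinating seeks, which is one reason it suits short, random reads.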

bq.Is this caching location? Will we cache a location across changes? i.e. changes in location
for the HRegionInfo?
Yes, it uses the client's existing region cache mechanism.

bq.Does this have to public +public class ClientSmallScanner extends AbstractClientScanner
The existing ClientScanner is also public, so this stays consistent with it.

bq.You should instead say that the amount of data should be small and inside the one region.
If the scan range falls within one data block, it can be considered a small scan.

bq.Should the Scan check that the stoprow is inside a single region and fail if not?
For now, I would like this to be controlled by the user. For example, if a scan crosses
multiple regions but only reads two rows, a small scan would still be the better choice.

The javadoc of Scan#small is improved in patch-V2.

review board:


> Improve performance for small scan
> ----------------------------------
>                 Key: HBASE-9488
>                 URL: https://issues.apache.org/jira/browse/HBASE-9488
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, Performance, Scanners
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-9488-trunk.patch, test results.jpg
> Currently, one scan operation makes at least 3 RPC calls:
> openScanner();
> next();
> closeScanner();
> I think we could reduce this to a single RPC call for a small scan to get better performance.
> Also using pread is better than seek+read for small scan (For this point, see more on
> The patch implements such a small scan; the performance test was set up as follows:
> a. Environment:
> patched on 0.94 version
> one regionserver; 
> one client with 50 concurrent threads;
> KV size:50/100;
> 100% LRU cache hit ratio;
> Random start row of scan
> b. Results:
> See the attached picture.
> *Usage:*
> Scan scan = new Scan(startRow,stopRow);
> scan.setSmall(true);
> ResultScanner scanner = table.getScanner(scan);
> Set the new 'small' attribute to true on the scan; everything else stays the same.
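
To make the RPC-count argument in the description concrete, here is a toy model in plain Java. It does not use the HBase client and does not reflect the patch's actual wire protocol: SmallScanRpcSketch, FakeRegionServer, and all method names inside it are hypothetical stand-ins. It only counts simulated calls, assuming the regular path costs one call each for open, next, and close, and the small-scan path folds all three into one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration class; not part of HBase.
class SmallScanRpcSketch {

    /** Stand-in for a region server that counts every simulated RPC. */
    static final class FakeRegionServer {
        private final NavigableMap<String, String> rows = new TreeMap<>();
        private NavigableMap<String, String> openRange; // state of the one open scanner
        int rpcCount = 0;

        FakeRegionServer put(String row, String value) {
            rows.put(row, value);
            return this;
        }

        // Regular path: three separate calls.
        void openScanner(String startRow, String stopRow) {
            rpcCount++;
            // Start row inclusive, stop row exclusive.
            openRange = rows.subMap(startRow, true, stopRow, false);
        }

        List<String> next() {
            rpcCount++;
            return new ArrayList<>(openRange.values());
        }

        void closeScanner() {
            rpcCount++;
            openRange = null;
        }

        // Small-scan path: open, read, and close folded into one call.
        List<String> smallScan(String startRow, String stopRow) {
            rpcCount++;
            return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).values());
        }
    }

    static List<String> regularScan(FakeRegionServer server, String startRow, String stopRow) {
        server.openScanner(startRow, stopRow);
        List<String> results = server.next();
        server.closeScanner();
        return results;
    }

    public static void main(String[] args) {
        FakeRegionServer server = new FakeRegionServer()
                .put("r1", "a").put("r2", "b").put("r3", "c");

        List<String> regular = regularScan(server, "r1", "r3");
        int regularRpcs = server.rpcCount;

        server.rpcCount = 0;
        List<String> small = server.smallScan("r1", "r3");
        int smallRpcs = server.rpcCount;

        System.out.println("regular: " + regular + " in " + regularRpcs + " calls");
        System.out.println("small:   " + small + " in " + smallRpcs + " calls");
    }
}
```

In this model both paths return the same rows; only the number of round trips differs, which is the saving the description argues for when the result set is small.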

