hbase-user mailing list archives

From Anil <anilk...@gmail.com>
Subject Parallel Scanner
Date Sat, 18 Feb 2017 08:14:35 GMT
Hi,

I am building a use case where I have to load HBase data into an in-memory
database (IMDB). I am scanning each region and loading its data into the IMDB.

I am looking at a parallel scanner
(https://issues.apache.org/jira/browse/HBASE-8504, HBASE-1935) to reduce the
load time. However,
HTable#getRegionsInRange(byte[] startKey, byte[] endKey, boolean reload) is
deprecated, and HBASE-1935 is still open.

I see that the Connection returned by ConnectionFactory is an
HConnectionImplementation by default, and that it creates HTable instances.

Do you see any issues with using the HTable obtained from the Table instance,
as below?
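            // For each region in the scan range, compute the row range to scan.
            int i = 0;
            List<HRegionLocation> regions =
                hTable.getRegionsInRange(scans.getStartRow(), scans.getStopRow(), true);

            for (HRegionLocation region : regions) {
                // The first region starts at the scan's start row.
                startRow = i == 0 ? scans.getStartRow()
                                  : region.getRegionInfo().getStartKey();
                i++;
                // The last region ends at the scan's stop row.
                endRow = i == regions.size() ? scans.getStopRow()
                                             : region.getRegionInfo().getEndKey();
                // Scan [startRow, endRow) for this region and load it into the IMDB.
            }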
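
To make the range computation above easier to discuss, here is a minimal, self-contained sketch of the same idea without any HBase dependency. It assumes the region boundaries are already available as plain byte[] pairs, sorted and contiguous, and limited to the regions that overlap the scan range. The names RegionRangeSplitter, Range, split, scanAll and describe are hypothetical and exist only for this illustration. The per-range task is a placeholder that formats the range as a string; a real loader would open a scan bounded by that range instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RegionRangeSplitter {

    /** A half-open row range [start, stop). */
    public static final class Range {
        public final byte[] start;
        public final byte[] stop;

        Range(byte[] start, byte[] stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    /**
     * Builds one range per region. Each entry of regionKeys is {startKey, endKey}.
     * The first range starts at scanStart and the last range ends at scanStop,
     * mirroring the loop in the message above.
     */
    public static List<Range> split(byte[] scanStart, byte[] scanStop,
                                    List<byte[][]> regionKeys) {
        List<Range> ranges = new ArrayList<>();
        int last = regionKeys.size() - 1;
        for (int i = 0; i <= last; i++) {
            byte[] start = (i == 0) ? scanStart : regionKeys.get(i)[0];
            byte[] stop = (i == last) ? scanStop : regionKeys.get(i)[1];
            ranges.add(new Range(start, stop));
        }
        return ranges;
    }

    /** Runs one task per range on a fixed-size pool; results keep range order. */
    public static List<String> scanAll(List<Range> ranges, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Range r : ranges) {
                // Placeholder task: a real loader would scan [r.start, r.stop) here.
                Callable<String> task = () -> describe(r);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while scanning", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a range scan failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Formats a range as "[start,stop)" for demonstration purposes. */
    static String describe(Range r) {
        return "[" + new String(r.start, StandardCharsets.UTF_8) + ","
                + new String(r.stop, StandardCharsets.UTF_8) + ")";
    }

    public static void main(String[] args) {
        List<byte[][]> regions = new ArrayList<>();
        regions.add(new byte[][] {bytes(""), bytes("g")});
        regions.add(new byte[][] {bytes("g"), bytes("p")});
        regions.add(new byte[][] {bytes("p"), bytes("")});
        List<Range> ranges = split(bytes("c"), bytes("x"), regions);
        System.out.println(scanAll(ranges, 3));
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

In this sketch the thread pool size is independent of the number of regions, so the degree of parallelism can be capped regardless of how many regions the table has.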

Are there any alternatives for achieving a parallel scan?

Thanks
