hbase-dev mailing list archives

From Adrian Muraru <amur...@adobe.com>
Subject Re: When split a region, how to get row keys efficiently instead of using midkey
Date Sun, 31 Jan 2016 08:41:45 GMT

From the online docs:

When performing a table scan<http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html>
where only the row keys are needed (no families, qualifiers, values or timestamps), add a
FilterList with a MUST_PASS_ALL operator to the scanner using setFilter. The filter list should
include both a FirstKeyOnlyFilter<http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html>
and a KeyOnlyFilter<http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html>.
Using this filter combination will result in a worst case scenario of a RegionServer reading
a single value from disk and minimal network traffic to the client for a single row.

Sent from an iPhone

On Jan 30, 2016, at 14:00, onealbao <onealbao@gmail.com> wrote:

The default region split policy first finds the largest store, then the
largest store file, and finally takes the split point (midkey) of that store
file. Is there any way to efficiently get all row keys of a store file? I tried
using a ResultScanner with start/end row keys set, but the scan (executing it
and reading the scan records) takes at least 100 times longer (100 ms) than
getting the midkey directly (1 ms). Actually, I just want to get all row keys
in a range, because I would like to use my own policy to group some row keys
together, since all data in my table has similar size. Any suggestions are
appreciated.
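
For illustration only: once the row keys of a range have been collected, a
count-based grouping policy might look like the sketch below. It assumes, as
stated above, that all rows are of similar size, so that equal row counts
approximate equal data sizes. The class RowKeyGrouping and the method
splitPoints are hypothetical names, not part of HBase, and the keys are assumed
to arrive already sorted, the way a scan returns them.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an HBase class: groups sorted row keys by count.
final class RowKeyGrouping {

    /**
     * Returns split keys that divide sortedRowKeys into at most `groups`
     * contiguous groups with near-equal row counts. Each returned key is the
     * first row of the next group. Assumes rows are of similar size.
     */
    static List<byte[]> splitPoints(List<byte[]> sortedRowKeys, int groups) {
        if (groups < 1) {
            throw new IllegalArgumentException("groups must be >= 1");
        }
        List<byte[]> points = new ArrayList<>();
        int n = sortedRowKeys.size();
        int previous = 0;
        for (int g = 1; g < groups; g++) {
            // Index of the first row belonging to group g.
            int index = (int) ((long) n * g / groups);
            // Skip boundaries that would create an empty group.
            if (index > previous && index < n) {
                points.add(sortedRowKeys.get(index));
                previous = index;
            }
        }
        return points;
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add(String.format("row-%02d", i).getBytes(StandardCharsets.UTF_8));
        }
        // Three groups over ten keys: prints row-03 and row-06.
        for (byte[] point : splitPoints(keys, 3)) {
            System.out.println(new String(point, StandardCharsets.UTF_8));
        }
    }
}
```

With ten keys and three groups, the groups hold 3, 3 and 4 rows. If fewer keys
than groups are supplied, the sketch simply returns fewer split points rather
than producing empty groups.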

View this message in context: http://apache-hbase.679495.n3.nabble.com/When-split-a-region-how-to-get-row-keys-efficiently-instead-of-using-midkey-tp4077492.html
Sent from the HBase Developer mailing list archive at Nabble.com<http://nabble.com>.
