hbase-user mailing list archives

From Thomas Kwan <thomas.k...@manage.com>
Subject random reads
Date Thu, 14 Aug 2014 17:32:09 GMT
Hi there

I have a use case where I need to do a read to check whether an HBase entry
is present, and then do a put to create the entry if it is not there.

I have a script that gets a list of rowkeys from Hive and puts them in an
HDFS directory. Then I have an MR job that reads the rowkeys and does
batch reads. I am getting around 1.5K requests per second.

To try to make this faster, I am wondering if I can:

- sort and group the rowkeys based on regions
- make the MR jobs run on regions that have the data locally
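
The first idea (sorting and grouping rowkeys by region) could be sketched roughly as below. This is only an illustration under stated assumptions: RegionGrouper and groupByRegion are hypothetical names, keys are plain ASCII strings rather than the byte arrays HBase uses, and the region start keys are passed in as a list instead of being fetched from the cluster (the HBase client can return them, e.g. via getStartKeys() on the table or region locator). Each rowkey is assigned to the region with the greatest start key that is less than or equal to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RegionGrouper {

    // Hypothetical helper, not part of HBase: buckets row keys by region.
    // A row belongs to the region whose start key is the greatest start key
    // that is <= the row key. The first region's start key is the empty string.
    public static Map<String, List<String>> groupByRegion(
            List<String> rowKeys, List<String> regionStartKeys) {
        TreeMap<String, List<String>> groups = new TreeMap<>();
        for (String start : regionStartKeys) {
            groups.put(start, new ArrayList<>());
        }
        for (String key : rowKeys) {
            Map.Entry<String, List<String>> region = groups.floorEntry(key);
            if (region == null) {
                // Only possible if the start-key list lacks the empty first key.
                throw new IllegalArgumentException("No region covers key: " + key);
            }
            region.getValue().add(key);
        }
        // Sort keys within each region so reads hit the region in key order.
        for (List<String> keys : groups.values()) {
            Collections.sort(keys);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Assumed example data: three regions starting at "", "g" and "p".
        List<String> starts = Arrays.asList("", "g", "p");
        List<String> keys = Arrays.asList("zebra", "apple", "kiwi", "grape", "banana");
        System.out.println(groupByRegion(keys, starts));
    }
}
```

Each resulting bucket could then be issued as one batch of gets, so a single batch only touches one region. Real rowkeys would need HBase's unsigned byte-wise ordering rather than String comparison.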

Scan or TableInputFormat must have some code that does something similar, right?

thanks
thomas
