hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Hive + Hbase scanning performance
Date Mon, 10 Feb 2014 20:37:07 GMT
The block caching won't buy you much in terms of performance.
You *must* set the scanner caching.

Note that hbase.client.scanner.caching is a global config option (see HTable.getScanner(...)),
so as long as that option is set on the Configuration seen by the HTable that Hive uses
to create the scanner, it should work.
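
To illustrate the fallback described above, here is a minimal, self-contained sketch. It is a simplified model and not the HBase source: the class and method names (ScannerCachingSketch, effectiveCaching, fetchRoundTrips) are hypothetical, a plain Map stands in for the Hadoop Configuration, and the default of 1 row per fetch is an assumption about older clients that should be checked against your release. In real client code the corresponding calls would be Configuration.setInt("hbase.client.scanner.caching", n) and Scan.setCaching(n).

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of how a scanner caching value might be resolved.
// Not HBase code: names here are hypothetical and a Map replaces Configuration.
class ScannerCachingSketch {
    static final String CACHING_KEY = "hbase.client.scanner.caching";
    // Assumed default for older clients; verify against your HBase version.
    static final int ASSUMED_DEFAULT = 1;

    // A per-scan value wins when it is positive; otherwise fall back to the
    // configuration entry, and finally to the assumed default.
    static int effectiveCaching(int scanCaching, Map<String, String> conf) {
        if (scanCaching > 0) {
            return scanCaching;
        }
        String configured = conf.get(CACHING_KEY);
        return configured == null ? ASSUMED_DEFAULT : Integer.parseInt(configured);
    }

    // Rough number of fetch calls needed to read `rows` rows. Ignores the
    // extra calls a real client makes to open and close the scanner.
    static long fetchRoundTrips(long rows, int caching) {
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        long rows = 1_000_000L;

        // The scan never calls setCaching() and the option is not configured.
        int unset = effectiveCaching(0, conf);
        System.out.println("caching=" + unset + " fetches=" + fetchRoundTrips(rows, unset));

        // The scan still does not call setCaching(), but the option is set
        // on the configuration that the table object sees.
        conf.put(CACHING_KEY, "500");
        int fromConf = effectiveCaching(0, conf);
        System.out.println("caching=" + fromConf + " fetches=" + fetchRoundTrips(rows, fromConf));
    }
}
```

The point of the model is that code which never calls setCaching() on the scan can still pick up a larger batch size, provided the option reaches the configuration used to create the table object.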

-- Lars

From: java8964 <java8964@hotmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Sent: Monday, February 10, 2014 12:19 PM
Subject: Re: Hive + Hbase scanning performance

Hi, Ted:
Our environment uses a distribution from a vendor, so it is not easy to just patch it.
But I can ask whether the vendor is willing to patch it in the next release.
Before I do that, I just want to make sure patching the code is the ONLY solution.
I read the Hive 0.9.0 source code for HiveHBaseTableInputFormat. I didn't see any place
where it invokes scan.setCaching(), so I don't think "set hbase.client.scanner.caching" in the hive
session will work, but that is just my guess. There are quite a lot of messages on the internet
saying that it will work in this case, which confused me.
What I want to confirm is that "set hbase.client.scanner.caching" in fact doesn't work in
hive for scan.setCaching(). Is that true?

Date: Mon, 13 Jan 2014 19:31:38 -0800
Subject: Re: Hive + Hbase scanning performance
From: yuzhihong@gmail.com
To: user@hbase.apache.org

You can patch HIVE-3603 into your deployment so that you can make use of
