hbase-user mailing list archives

From Gokul Balakrishnan <royal...@gmail.com>
Subject Dealing with data locality in the HBase Java API
Date Wed, 04 Mar 2015 05:46:06 GMT

I'm fairly new to HBase, so I would be grateful for any assistance.

My project is as follows: use HBase as an underlying data store for an
analytics cluster (powered by Apache Spark).

I'm wondering how I can take advantage of the locality of the HBase data
during processing. In other words, if a Spark instance is running on a node
that also houses HBase data, how can it make use of the local data first?

Is there some form of metadata offered by the Java API that I could use to
organise the data into (virtual) groups based on locality, which would then
be passed on to Spark? It could be something that *identifies on which
node a particular row resides*. I found [1], but I'm not sure if this is
what I'm looking for. Could someone please point me in the right direction?

[1] https://issues.apache.org/jira/browse/HBASE-12361
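
To make the question concrete, here is a rough sketch of the grouping I have
in mind. It does not call HBase at all: the region start keys and the host
names (node-a, node-b) are made up, String keys stand in for byte[] row keys,
and the TreeMap is only a placeholder for whatever locality metadata the API
might expose. It assumes each row belongs to the region whose start key is
the greatest one not exceeding the row key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LocalityGroups {

    // Hypothetical region layout: region start key -> host serving that region.
    // The empty string is the start key of the first region, so every row key
    // has a region. These values are invented for illustration only.
    static TreeMap<String, String> exampleRegions() {
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "node-a");
        regions.put("g", "node-b");
        regions.put("p", "node-a");
        return regions;
    }

    // Assumption: a row lives in the region with the greatest start key that
    // is less than or equal to the row key.
    static String hostFor(TreeMap<String, String> regions, String rowKey) {
        return regions.floorEntry(rowKey).getValue();
    }

    // Bucket row keys by the host that serves them, preserving input order
    // within each bucket. Each bucket could then be handed to a worker
    // running on that host.
    static Map<String, List<String>> groupByHost(TreeMap<String, String> regions,
                                                 List<String> rowKeys) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String rowKey : rowKeys) {
            groups.computeIfAbsent(hostFor(regions, rowKey), h -> new ArrayList<>())
                  .add(rowKey);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("apple", "kiwi", "zebra", "melon");
        System.out.println(groupByHost(exampleRegions(), rows));
    }
}
```

What I am missing is the real source for the key-to-host mapping that the
placeholder TreeMap fakes here.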

Thanks so much!
Gokul Balakrishnan.
