hbase-user mailing list archives

From 罗辉 <luo...@ifeng.com>
Subject question about SparkSQL loading hbase tables
Date Wed, 29 Jun 2016 02:13:56 GMT
Hi there,
     I am using SparkSQL to read from HBase, but I have two questions:

1.       Some APIs are not available in my dependencies. Which artifacts do I need to add to get them?

org.apache.hadoop.hbase.spark.example.hbasecontext

org.apache.spark.sql.datasources.hbase.HBaseTableCatalog

org.apache.hadoop.hbase.spark.datasources.HBaseSparkConf

2.       Is there a complete code example showing how to read from and write to HBase with SparkSQL?

The document I referred to is this: http://hbase.apache.org/book.html#_sparksql_dataframes. It
seems to be a snapshot for 2.0, while I am using HBase 1.2.1 + Spark 1.6.1 + Hadoop 2.7.1.



In my app, I want to load the entire HBase table into SparkSQL.
My code:

import org.apache.spark._
import org.apache.hadoop.hbase._
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.spark.example.hbasecontext
import org.apache.spark.sql.datasources.hbase.HBaseTableCatalog
import org.apache.hadoop.hbase.spark.datasources.HBaseSparkConf

object HbaseConnector {
  def main(args: Array[String]) {
    val tableName = args(0)
    val sparkMasterUrlDev = "spark://hadoopmaster:7077"
    val sparkMasterUrlLocal = "local[2]"

    val sparkConf = new SparkConf()
      .setAppName("HbaseConnector for table " + tableName)
      .setMaster(sparkMasterUrlDev)
      .set("spark.executor.memory", "10g")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "z1,z2,z3")
    conf.set("hbase.zookeeper.property.clientPort", "2181")
    conf.set("hbase.rootdir", "hdfs://hadoopmaster:8020/hbase")
    //    val hbaseContext = new HBaseContext(sc, conf)

    val pv = sqlContext.read
      .options(Map(HBaseTableCatalog.tableCatalog -> writeCatalog, HBaseSparkConf.TIMESTAMP -> tsSpecified.toString))
      .format("org.apache.hadoop.hbase.spark")
      .load()
    pv.write.saveAsTable(tableName)

  }

}
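
Note that the code above passes `writeCatalog` and `tsSpecified` to `options` without defining them. In the reference guide linked above, the catalog is a JSON string that maps HBase column families and qualifiers to DataFrame columns, and the timestamp is a `Long`. A minimal sketch of what those two values could look like follows; the table name `table1`, the column family `cf1`, the column names, and the timestamp value are all placeholders assumed for illustration, not my real schema:

```scala
// Sketch only: every name below is a hypothetical placeholder.
object CatalogSketch {
  // JSON catalog in the shape shown in the HBase reference guide:
  // the row key is declared under the special column family "rowkey",
  // and every other column names its column family and qualifier.
  val writeCatalog: String =
    """{
      |"table":{"namespace":"default", "name":"table1"},
      |"rowkey":"key",
      |"columns":{
      |"col0":{"cf":"rowkey", "col":"key", "type":"string"},
      |"col1":{"cf":"cf1", "col":"col1", "type":"string"}
      |}
      |}""".stripMargin

  // Placeholder timestamp in milliseconds since the epoch.
  val tsSpecified: Long = 0L

  def main(args: Array[String]): Unit = {
    // Print the catalog so it can be checked before handing it to Spark.
    println(writeCatalog)
  }
}
```

This block needs no HBase or Spark dependencies, so it compiles on its own; it only builds the strings that the `options` call expects.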

My POM file is attached as well.

Thanks for your help.

San.Luo