hbase-issues mailing list archives

From "Zhan Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14801) Enhance the Spark-HBase connector catalog with json format
Date Tue, 01 Mar 2016 20:39:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174376#comment-15174376 ]

Zhan Zhang commented on HBASE-14801:
------------------------------------

The purpose of this patch is to change the HBase catalog definition to be JSON based. With
this change, the catalog is more formalized, less error prone, and easier to extend for future
features such as write support, customized SerDes, and Avro support.

For example, the following is the new format for the HBase catalog:
  def writeCatalog = s"""{
                    |"table":{"namespace":"default", "name":"table1"},
                    |"rowkey":"key",
                    |"columns":{
                    |"col0":{"cf":"rowkey", "col":"key", "type":"string"},
                    |"col1":{"cf":"cf1", "col":"col1", "type":"string"},
                    |"col2":{"cf":"cf2", "col":"col2", "type":"double"},
                    |"col3":{"cf":"cf3", "col":"col3", "type":"float"},
                    |"col4":{"cf":"cf4", "col":"col4", "type":"int"},
                    |"col5":{"cf":"cf5", "col":"col5", "type":"bigint"}
                    |}
                    |}""".stripMargin
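
Since the catalog is plain JSON, it can also be generated rather than typed by hand. The following is only a sketch of that idea: `ColumnDef` and `CatalogSketch` are hypothetical names, not part of the connector, and the helper assumes that none of the values need JSON escaping.

```scala
// Hypothetical helper for illustration; not part of the Spark-HBase connector.
final case class ColumnDef(name: String, cf: String, col: String, dataType: String)

object CatalogSketch {
  // Renders one entry of the "columns" object, for example
  // "col1":{"cf":"cf1", "col":"col1", "type":"string"}
  private def render(c: ColumnDef): String =
    s""""${c.name}":{"cf":"${c.cf}", "col":"${c.col}", "type":"${c.dataType}"}"""

  // Assembles the catalog in the layout shown above. Values are inserted
  // verbatim, so they must not contain quotes or backslashes.
  def build(namespace: String, table: String, rowkey: String,
            columns: Seq[ColumnDef]): String =
    Seq(
      "{",
      s""""table":{"namespace":"$namespace", "name":"$table"},""",
      s""""rowkey":"$rowkey",""",
      """"columns":{""",
      columns.map(render).mkString(",\n"),
      "}",
      "}"
    ).mkString("\n")
}
```

Calling `CatalogSketch.build("default", "table1", "key", columns)` with one `ColumnDef` per column should yield a string of the same shape as `writeCatalog`, with braces that balance by construction.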

Read:
  def withCatalog(cat: String): DataFrame = {
    sqlContext
      .read
      .options(Map(HBaseTableCatalog.tableCatalog->cat))
      .format("org.apache.hadoop.hbase.spark")
      .load()
  }
val df = withCatalog(writeCatalog)

Write:
    sc.parallelize(data).toDF.write
      .options(Map(HBaseTableCatalog.tableCatalog -> writeCatalog, HBaseTableCatalog.newTable -> "5"))
      .format("org.apache.hadoop.hbase.spark")
      .save()
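
The write snippet uses `data` without defining it. A minimal sketch of what it could look like follows. `HBaseRecord` and `WriteDataSketch` are hypothetical names, and the field types are an assumption based on the usual Spark SQL mapping of the catalog types (string, double, float, int, bigint) to Scala types. In Spark 1.x, `toDF` additionally needs `import sqlContext.implicits._` in scope.

```scala
// Hypothetical record type; field names and order mirror the catalog
// columns col0..col5. The Scala types are assumed from the catalog's
// "type" entries: string, string, double, float, int, bigint.
final case class HBaseRecord(
    col0: String,
    col1: String,
    col2: Double,
    col3: Float,
    col4: Int,
    col5: Long)

object HBaseRecord {
  // Convenience constructor that derives every field from one integer,
  // producing zero-padded row keys such as "row005".
  def apply(i: Int): HBaseRecord =
    HBaseRecord(f"row$i%03d", s"value $i", i.toDouble, i.toFloat, i, i.toLong)
}

object WriteDataSketch {
  // 256 sample rows, usable as the `data` argument of sc.parallelize.
  val data: Seq[HBaseRecord] = (0 until 256).map(i => HBaseRecord(i))
}
```

Zero-padding the row key keeps the keys in numeric order under HBase's lexicographic row sorting.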


> Enhance the Spark-HBase connector catalog with json format
> ----------------------------------------------------------
>
>                 Key: HBASE-14801
>                 URL: https://issues.apache.org/jira/browse/HBASE-14801
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Zhan Zhang
>            Assignee: Zhan Zhang
>         Attachments: HBASE-14801-1.patch, HBASE-14801-2.patch, HBASE-14801-3.patch, HBASE-14801-4.patch, HBASE-14801-5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
