hbase-issues mailing list archives

From "Yi Liang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17933) [hbase-spark] Support Java api for bulkload
Date Wed, 19 Apr 2017 17:49:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975161#comment-15975161 ]

Yi Liang commented on HBASE-17933:
----------------------------------

Hi Sean,
  Thanks for the review. Regarding your second comment: when I started designing the API, I
also wanted to use the approach you mentioned above. But since this is a Java API, we are
constrained by some limitations of the Java language. Here is the Scala API:
{code}
  def bulkLoad[T](rdd:RDD[T],
                  tableName: TableName,
                  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])],
                  stagingDir:String,
                  familyHFileWriteOptionsMap:
                  util.Map[Array[Byte], FamilyHFileWriteOptions] =
                  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
                  compactionExclude: Boolean = false,
                  maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE)
{code}

As you can see, the Scala API allows callers to provide a function that transforms any type T
into key-value pairs of (KeyFamilyQualifier, Array[Byte]). In Java, however, a method (actually
a class that implements the mapFunction class) cannot directly transform T into a kv pair the
way Scala can, because Java has no built-in (k, v) pair type. So I need to transform T into a
defined type first, and then convert that type into the kv pair on the Scala side.
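
To make the limitation concrete, here is a minimal, self-contained sketch of what a Java caller would have to write. It is not the code from the attached patch: Pair, CellKey, and PARSE_CSV are hypothetical names introduced only for illustration, and CellKey is a simplified stand-in for KeyFamilyQualifier whose three fields are an assumption. The point is only that the function must return an explicitly defined pair type, which the Scala side could then convert into its native tuple.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: none of these names are taken from the HBASE-17933 patch.
class BulkLoadPairSketch {

    // Explicit stand-in for a (k, v) tuple, which Java lacks as a built-in type.
    static final class Pair<K, V> {
        final K first;
        final V second;

        Pair(K first, V second) {
            this.first = first;
            this.second = second;
        }
    }

    // Simplified stand-in for KeyFamilyQualifier (field layout is an assumption).
    static final class CellKey {
        final byte[] rowKey;
        final byte[] family;
        final byte[] qualifier;

        CellKey(byte[] rowKey, byte[] family, byte[] qualifier) {
            this.rowKey = rowKey;
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Java-side mapping function: turns one "row,family,qualifier,value" record
    // into an iterator of explicit pairs instead of Scala tuples.
    static final Function<String, Iterator<Pair<CellKey, byte[]>>> PARSE_CSV = line -> {
        String[] f = line.split(",");
        List<Pair<CellKey, byte[]>> cells = new ArrayList<>();
        cells.add(new Pair<>(new CellKey(utf8(f[0]), utf8(f[1]), utf8(f[2])), utf8(f[3])));
        return cells.iterator();
    };

    public static void main(String[] args) {
        Iterator<Pair<CellKey, byte[]>> it = PARSE_CSV.apply("row1,cf,q1,value1");
        Pair<CellKey, byte[]> cell = it.next();
        System.out.println(new String(cell.first.rowKey, StandardCharsets.UTF_8)
                + " -> " + new String(cell.second, StandardCharsets.UTF_8));
    }
}
```

A Scala wrapper would then only need to map each such pair to (pair.first, pair.second) before delegating to the existing bulkLoad.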

> [hbase-spark]  Support Java api for bulkload
> --------------------------------------------
>
>                 Key: HBASE-17933
>                 URL: https://issues.apache.org/jira/browse/HBASE-17933
>             Project: HBase
>          Issue Type: New Feature
>          Components: spark
>    Affects Versions: 2.0.0
>            Reporter: Yi Liang
>            Assignee: Yi Liang
>             Fix For: 2.0.0
>
>         Attachments: HBase-17933-V1.patch
>
>
> In JavaHBaseContext, there are Java APIs for bulkPut, bulkDelete ...., but no Java API
> for bulkload. This JIRA will add a bulkload Java API to hbase-spark.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
