hbase-issues mailing list archives

From "Jerry He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15223) Make convertScanToString public for Spark
Date Sat, 06 Feb 2016 20:25:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135991#comment-15135991
] 

Jerry He commented on HBASE-15223:
----------------------------------

The main change in the patch is to make convertScanToString and convertStringToScan in TableMapReduceUtil
public so that external users can call them.
Users don't need to be concerned with the internals of the conversion.
The other part of the patch just uses the Scan JSON in toString() instead of
the encoded string.
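
To illustrate the round trip these two methods provide, here is a minimal stand-in sketch
that uses only the JDK. It does not use any HBase classes: ScanSpec is a hypothetical,
simplified substitute for Scan, and the newline-joined encoding is an assumption made for
the example (HBase serializes the real Scan differently). The property key shown is the one
TableInputFormat is understood to read the scan from. The point is only the shape of the
flow: build the scan, convert it to a string, put it in the job configuration, and decode
it on the reading side.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the Scan-to-property round trip; no HBase classes are used. */
final class ScanPropertySketch {

    // Assumed to match the key TableInputFormat reads the serialized scan from.
    static final String SCAN_KEY = "hbase.mapreduce.scan";

    /** Hypothetical, simplified scan description (not HBase's Scan). */
    static final class ScanSpec {
        final String startRow;
        final String stopRow;
        final String rowPrefixFilter;

        ScanSpec(String startRow, String stopRow, String rowPrefixFilter) {
            this.startRow = startRow;
            this.stopRow = stopRow;
            this.rowPrefixFilter = rowPrefixFilter;
        }
    }

    /** Encodes the scan as a Base64 string that is safe to store as a property value. */
    static String convertScanToString(ScanSpec scan) {
        // Illustrative encoding only; fields must not contain newlines.
        String joined = String.join("\n", scan.startRow, scan.stopRow, scan.rowPrefixFilter);
        return Base64.getEncoder().encodeToString(joined.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes a string produced by convertScanToString back into a scan. */
    static ScanSpec convertStringToScan(String encoded) {
        String joined = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        String[] parts = joined.split("\n", -1); // -1 keeps trailing empty fields
        return new ScanSpec(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        // Caller side: create and populate the scan, then store it as a property.
        ScanSpec scan = new ScanSpec("row-000", "row-999", "row-1");
        Map<String, String> conf = new HashMap<>();
        conf.put(SCAN_KEY, convertScanToString(scan));

        // Reader side: recover the scan from the property.
        ScanSpec decoded = convertStringToScan(conf.get(SCAN_KEY));
        System.out.println(decoded.startRow + " .. " + decoded.stopRow
                + " prefix=" + decoded.rowPrefixFilter);
    }
}
```

With the real API, the caller would presumably pass an HBase Scan to
TableMapReduceUtil.convertScanToString() and set the result on the Hadoop Configuration
handed to newAPIHadoopRDD, without needing to know how the string is produced.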

> Make convertScanToString public for Spark
> -----------------------------------------
>
>                 Key: HBASE-15223
>                 URL: https://issues.apache.org/jira/browse/HBASE-15223
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jerry He
>            Assignee: Jerry He
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: HBASE-15223-master.patch
>
>
> One way to access HBase from Spark is to use newAPIHadoopRDD, which can take TableInputFormat
> as the class name.  But we are not able to set a Scan object there, for example to set an HBase
> filter.
> In MR, the public API TableMapReduceUtil.initTableMapperJob() or an equivalent is used,
> which can take a Scan object.  But this call cannot be used conveniently from Spark.
> We need to make TableMapReduceUtil.convertScanToString() public
> so that a Scan object can be created, populated, and then converted to the property
> used by Spark.  The conversion methods are currently package private.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
