hbase-issues mailing list archives

From "Ted Malaska (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14795) Enhance the spark-hbase scan operations
Date Thu, 10 Dec 2015 17:03:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051244#comment-15051244
] 

Ted Malaska commented on HBASE-14795:
-------------------------------------

Hey Zhan

I left one comment about the sync block, and I do see that you added a bunch of try/catch
blocks.  But the problem still remains: the table and scanner can be left unclosed.

I think we need to add something like the following: 

https://github.com/apache/spark/blob/f434f36d508eb4dcade70871611fc022ae0feb56/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala#L154

You will note this comes for free if you use an InputFormat.  That raises the question: should
these changes go back into TableInputFormat, so that we just use TableInputFormat?  This
would let us maintain the table-reading code in one location, and it would also mean you don't
have to worry about the life cycle of anything.
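
To make the life cycle point concrete, here is a minimal sketch of the pattern I mean: register the cleanup with the task as soon as the resources are opened, so they get closed when the task ends even if the iterator is never fully consumed. FakeTaskContext and FakeResource are hypothetical stand-ins (assumptions for illustration, not Spark or HBase classes) for the task context and for the HBase table and scanner, so the example runs without either library on the classpath.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a task context: collects callbacks and
// runs them once when the task finishes.
final class FakeTaskContext {
  private val listeners = ArrayBuffer.empty[() => Unit]
  def addTaskCompletionListener(f: () => Unit): Unit = listeners += f
  def markTaskCompleted(): Unit = listeners.foreach(f => f())
}

// Hypothetical stand-in for an HBase table or scanner.
final class FakeResource(val name: String) extends AutoCloseable {
  @volatile var closed: Boolean = false
  override def close(): Unit = { closed = true }
}

object ScanLifecycle {
  // Opens the resources, registers their cleanup immediately, and only
  // then hands an iterator back to the caller.
  def compute(ctx: FakeTaskContext,
              rows: Seq[String]): (Iterator[String], FakeResource, FakeResource) = {
    val table = new FakeResource("table")
    val scanner = new FakeResource("scanner")
    ctx.addTaskCompletionListener { () =>
      // Close the scanner first; the finally ensures the table is
      // closed even if closing the scanner throws.
      try scanner.close() finally table.close()
    }
    (rows.iterator, table, scanner)
  }
}
```

With this shape, a caller that reads only part of the iterator still has both resources closed once the task is marked complete, which is the guarantee the try/catch blocks alone do not give.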

> Enhance the spark-hbase scan operations
> ---------------------------------------
>
>                 Key: HBASE-14795
>                 URL: https://issues.apache.org/jira/browse/HBASE-14795
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Malaska
>            Assignee: Zhan Zhang
>            Priority: Minor
>         Attachments: 0001-HBASE-14795-Enhance-the-spark-hbase-scan-operations.patch,
HBASE-14795-1.patch, HBASE-14795-2.patch, HBASE-14795-3.patch
>
>
> This is a sub-jira of HBASE-14789.  This jira focuses on replacing TableInputFormat
with a more custom scan implementation that will make the following use case more effective.
> Use case:
> You have multiple scan ranges on a single table within a single query.
TableInputFormat will scan the outer range spanning the scan start and end ranges, whereas this
implementation can be more targeted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
