spark-issues mailing list archives

From "Tathagata Das (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)
Date Thu, 31 Jul 2014 21:41:38 GMT

    [ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081524#comment-14081524 ]

Tathagata Das commented on SPARK-2447:
--------------------------------------

Exactly! That's why I feel that both have their merits. SPARK-2447 provides lower-level,
all-inclusive interfaces with which slightly advanced users can do arbitrary things, but it
requires programming against HBase types like Put. SPARK-1127, on the other hand, provides a
simple interface that lets not-so-advanced users do a set of simple things without requiring
much HBase knowledge. They are complementary, and the latter should be implemented on top of
the former.
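
To make the layering concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: FakePut and FakeTable are hypothetical stand-ins for HBase types (not the real classes), and bulkPut / saveAsTable are made-up names, not the API proposed in either JIRA. The point is only that the simple call delegates to the lower-level one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for an HBase mutation such as Put; NOT the real HBase class.
final class FakePut {
    final String rowKey;
    final String column;
    final String value;

    FakePut(String rowKey, String column, String value) {
        this.rowKey = rowKey;
        this.column = column;
        this.value = value;
    }
}

// Hypothetical stand-in for a table connection; it only records what it is asked to write.
final class FakeTable {
    final List<FakePut> written = new ArrayList<>();

    void put(FakePut p) {
        written.add(p);
    }
}

final class LayeredHBaseSketch {
    // Lower-level interface: the caller supplies a function that builds the
    // mutation for each record, so it must know the HBase-style types.
    static <T> void bulkPut(Iterable<T> records, FakeTable table, Function<T, FakePut> toPut) {
        for (T record : records) {
            table.put(toPut.apply(record));
        }
    }

    // Simple interface built on top of bulkPut: the caller passes plain
    // {rowKey, value} string pairs and a column name, and never sees FakePut.
    static void saveAsTable(Iterable<String[]> rows, FakeTable table, String column) {
        bulkPut(rows, table, row -> new FakePut(row[0], column, row[1]));
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"row1", "a"});
        rows.add(new String[] {"row2", "b"});
        saveAsTable(rows, table, "cf:col");
        System.out.println(table.written.size() + " puts written");
    }
}
```

Anything the simple call cannot express (deletes, increments, custom mutations) would still go through the lower-level call directly.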



> Add common solution for sending upsert actions to HBase (put, deletes, and increment)
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-2447
>                 URL: https://issues.apache.org/jira/browse/SPARK-2447
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core, Streaming
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>
> Going to review the design with Tdas today.
> But my first thought is to have an extension of VoidFunction that handles the connection to HBase and allows for options such as turning auto flush off for higher throughput.
> Need to answer the following questions first.
> - Can it be written in Java or should it be written in Scala?
> - What is the best way to add the HBase dependency? (will review how Flume does this as the first option)
> - What is the best way to do testing? (will review how Flume does this as the first option)
> - How to support Python? (Python may be a separate Jira; unknown at this time)
> Goals:
> - Simple to use
> - Stable
> - Supports high load
> - Documented (may be in a separate Jira; need to ask Tdas)
> - Supports Java, Scala, and hopefully Python
> - Supports Streaming and normal Spark
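
The idea in the description of a function that owns the HBase connection and can turn auto flush off might look roughly like the Java sketch below. This is a sketch under assumptions, not the design from the JIRA: VoidFn is a local stand-in for Spark's Java VoidFunction so the example compiles on its own, BufferingTable is a hypothetical stand-in for an HBase table client, and mutations are plain strings. With auto flush off, writes are buffered and committed once per partition instead of once per record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Local stand-in for Spark's Java VoidFunction, declared here so the sketch compiles alone.
interface VoidFn<T> {
    void call(T t) throws Exception;
}

// Hypothetical stand-in for an HBase table client that can buffer writes.
final class BufferingTable {
    private final boolean autoFlush;
    private final List<String> buffer = new ArrayList<>();
    final List<String> committed = new ArrayList<>();
    int flushCount = 0;

    BufferingTable(boolean autoFlush) {
        this.autoFlush = autoFlush;
    }

    void put(String mutation) {
        buffer.add(mutation);
        if (autoFlush) {
            flush(); // one round trip per record
        }
    }

    void flush() {
        if (buffer.isEmpty()) {
            return; // nothing pending, so no round trip is counted
        }
        committed.addAll(buffer);
        buffer.clear();
        flushCount++;
    }
}

// Intended to be applied once per partition: writes every record, then flushes once at the end.
final class HBasePartitionWriter implements VoidFn<Iterable<String>> {
    private final BufferingTable table;

    HBasePartitionWriter(BufferingTable table) {
        this.table = table;
    }

    @Override
    public void call(Iterable<String> partition) {
        for (String mutation : partition) {
            table.put(mutation);
        }
        table.flush(); // commits whatever is still buffered
    }
}

final class AutoFlushDemo {
    public static void main(String[] args) {
        BufferingTable table = new BufferingTable(false); // auto flush off
        new HBasePartitionWriter(table).call(Arrays.asList("put:r1", "put:r2", "put:r3"));
        System.out.println(table.committed.size() + " mutations in " + table.flushCount + " flush(es)");
    }
}
```

In a real job the table would be opened inside the function on the executor, since client connections are generally not serializable; that part is left out here.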



--
This message was sent by Atlassian JIRA
(v6.2#6252)
