nifi-commits mailing list archives

From "Alan Jackoway (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-817) Create Processors to interact with HBase
Date Sat, 12 Sep 2015 15:19:45 GMT

    [ https://issues.apache.org/jira/browse/NIFI-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742086#comment-14742086 ]

Alan Jackoway commented on NIFI-817:
------------------------------------

Is this the right place to comment on this? I couldn't find a proposal on the wiki or a PR.

I got stuck on a plane with no WiFi but with this patch open, so I have a number of thoughts and questions.

In general, it's not clear to me what use cases this is designed for or how it interacts
with the existing Kite HBase capabilities. I would like to see the user stories, or a
description of what we're trying to support here and why.

The GetHBase processor does something I don't think I've seen before in NiFi. It looks to
me like it's designed to do incremental scans of records that changed since the last scan.
That's interesting, but I'm not sure how it works in NiFi. If the processor goes down, or
work gets scheduled on a different node, will the state that holds the last scan time still
be available? It seems to me that you would need to keep that state somewhere outside of
NiFi's memory for this to work, if the goal is to process each record once when it gets updated.
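
To make the state concern concrete, here is a minimal sketch of an incremental scan whose
high-water mark survives a restart. It is not taken from the patch and uses no NiFi or HBase
API: the function names, the JSON state file, and the (row_key, modified_ms) row shape are
all assumptions for illustration, and the local file only stands in for whatever store is
reachable from every node.

```python
import json
import os
import tempfile


def load_last_scan_time(state_path):
    """Return the persisted high-water mark in ms, or 0 if nothing was saved yet."""
    try:
        with open(state_path, "r", encoding="utf-8") as f:
            return int(json.load(f)["last_scan_ms"])
    except FileNotFoundError:
        return 0


def save_last_scan_time(state_path, last_scan_ms):
    """Write the high-water mark to a temp file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"last_scan_ms": last_scan_ms}, f)
    os.replace(tmp_path, state_path)


def incremental_scan(rows, state_path):
    """rows: iterable of (row_key, modified_ms) pairs (hypothetical shape).

    Returns the keys modified after the persisted high-water mark and
    advances the mark to the newest timestamp seen.
    """
    since = load_last_scan_time(state_path)
    changed = [(key, ts) for key, ts in rows if ts > since]
    if changed:
        save_last_scan_time(state_path, max(ts for _, ts in changed))
    return [key for key, _ in changed]
```

Because the mark is read from the file on every call, a second process started against the
same path picks up where the first one stopped. The strict `ts > since` comparison is a
simplification: a row written later with a timestamp equal to the mark would be skipped.
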

Always getting the whole row could be an issue for some users. I would love to see an optional
parameter listing the columns (and perhaps column families) to restrict what gets returned.
This could work similarly to how COLUMNS => works in the HBase shell.
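
As a rough sketch of what such a parameter might accept, the snippet below parses a
comma-separated list of "family" or "family:qualifier" entries and filters a row down to the
requested cells. The comma-separated format, the function names, and the dict-based row are
assumptions for illustration, not something in the patch or the HBase client API.

```python
def parse_columns(spec):
    """Parse 'fam1, fam2:qual' into a list of (family, qualifier-or-None) pairs."""
    columns = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        family, sep, qualifier = entry.partition(":")
        if not family:
            raise ValueError(f"missing column family in {entry!r}")
        # No ':' means the whole family was requested.
        columns.append((family, qualifier if sep else None))
    return columns


def restrict_row(row, columns):
    """row: dict mapping (family, qualifier) -> value (hypothetical shape).

    Keeps only cells in a requested family or matching a requested
    family:qualifier pair. An empty column list returns the whole row.
    """
    if not columns:
        return dict(row)
    wanted_families = {fam for fam, qual in columns if qual is None}
    wanted_cells = {(fam, qual) for fam, qual in columns if qual is not None}
    return {
        cell: value
        for cell, value in row.items()
        if cell[0] in wanted_families or cell in wanted_cells
    }
```

Leaving the parameter empty keeps today's behavior of returning the whole row, so existing
flows would not need to change.
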

The PutHBaseCell processor writing only to a single cell is interesting. I guess the plan
would be for users who need to spread data across multiple cells to use Kite, while users who
want to write content to a single cell use PutHBaseCell. My main concern with PutHBaseCell
is how many flow files will need to flow into it. If I understand correctly, each row would
need to be in a separate flow file so that you can set the row attribute on it.


> Create Processors to interact with HBase
> ----------------------------------------
>
>                 Key: NIFI-817
>                 URL: https://issues.apache.org/jira/browse/NIFI-817
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>             Fix For: 0.4.0
>
>         Attachments: 0001-NIFI-817-Initial-implementation-of-HBase-processors.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
