nifi-commits mailing list archives

From "Bryan Bende (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-919) Support Splitting Avro Files
Date Wed, 02 Sep 2015 13:57:47 GMT

    [ https://issues.apache.org/jira/browse/NIFI-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727380#comment-14727380 ]

Bryan Bende commented on NIFI-919:
----------------------------------

Sean, I captured the bare record idea based on [~rdblue]'s feedback in the mailing list discussion:
https://www.mail-archive.com/dev%40nifi.apache.org/msg00291.html

As I've been thinking about it more, I'm not really sure if there is a specific use-case that
the bare record helps with. 

The main scenario I had in mind was to take a binary Avro datafile with N records and split
it into N datafiles, so that a follow-on processor like "EvaluateAvroPath" could extract a
value from each single-record datafile, and downstream processors could then make decisions
based on the extracted values. Basically the same type of thing we can already do with
SplitJson -> EvaluateJsonPath -> RouteOnAttribute or SplitXML -> EvaluateXPath
-> RouteOnAttribute.
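
To make the scenario concrete, here is a rough sketch of the split-then-route idea. It is
not NiFi or Avro code: plain Python dicts stand in for Avro records, and the names
split_records, batch_size, and the "type" field are all hypothetical, chosen only for
illustration. A real processor would presumably read and write the records with the Avro
datafile reader and writer instead.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def split_records(records: Iterable[T], batch_size: int = 1) -> Iterator[List[T]]:
    """Yield consecutive batches of at most batch_size records (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for the N records of one datafile (made-up data).
records = [{"id": i, "type": "a" if i % 2 == 0 else "b"} for i in range(5)]

# batch_size=1 gives the "N records -> N single-record files" case.
singles = list(split_records(records))

# A larger batch size gives multi-record outputs smaller than the original.
pairs = list(split_records(records, batch_size=2))

# Downstream, a value extracted from each single-record split can drive routing.
routes = [batch[0]["type"] for batch in singles]
```

With batch_size left at 1, every output holds exactly one record, which is what a
follow-on "EvaluateAvroPath" style step would need in order to extract one value per
flow file.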

> Support Splitting Avro Files
> ----------------------------
>
>                 Key: NIFI-919
>                 URL: https://issues.apache.org/jira/browse/NIFI-919
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: Bryan Bende
>            Assignee: Bryan Bende
>            Priority: Minor
>             Fix For: 0.4.0
>
>
> Provide a processor that splits an Avro file into multiple smaller files. Would be nice
> to have a configurable batch size so a user could produce single record files and also multi-record
> files of smaller size than the original. Also consider making the output format configurable,
> data file vs bare record.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
