nifi-dev mailing list archives

From Joe Witt <joe.w...@gmail.com>
Subject Re: Partitioning from actual Data (FlowFile) in NiFi
Date Thu, 11 May 2017 13:53:00 GMT
Anshuman,

Hello.  Please avoid directly addressing specific developers and
instead just address the mailing list you need (dev or user).

If your data is CSV, for example, you can use RouteText to efficiently
partition the incoming sets by matching field/column values; each
resulting group then carries a flowfile attribute identifying it.  You
can bundle groups with like attributes using MergeContent, and
reference that attribute value when writing to HDFS.
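
As a rough sketch (assuming RouteText's Grouping Regular Expression
captured a yyyy-MM-dd date field, which RouteText places in the
RouteText.Group attribute), the PutHDFS Directory property could then
reference that attribute in place of now():

    /year=${RouteText.Group:toDate('yyyy-MM-dd'):format('yyyy')}/month=${RouteText.Group:toDate('yyyy-MM-dd'):format('MM')}/day=${RouteText.Group:toDate('yyyy-MM-dd'):format('dd')}/

Adjust the toDate() pattern to whatever format your date field
actually uses.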

With the new record reader/writer capabilities in Apache NiFi 1.2.0
we can now provide a record-oriented PartitionRecord processor, which
will let you easily apply this pattern to all kinds of
formats/schemas in a nice/clean way.
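
As a sketch of that processor (assuming your records have an
eventDate field; the name is just an example), PartitionRecord takes
a dynamic property mapping an attribute name to a RecordPath:

    eventDate = /eventDate

Each outgoing flowfile then contains only records sharing one
eventDate value, with that value written to the eventDate attribute,
ready to use in the HDFS directory expression.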

Joe

On Thu, May 11, 2017 at 9:49 AM, Anshuman Ghosh
<anshuman.ghosh2009@gmail.com> wrote:
> Hello everyone,
>
> It would be great if you can help me implement this use case.
>
> Is there any way (a NiFi processor) to use an attribute (field/column) value
> for partitioning when writing the final FlowFile to HDFS or other storage?
> Earlier we were using the simple system date
> (/year=${now():format('yyyy')}/month=${now():format('MM')}/day=${now():format('dd')}/)
> for this, but that doesn't make sense when we consume old data from Kafka and
> want to partition on the original date (a date field inside the Kafka message).
>
>
> Thank you!
> ______________________
>
> Kind Regards,
> Anshuman Ghosh
> Contact - +49 179 9090964
>
