nifi-dev mailing list archives

From Lee Laim <lee.l...@gmail.com>
Subject Re: Question with ExtractText Processor
Date Wed, 12 Jul 2017 20:02:53 GMT
Atish,

ExtractText has a property called Maximum Buffer Size that defaults to 1 MB, but
I don't think this is what is causing your queue to build up before ExtractText.
While I don't know the exact details of your flow, I suggest you try SplitText
instead of SplitContent, as this is the more commonly used pattern.

An example template is available on the NiFi dev page:
https://cwiki.apache.org/confluence/display/NIFI/Example+Dataflow+Templates
CsvToJSON.xml
<https://cwiki.apache.org/confluence/download/attachments/57904847/CsvToJSON.xml?version=2&modificationDate=1486479474000&api=v2>
This flow shows how to convert a CSV entry to a JSON document using ExtractText
and ReplaceText.
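
To make the idea concrete outside of NiFi, here is a minimal Python sketch of
the same pattern: split the input into lines, pull each field out with one
regex capture group, and write the captured values into a JSON document. This
is not NiFi code and not the contents of the template. The field names, the
three-field layout, and the function names are assumptions for illustration
only; a 34-field file would need 34 names.

```python
import json
import re

# Hypothetical field names; the real file would list all 34 of its columns.
FIELD_NAMES = ["id", "name", "city"]

# One capture group per field, separated by literal pipes. This plays the role
# of the regular expression that would be configured on ExtractText.
LINE_PATTERN = re.compile(
    r"^" + r"\|".join([r"([^|]*)"] * len(FIELD_NAMES)) + r"$"
)


def line_to_json(line: str) -> str:
    """Convert one pipe-delimited line into a JSON object string."""
    match = LINE_PATTERN.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"expected {len(FIELD_NAMES)} fields: {line!r}")
    # Pair each captured group with its field name, then serialize.
    return json.dumps(dict(zip(FIELD_NAMES, match.groups())))


def convert(text: str) -> list[str]:
    """Split the input per line first, then convert each non-empty line."""
    return [line_to_json(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for doc in convert("1|Ann|Paris\n2|Bob|Rome\n"):
        print(doc)
```

Because each line is handled on its own, the regex only ever sees one short
record at a time, no matter how large the original file is.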


If you upgrade to NiFi 1.3, you can use the NiFi schema registry to convert
between formats.
Here is a great write-up:
http://bryanbende.com/development/2017/06/20/apache-nifi-records-and-schema-registries
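
As a rough illustration of why a schema-driven approach is simpler, the Python
sketch below reads delimited records against a list of field names and writes
them out as JSON, with no per-field regex. It only shows the general idea and
is not NiFi's API; the field list and the function name are assumptions.

```python
import csv
import io
import json

# Hypothetical schema: just the ordered field names of each record.
SCHEMA_FIELDS = ["id", "name", "city"]


def records_to_json(text: str, fields=SCHEMA_FIELDS, delimiter: str = "|") -> str:
    """Parse delimited text using the schema's field names and emit a JSON array."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=fields, delimiter=delimiter)
    # Each row is already a mapping from field name to value.
    return json.dumps([dict(row) for row in reader])


if __name__ == "__main__":
    print(records_to_json("1|Ann|Paris\n2|Bob|Rome\n"))
```

Adding a column then means adding one name to the field list rather than
editing a long regular expression.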


Thanks,
Lee



On Wed, Jul 12, 2017 at 11:24 AM, Atish Ray <atray@lexmark.com> wrote:

> Thanks!!! The regex is working for me with a smaller number of columns. I am
> facing another problem with the ExtractText processor. My pipe-delimited file
> has 34 fields. I need to extract all 34 fields and convert them to
> JSON. My file size is around 30 MB. So I am converting from CSV to JSON
> using SplitContent > ExtractText > ReplaceText. The queue is stuck before
> ExtractText. Is there any limit on the number of extracted columns?
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Question-with-ExtractText-Processor-tp16405p16412.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
>
