nifi-dev mailing list archives

From Matt Burgess <mattyb...@apache.org>
Subject Re: Record Readers and Writers
Date Tue, 21 Apr 2020 19:29:38 GMT
Dave,

That JSON is actually the schema of the data, not the data itself.
Avro schemas are indeed stored in JSON format. Under the hood we have
utilities for converting back and forth between Avro schemas and the
internal object representations of NiFi Record schemas. We don't
serialize the Record schemas to text; instead we just convert to an
Avro schema (in JSON format) and send that along with the flow file
(if the Schema Write Strategy indicates to do so). That way, other
tools that know about Avro schemas don't also have to know about
some NiFi-specific schema text format.
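
To make the schema-versus-data distinction concrete, here is a minimal
sketch in plain Python (standard library only, not NiFi code). It parses
the schema text from the test output quoted below and checks a record
against it. The helper names field_types and conforms, the PY_TYPES
mapping, and the sample record are all hypothetical, made up purely for
illustration; NiFi's own conversion utilities are Java and work
differently.

```python
import json

# Schema text as logged by AvroSchemaTextStrategy in the quoted test output.
SCHEMA_TEXT = """
{
  "namespace": "nifi",
  "name": "test",
  "type": "record",
  "fields": [
    { "name": "ID", "type": "string" },
    { "name": "NAME", "type": "string" },
    { "name": "AGE", "type": "int" },
    { "name": "COUNTRY", "type": "string" }
  ]
}
"""

# Hypothetical mapping covering only the two primitive types used above.
PY_TYPES = {"string": str, "int": int}


def field_types(schema_text):
    """Return {field name: Avro type name} for a flat record schema."""
    schema = json.loads(schema_text)
    return {field["name"]: field["type"] for field in schema["fields"]}


def conforms(record, schema_text):
    """Check that a dict has exactly the schema's fields with matching types."""
    types = field_types(schema_text)
    if set(record) != set(types):
        return False
    return all(type(record[name]) is PY_TYPES[t] for name, t in types.items())


if __name__ == "__main__":
    # The schema describes the shape of a record; it carries no values.
    print(field_types(SCHEMA_TEXT))
    # A made-up record (the data) that happens to match that shape.
    sample = {"ID": "1", "NAME": "Dave", "AGE": 42, "COUNTRY": "UK"}
    print(conforms(sample, SCHEMA_TEXT))
```

The point is only that the JSON in the log describes field names and
types; the record values travel separately, in whatever format the
reader or writer handles.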

Regards,
Matt



On Tue, Apr 21, 2020 at 2:26 PM DAVID SMITH
<davidrsmith@btinternet.com.invalid> wrote:
>
> Hi Matt
> Thanks for your reply. I will certainly take on board everything you and Andy advise, look at the classes you mentioned, and read the links provided.
> I ran TestXMLReader as a JUnit test in Eclipse; a sample of the console output is:
> 20:15:01.675 [pool-1-thread-1] DEBUG org.apache.nifi.schema.access.AvroSchemaTextStrategy - For {path=target, filename=253762304418.mockFlowFile, xml.stream.is.array=true, uuid=34fb0980-8fc3-4c41-b4f5-3078d26b6f67} found schema text {
>   "namespace": "nifi",
>   "name": "test",
>   "type": "record",
>   "fields": [
>     { "name": "ID", "type": "string" },
>     { "name": "NAME", "type": "string" },
>     { "name": "AGE", "type": "int" },
>     { "name": "COUNTRY", "type": "string" }
>   ]
> }
>
>
> Anyway, thanks again; I have something to go on now.
> Dave
> On Tuesday, 21 April 2020, 17:47:21 BST, Andy LoPresto <alopresto@apache.org> wrote:
>
>  Hi Dave,
>
> The underlying internal “record format” is not JSON. Avro [1] is used to describe schemas across all record formats, but the internal data storage is NiFi-specific. You may be interested in these articles by Mark Payne and Bryan Bende [2][3][4] and the potential use of the ScriptedReader [5] or ScriptedRecordSetWriter [6] to prototype your needed conversions.
>
> [1] https://avro.apache.org/
> [2] https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
> [3] https://blogs.apache.org/nifi/entry/real-time-sql-on-event
> [4] https://bryanbende.com/development/2017/06/20/apache-nifi-records-and-schema-registries
> [5] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-scripting-nar/1.11.4/org.apache.nifi.record.script.ScriptedReader/index.html
> [6] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-scripting-nar/1.11.4/org.apache.nifi.record.script.ScriptedRecordSetWriter/index.html
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> He/Him
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> > On Apr 21, 2020, at 6:01 AM, DAVID SMITH <davidrsmith@btinternet.com.INVALID> wrote:
> >
> > Hi
> > I want to use the ConvertRecord processor with its underlying Record Readers and Writers to convert files from XML or JSON to a bespoke format, and probably vice versa. I have looked at the Readers/Writers currently provided and decided that I can use the XML/JSON ones, but I will need to write something for the bespoke format. So I started looking at the current source code for the Readers/Writers to see how they work and what I would need to do. When running the unit tests on the XMLReader, I noticed that the console output is in JSON format. My question is: is JSON the common format that all records are converted to and from?
> > Also, is there any specific documentation on writing Readers/Writers? I have only found the developer's guide.
> > Many thanks
> > Dave
> >
>
