nifi-dev mailing list archives

From DAVID SMITH <davidrsm...@btinternet.com.INVALID>
Subject Re: Help with loading a file into a cache
Date Fri, 30 Nov 2018 19:47:32 GMT
Hi 

As requested, here is an example file with some redacted data:

ZA105:{"Aircraft Type":"Sea King", "Lifed Items":{ "port engine ser#":"RR-P1234", "starboard engine ser#":"RR-S1234","gearboxes ser#":[ "WHM1234", "WHI1234", "WHT1234" ] }}
ZA106:{"Aircraft Type":"Sea King", "Lifed Items":{ "port engine ser#":"RR-P2345", "starboard engine ser#":"RR-S2345","gearboxes ser#":[ "WHM2345", "WHI2345", "WHT2345" ] }}
ZA107:{"Aircraft Type":"Merlin", "Lifed Items":{ "port engine ser#":"RR-P3456", "starboard engine ser#":"RR-S3456","centre engine ser#":"RR-C3456","gearboxes ser#":[ "WHM3456", "WHI3456", "WHT3456" ] }}
ZA108:{"Aircraft Type":"Merlin", "Lifed Items":{ "port engine ser#":"RR-P4567", "starboard engine ser#":"RR-S4567","centre engine ser#":"RR-C4567","gearboxes ser#":[ "WHM4567", "WHI4567", "WHT4567" ] }}
ZA109:{"Aircraft Type":"Wessex", "Lifed Items":{ "port engine":"RR-P9876", "starboard engine":"RR-S9876","gearboxes":[ "WHM9876", "WHI9876", "WHT9876" ] }}
ZA104:{"Aircraft Type":"Wessex", "Lifed Items":{ "port engine":"RR-P8765", "starboard engine":"RR-S8765","gearboxes":[ "WHM8765", "WHI8765", "WHT8765" ] }}
ZA103:{"Aircraft Type":"Wessex", "Lifed Items":{ "port engine":"RR-P7654", "starboard engine":"RR-S7654","gearboxes":[ "WHM7654", "WHI7654", "WHT7654" ] }}



What I would like is for the aircraft tail number (e.g. ZA104) to be the key of the cache item,
and everything after the colon (the aircraft type and the serial numbers of the replaceable
items) to be the cached item value. The cached item value can stay as a JSON string.
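
As a rough illustration of the split described above, here is a minimal Python sketch. It is not a NiFi flow: the function names parse_records and load_cache are hypothetical, a plain dict stands in for whatever cache is actually used, and it assumes one record per line in UTF-8. It reads the file line by line so the whole 5 GB never has to be held as a single string, splits each line on the first colon only, and keeps the value as a JSON string.

```python
import json
from typing import Dict, Iterable, Iterator, Tuple


def parse_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (tail_number, json_string) pairs from lines like 'ZA104:{...}'."""
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between records
        # Split on the FIRST colon only; the JSON value contains colons too.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"line {line_no}: no colon separator found")
        value = value.strip()
        # Check the value parses as JSON, but keep it as a string.
        json.loads(value)
        yield key.strip(), value


def load_cache(path: str) -> Dict[str, str]:
    """Hypothetical loader: a dict stands in for the real key/value cache."""
    cache: Dict[str, str] = {}
    with open(path, encoding="utf-8") as handle:
        # Iterating the file object streams it one line at a time.
        for key, value in parse_records(handle):
            cache[key] = value
    return cache
```

Using partition rather than a general split means only the first colon separates key from value, so colons inside the JSON are left alone. A malformed line raises ValueError with its line number instead of being loaded silently.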


Many thanks

Dave
--------------------------------------------
On Fri, 30/11/18, Mike Thomsen <mikerthomsen@gmail.com> wrote:

 Subject: Re: Help with loading a file into a cache
 To: dev@nifi.apache.org
 Date: Friday, 30 November, 2018, 15:26
 
 Dave,
 
 Can you post a redacted example with dummy data?
 
 Thanks,
 
 Mike
 
 On Fri, Nov 30, 2018 at 7:08 AM DAVID SMITH <davidrsmith@btinternet.com.invalid> wrote:
 
 > Hi Devs
 > I am running a NiFi 1.8 cluster; each node has 128 GB of RAM. I need to
 > load the contents of a file, which is around 5 GB in size, into a
 > Key/Value cache.
 > The file I want to load is produced by another company, so the format it
 > comes in is not negotiable. The file contains thousands of lines in the
 > following format:
 >
 > <index value1>:{<property1 name>: <property1 value>, <property2 name>:<property2 value>}
 > <index value2>:{<property1 name>: <property1 value>, <property2 name>:<property2 value>}
 > <index value3>:{<property1 name>: <property1 value>, <property2 name>:<property2 value>}
 >
 > I want the index value to become the key and everything beyond the colon
 > to become the value.
 > What would be the most efficient way of reading the file and parsing it
 > to load into a cache? I thought of reading in the file, using a split
 > content on CR/LF, and then splitting on the first colon. I have noticed that
 > in 1.8 there are some CSV and JSON Readers (controller services). Would these
 > be a better way of doing this? The problem I can see is that the file
 > isn't quite a CSV and it isn't quite a JSON Array file.
 > Many thanks
 > Dave
 
