Mailing-List: contact dev-help@nifi.apache.org; run by ezmlm
Delivered-To: mailing list dev@nifi.apache.org
Date: Fri, 30 Nov 2018 19:47:32 +0000 (UTC)
From: DAVID SMITH
Reply-To: DAVID SMITH
Message-ID: <2122562535.824798.1543607252487@mail.yahoo.com>
Subject: Re: Help with loading a file into a cache

Hi

As requested, here is an example file with some redacted data:

ZA105:{"Aircraft Type":"Sea King", "Lifed Items":{ "port engine ser#":"RR-P1234", "starboard engine ser#":"RR-S1234","gearboxes ser#":[ "WHM1234", "WHI1234", "WHT1234" ] }}
ZA106:{"Aircraft Type":"Sea King", "Lifed Items":{ "port engine ser#":"RR-P2345", "starboard engine ser#":"RR-S2345","gearboxes ser#":[ "WHM2345", "WHI2345", "WHT2345" ] }}
ZA107:{"Aircraft Type":"Merlin", "Lifed Items":{ "port engine ser#":"RR-P3456", "starboard engine ser#":"RR-S3456","centre engine ser#":"RR-C3456","gearboxes ser#":[ "WHM3456", "WHI3456", "WHT3456" ] }}
ZA108:{"Aircraft Type":"Merlin", "Lifed Items":{ "port engine ser#":"RR-P4567", "starboard engine ser#":"RR-S4567","centre engine ser#":"RR-C4567","gearboxes ser#":[ "WHM4567", "WHI4567", "WHT4567" ] }}
ZA109:{"Aircraft Type":"Wessex", "Lifed Items":{ "port engine":"RR-P9876", "starboard engine":"RR-S9876","gearboxes":[ "WHM9876", "WHI9876", "WHT9876"] }}
ZA104:{"Aircraft Type":"Wessex", "Lifed Items":{ "port engine":"RR-P8765", "starboard engine":"RR-S8765","gearboxes":[ "WHM8765", "WHI8765", "WHT8765"] }}
ZA103:{"Aircraft Type":"Wessex", "Lifed Items":{ "port engine":"RR-P7654", "starboard engine":"RR-S7654","gearboxes":[ "WHM7654", "WHI7654", "WHT7654"] }}

What I would like is for the aircraft tail number (e.g. ZA104) to be the key of the cache item, and everything after the colon (the aircraft type and the replaceables' serial numbers) to be the cached item value. The cached item value can stay as a JSON string.

Many thanks
Dave

--------------------------------------------
On Fri, 30/11/18, Mike Thomsen wrote:

Subject: Re: Help with loading a file into a cache
To: dev@nifi.apache.org
Date: Friday, 30 November, 2018, 15:26

Dave,

Can you post a redacted example with dummy data?

Thanks,

Mike

On Fri, Nov 30, 2018 at 7:08 AM DAVID SMITH wrote:

> Hi Devs
> I am running a NiFi 1.8 cluster, each node has 128Gb of RAM. I need to
> load the contents of a file which is around 5Gb in size into a
> Key/Value cache.
> The file I want to load is produced by another company so the format it
> comes in is not negotiable.
> The file contains thousands of lines in the
> following format:-
> :{: , name>:}:{: value>, :}
> :{: , name>:}
>
> I want the index value to become the Key and everything beyond the colon
> to become the value.
> What would be the most efficient way of reading the file and parsing it
> to load into a cache? I thought of reading in the file, using a split
> content on CR/LF and then splitting on the first colon. I have noticed in
> 1.8 there are some CSV and JSON Readers (controller services). Would these
> be a better way of doing this? The problem I can see is that the file
> isn't quite a CSV and it isn't quite a JSON Array file.
> Many thanks
> Dave
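
For illustration only, the split described in this thread (tail number as the key, everything after the first colon as the value) could be sketched in plain Python as below. This is not a NiFi flow and not anyone's actual implementation: the names `parse_line`, `iter_records` and `load_cache` are hypothetical, and an ordinary dict stands in for the real Key/Value cache. It assumes one record per line and a key that contains no colon. Lines are read one at a time, so a large file would not need to be held in memory.

```python
import io
import json
from typing import Dict, Iterable, Iterator, Tuple


def parse_line(line: str) -> Tuple[str, str]:
    """Split one record on its FIRST colon only.

    Later colons belong to the JSON value and must be left alone.
    """
    key, sep, value = line.partition(":")
    key, value = key.strip(), value.strip()
    if not sep or not key or not value:
        raise ValueError(f"not a key:value record: {line!r}")
    return key, value


def iter_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Lazily yield (key, value) pairs, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield parse_line(line)


def load_cache(lines: Iterable[str], cache: Dict[str, str]) -> int:
    """Put every record into `cache` (a dict here, standing in for the
    real cache) and return the number of records stored."""
    count = 0
    for key, value in iter_records(lines):
        cache[key] = value  # value is kept as the raw JSON string
        count += 1
    return count


# Two records taken from the example file in this message.
sample = io.StringIO(
    'ZA109:{"Aircraft Type":"Wessex", "Lifed Items":{ "port engine":"RR-P9876", '
    '"starboard engine":"RR-S9876","gearboxes":[ "WHM9876", "WHI9876", "WHT9876"] }}\n'
    'ZA104:{"Aircraft Type":"Wessex", "Lifed Items":{ "port engine":"RR-P8765", '
    '"starboard engine":"RR-S8765","gearboxes":[ "WHM8765", "WHI8765", "WHT8765"] }}\n'
)

demo_cache: Dict[str, str] = {}
stored = load_cache(sample, demo_cache)

# Optional sanity check: the stored value still parses as JSON.
za104 = json.loads(demo_cache["ZA104"])
print(stored, za104["Aircraft Type"])
```

With a real file, `open(path, encoding="utf-8")` could be passed in place of `sample`, since a file object also yields its lines one at a time.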