nifi-users mailing list archives

From sudeep mishra <sudeepshekh...@gmail.com>
Subject Re: PutDistributedMapCache
Date Thu, 14 Jan 2016 07:12:12 GMT
Building the repository produces separate .nar files, so I can update just
the one I need in the lib directory.
Thanks for your help.

On Thu, Jan 14, 2016 at 9:27 AM, sudeep mishra <sudeepshekharm@gmail.com>
wrote:

> Is it possible to build the code for only a particular processor? Just
> curious if we can build and deploy a particular processor in an existing
> NiFi environment.
>
> On Wed, Jan 13, 2016 at 9:33 PM, sudeep mishra <sudeepshekharm@gmail.com>
> wrote:
>
>> Thanks Joe. I will try out the patch.
>>
>> On Wed, Jan 13, 2016 at 9:31 PM, Joe Percivall <joepercivall@yahoo.com>
>> wrote:
>>
>>> You would need to clone the nifi source from github and then apply the
>>> patch using git.
>>>
>>> Here is how to clone a repo:
>>> https://help.github.com/articles/cloning-a-repository/
>>> Along with the nifi repo itself: https://github.com/apache/nifi
>>>
>>> and how to apply a patch:
>>> http://makandracards.com/makandra/2521-git-how-to-create-and-apply-patches
>>>
>>> Let me know if you have any other questions,
>>> Joe
>>> - - - - - -
>>> Joseph Percivall
>>> linkedin.com/in/Percivall
>>> e: joepercivall@yahoo.com
>>>
>>>
>>>
>>> On Wednesday, January 13, 2016 10:56 AM, sudeep mishra <
>>> sudeepshekharm@gmail.com> wrote:
>>>
>>>
>>>
>>> Thank you very much Joe.
>>>
>>> Can you please let me know how I can use the .patch file? I am using
>>> NiFi via the binaries... Do I need to set up the source code and build it
>>> along with the patch?
>>>
>>> Thanks & Regards,
>>>
>>> Sudeep
>>>
>>>
>>> On Wed, Jan 13, 2016 at 9:02 PM, Joe Percivall <joepercivall@yahoo.com>
>>> wrote:
>>>
>>> Hello Sudeep,
>>> >
>>> >I put up a patch on the GetDistributedMapCache ticket[1]. Let me know
>>> what you think.
>>> >
>>> >The PutDistributedMapCache processor and GetDistributedMapCache work
>>> with the data as a byte[] so it should be format agnostic. That being said
>>> it will be up to you to know what is in there in order to use it later.
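
A minimal sketch of what "format agnostic" means here, assuming a hypothetical in-memory dictionary as a stand-in for the cache (the `put` and `get` helpers are illustrative and are not NiFi APIs). The cache only ever holds opaque bytes; the writer picks the encoding and the reader has to know it:

```python
import json
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the distributed map cache.
# It stores opaque bytes only, mirroring the byte[] handling described above.
cache: Dict[str, bytes] = {}


def put(key: str, value: bytes) -> None:
    """Store raw bytes under a key; the cache does not inspect them."""
    cache[key] = value


def get(key: str) -> Optional[bytes]:
    """Return the raw bytes for a key, or None if the key is absent."""
    return cache.get(key)


# The writer chooses the format (JSON in this example).
record = {"id": 42, "status": "validated"}
put("record-42", json.dumps(record).encode("utf-8"))

# The reader must know that choice in order to decode the bytes.
raw = get("record-42")
decoded = json.loads(raw.decode("utf-8"))
```

The same bytes could just as well hold Avro or plain text; nothing in the stored value records which one was used.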
>>> >
>>> >[1] https://issues.apache.org/jira/browse/NIFI-1382
>>> >
>>> >Joe
>>> >- - - - - -
>>> >Joseph Percivall
>>> >linkedin.com/in/Percivall
>>> >e: joepercivall@yahoo.com
>>> >
>>> >
>>> >
>>> >
>>> >On Tuesday, January 12, 2016 11:34 PM, sudeep mishra <
>>> sudeepshekharm@gmail.com> wrote:
>>> >
>>> >
>>> >
>>> >Thanks Joe.
>>> >
>>> >I do not have a specific configuration in mind as of now, as I am still
>>> exploring NiFi. Though I think it would be helpful to let users store and
>>> retrieve the cache values in different formats (JSON, Avro, etc.).
>>> >
>>> >Thanks & Regards,
>>> >
>>> >Sudeep
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >On Tue, Jan 12, 2016 at 9:15 PM, Joe Percivall <joepercivall@yahoo.com>
>>> wrote:
>>> >
>>> >Hello Sudeep,
>>> >>
>>> >>
>>> >>We are currently lacking a "GetDistributedMapCache" processor that
>>> corresponds to the "PutDistributedMapCache". I created a ticket[1] and will
>>> be working on it today. If you have any comments, configuration
>>> suggestions, etc. please let me know or comment on the ticket.
>>> >>
>>> >>
>>> >>[1] https://issues.apache.org/jira/browse/NIFI-1382
>>> >>
>>> >>Joe
>>> >>- - - - - -
>>> >>Joseph Percivall
>>> >>linkedin.com/in/Percivall
>>> >>e: joepercivall@yahoo.com
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>On Tuesday, January 12, 2016 9:46 AM, sudeep mishra <
>>> sudeepshekharm@gmail.com> wrote:
>>> >>
>>> >>
>>> >>
>>> >>Thanks Matt.
>>> >>
>>> >>
>>> >>In my data flow I am expected to perform certain validations on data.
>>> I am loading some SQLServer data into HDFS using Sqoop (not part of the NiFi
>>> flow). For each record in the HDFS file I have to query another database and
>>> then save the validated record again in HDFS, where it will be processed by
>>> some Spark jobs.
>>> >>
>>> >>
>>> >>Since I have to query for each record, I was planning to cache the
>>> database records against which I have to validate the HDFS data. That is why
>>> I was evaluating the DistributedCacheServer, but it looks like its purpose is
>>> different. Alternatively, can we integrate Redis or another distributed
>>> cache with NiFi? I do not see any processor for it.
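
The per-record validation described above can be sketched as a cached lookup, so the second database is queried at most once per key. This is only an illustration under stated assumptions: `fake_query`, `REFERENCE_ROWS`, and the `customer_id` field are hypothetical stand-ins, and the cache is a local dictionary, not a NiFi service or Redis.

```python
from typing import Callable, Dict, List, Optional


def make_cached_lookup(
    query_db: Callable[[str], Optional[dict]]
) -> Callable[[str], Optional[dict]]:
    """Wrap a per-key database query so each key is queried at most once."""
    cache: Dict[str, Optional[dict]] = {}

    def lookup(key: str) -> Optional[dict]:
        if key not in cache:
            # Misses (None) are cached too, so unknown keys are not re-queried.
            cache[key] = query_db(key)
        return cache[key]

    return lookup


def validate(records: List[dict], lookup: Callable[[str], Optional[dict]]) -> List[dict]:
    """Keep only records whose key exists in the reference database."""
    return [r for r in records if lookup(r["customer_id"]) is not None]


# Hypothetical stand-in for the second database.
REFERENCE_ROWS = {"c1": {"name": "Alice"}, "c2": {"name": "Bob"}}
db_calls: List[str] = []


def fake_query(key: str) -> Optional[dict]:
    db_calls.append(key)  # track how often the "database" is actually hit
    return REFERENCE_ROWS.get(key)


lookup = make_cached_lookup(fake_query)
records = [
    {"customer_id": "c1"},
    {"customer_id": "c1"},
    {"customer_id": "c9"},
    {"customer_id": "c2"},
    {"customer_id": "c9"},
]
valid = validate(records, lookup)
```

Five records result in only three database queries here, one per distinct key.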
>>> >>
>>> >>
>>> >>Appreciate your help.
>>> >>
>>> >>
>>> >>Thanks & Regards,
>>> >>
>>> >>
>>> >>Sudeep
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>On Tue, Jan 12, 2016 at 6:59 PM, Matthew Clarke <
>>> matt.clarke.138@gmail.com> wrote:
>>> >>
>>> >>Sudeep,
>>> >>>       I was a little off on my second scenario.  The DetectDuplicate
>>> processor uses the distributed cache service all on its own. Files that are
>>> routed through it are loaded into the cache if they do not already exist in
>>> the cache.  If they do already exist, they are routed to duplicate.  The
>>> PutDistributedMapCache processor was a community contribution, and there
>>> is no processor yet that makes use of the info that it caches.
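
The routing described above amounts to a put-if-absent check. A minimal sketch, assuming a local dictionary in place of the cache service; the `route` helper and the relationship names it returns are illustrative only:

```python
from typing import Dict

# Hypothetical stand-in for the cache service: keys that have been seen so far.
seen: Dict[str, bytes] = {}


def route(cache_key: str, value: bytes = b"") -> str:
    """Add the key if it is new; otherwise report it as a duplicate."""
    if cache_key in seen:
        return "duplicate"
    seen[cache_key] = value
    return "non-duplicate"


routes = [route(key) for key in ["file-a", "file-b", "file-a"]]
```

The second `file-a` is the only entry routed as a duplicate, because its key was already in the cache when it arrived.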
>>> >>>
>>> >>>       We should probably build a processor that would make use of
>>> the data that can be loaded by the PutDistributedMapCache processor.  Is there
>>> a particular use case you are trying to solve where this would be
>>> applicable?
>>> >>>
>>> >>>
>>> >>>Thanks,
>>> >>>Matt
>>> >>>
>>> >>>
>>> >>>On Tue, Jan 12, 2016 at 8:11 AM, Matthew Clarke <
>>> matt.clarke.138@gmail.com> wrote:
>>> >>>
>>> >>>Sudeep,
>>> >>>>    The DistributedMapCache is typically used to prevent the
>>> consumption of duplicate data by some of the ingest type processors
>>> (GetHBase, ListHDFS, and ListSFTP).  NiFi uses the service to keep a
>>> listing of what has been consumed so the same files are not consumed
>>> multiple times. The service can also be used to detect if duplicate data
>>> already exists within a NiFi instance or cluster. This would be the
>>> scenario where some source is pushing data to your NiFi and perhaps they
>>> push the same data more than once. You want to catch these duplicates so
>>> you can perhaps kick them out of your flow. For this you would use the
>>> PutDistributedMapCache processor to cache all incoming data and then use the
>>> DetectDuplicate processor to find those duplicates.
>>> >>>>
>>> >>>>    Was there a different use case you were looking to solve using
>>> the distributed cache service?
>>> >>>>
>>> >>>>
>>> >>>>Thanks,
>>> >>>>Matt
>>> >>>>
>>> >>>>
>>> >>>>On Tue, Jan 12, 2016 at 4:36 AM, sudeep mishra <
>>> sudeepshekharm@gmail.com> wrote:
>>> >>>>
>>> >>>>Hi,
>>> >>>>>
>>> >>>>>
>>> >>>>>I can cache some data to be used in a NiFi flow. I can see the
>>> processor PutDistributedMapCache in the documentation, which saves key-value
>>> pairs in the DistributedMapCache for NiFi, but I do not see any processor to
>>> read this data. How can I read data from the DistributedMapCache in my data flow?
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>>Thanks & Regards,
>>> >>>>>
>>> >>>>>
>>> >>>>>Sudeep Shekhar Mishra
>>> >>>>>
>>> >>>>>

-- 
Thanks & Regards,

Sudeep Shekhar Mishra

+91-9167519029
sudeepshekharm@gmail.com
