nifi-users mailing list archives

From Joe Percivall <joeperciv...@yahoo.com>
Subject Re: PutDistributedMapCache
Date Thu, 14 Jan 2016 15:56:36 GMT
Hello Sudeep,
Sorry, I am not quite following your emails. Did you need more help importing the processor?
Currently the way you would clear a DistributedMapCache is to just remove the DistributedMapCacheServer
controller service and make a new one.
Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: joepercivall@yahoo.com
 

    On Thursday, January 14, 2016 7:04 AM, sudeep mishra <sudeepshekharm@gmail.com> wrote:
 

 Thanks Joe. The GetDistributedMapCache seems to be working fine. 
Is there a way to clear DistributedMapCache on demand?
Regards,
Sudeep
On Thu, Jan 14, 2016 at 12:42 PM, sudeep mishra <sudeepshekharm@gmail.com> wrote:

Building the repository produces separate .nar files, and I can replace just the ones I need in the lib
directory. Thanks for your help.
On Thu, Jan 14, 2016 at 9:27 AM, sudeep mishra <sudeepshekharm@gmail.com> wrote:

Is it possible to build the code for only a particular processor? Just curious if we can build
and deploy a particular processor in an existing NiFi environment.
On Wed, Jan 13, 2016 at 9:33 PM, sudeep mishra <sudeepshekharm@gmail.com> wrote:

Thanks Joe. I will try out the patch.
On Wed, Jan 13, 2016 at 9:31 PM, Joe Percivall <joepercivall@yahoo.com> wrote:

You would need to clone the nifi source from github and then apply the patch using git.

Here is how to clone a repo: https://help.github.com/articles/cloning-a-repository/
Along with the nifi repo itself: https://github.com/apache/nifi

and how to apply a patch: http://makandracards.com/makandra/2521-git-how-to-create-and-apply-patches
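
As a rough sketch of those steps, the commands can be assembled like this. The patch filename and checkout directory are hypothetical placeholders, and the Maven build step is an assumption about how the patched source would then be built; none of these names come from the ticket itself.

```python
import os
import subprocess


def patch_and_build_commands(repo_url, checkout_dir, patch_file):
    """Return the clone, patch and build commands in the order they should run."""
    # git -C changes directory before running, so the patch path must be absolute.
    patch_path = os.path.abspath(patch_file)
    return [
        ["git", "clone", repo_url, checkout_dir],
        # --check only verifies that the patch applies cleanly; it changes nothing.
        ["git", "-C", checkout_dir, "apply", "--check", patch_path],
        ["git", "-C", checkout_dir, "apply", patch_path],
        # Assumed build step: a Maven build driven by the checkout's top-level pom.xml.
        ["mvn", "-f", os.path.join(checkout_dir, "pom.xml"), "clean", "install"],
    ]


def run_all(commands):
    """Run each command in turn and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Example (not executed here):
# run_all(patch_and_build_commands("https://github.com/apache/nifi", "nifi", "NIFI-1382.patch"))
```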

Let me know if you have any other questions,
Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: joepercivall@yahoo.com



On Wednesday, January 13, 2016 10:56 AM, sudeep mishra <sudeepshekharm@gmail.com> wrote:



Thank you very much Joe.

Can you please let me know how I can use the .patch file? I am using NiFi from the binaries...
Do I need to set up the source code and build it with the patch applied?

Thanks & Regards,

Sudeep


On Wed, Jan 13, 2016 at 9:02 PM, Joe Percivall <joepercivall@yahoo.com> wrote:

Hello Sudeep,
>
>I put up a patch on the GetDistributedMapCache ticket[1]. Let me know what you think.
>
>Both the PutDistributedMapCache and GetDistributedMapCache processors work with the data as
a byte[], so they should be format agnostic. That said, it is up to you to know what format
the cached data is in so you can use it later.
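
To illustrate the point about opaque bytes, here is a minimal sketch in which a plain dict stands in for the cache. The helper names and the choice of UTF-8 JSON are assumptions for illustration only, not part of the processors.

```python
import json


def to_cache_bytes(record):
    """Serialize a record to UTF-8 JSON bytes before it is cached."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def from_cache_bytes(raw):
    """Decode bytes read back from the cache; the caller must know they hold JSON."""
    return json.loads(raw.decode("utf-8"))


# The stand-in cache only ever sees bytes; it has no idea they contain JSON.
cache = {}
cache[b"customer:42"] = to_cache_bytes({"id": 42, "name": "Ada"})
restored = from_cache_bytes(cache[b"customer:42"])
```

The same bytes could just as well hold Avro or plain text; only the writer and the reader need to agree on the format.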
>
>[1] https://issues.apache.org/jira/browse/NIFI-1382
>
>Joe
>- - - - - -
>Joseph Percivall
>linkedin.com/in/Percivall
>e: joepercivall@yahoo.com
>
>
>
>
>On Tuesday, January 12, 2016 11:34 PM, sudeep mishra <sudeepshekharm@gmail.com> wrote:
>
>
>
>Thanks Joe.
>
>I do not have a specific configuration in mind yet, as I am still exploring NiFi. I do
think it would be helpful to let users store and retrieve cache values in different formats
(JSON, Avro, etc.).
>
>Thanks & Regards,
>
>Sudeep
>
>
>
>
>
>On Tue, Jan 12, 2016 at 9:15 PM, Joe Percivall <joepercivall@yahoo.com> wrote:
>
>Hello Sudeep,
>>
>>
>>We are currently lacking a "GetDistributedMapCache" processor that corresponds to
the "PutDistributedMapCache". I created a ticket[1] and will be working on it today. If you
have any comments, configuration suggestions, etc. please let me know or comment on the ticket.
>>
>>
>>[1] https://issues.apache.org/jira/browse/NIFI-1382
>>
>>Joe
>>- - - - - -
>>Joseph Percivall
>>linkedin.com/in/Percivall
>>e: joepercivall@yahoo.com
>>
>>
>>
>>
>>
>>On Tuesday, January 12, 2016 9:46 AM, sudeep mishra <sudeepshekharm@gmail.com> wrote:
>>
>>
>>
>>Thanks Matt.
>>
>>
>>In my data flow I need to perform certain validations on the data. I am loading
some SQL Server data into HDFS using Sqoop (not part of the NiFi flow). For each record in the HDFS
file I have to query another database and then save the validated record back to HDFS, where
it will be processed by some Spark jobs.
>>
>>
>>Since I have to run a query for each record, I was planning to cache the database records
against which I have to validate the HDFS data. That is why I was evaluating the DistributedCacheServer,
but it looks like its purpose is different. Alternatively, can we integrate Redis or another distributed
cache with NiFi? I do not see any processor for it.
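
The idea of loading the reference records once and validating each record by lookup, instead of querying per record, might be sketched as follows. The record layout, the "active" status value and both function names are hypothetical.

```python
def load_reference(rows):
    """Build an in-memory lookup from (record_id, status) rows fetched once."""
    return {record_id: status for record_id, status in rows}


def validate(records, reference):
    """Split records into valid and invalid using cached lookups only."""
    valid, invalid = [], []
    for record in records:
        # Hypothetical rule: a record is valid if its id is cached as "active".
        if reference.get(record["id"]) == "active":
            valid.append(record)
        else:
            invalid.append(record)
    return valid, invalid
```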
>>
>>
>>Appreciate your help.
>>
>>
>>Thanks & Regards,
>>
>>
>>Sudeep
>>
>>
>>
>>
>>On Tue, Jan 12, 2016 at 6:59 PM, Matthew Clarke <matt.clarke.138@gmail.com> wrote:
>>
>>Sudeep,
>>>       I was a little off on my second scenario. The DetectDuplicate processor
uses the distributed cache service all on its own. Files that are routed through it are loaded
into the cache if they do not already exist in the cache. If they do already exist, they
are routed to duplicate. The PutDistributedMapCache processor was a community contribution,
and there are currently no processors that make use of the information it caches.
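
The behaviour described here amounts to a put-if-absent check. A minimal sketch, with a plain dict standing in for the cache service; the function name and the returned route labels are illustrative and not NiFi's API.

```python
def route(cache, key, value):
    """Cache the key if it is new; report a duplicate if it was already cached."""
    if key in cache:
        return "duplicate"
    cache[key] = value
    return "non-duplicate"


cache = {}
# The second "a.csv" arrives after the first one has already been cached.
results = [route(cache, name, b"") for name in ["a.csv", "b.csv", "a.csv"]]
```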
>>>
>>>       We should probably build a processor that makes use of the data
that can be loaded by the PutDistributedMapCache processor. Is there a particular use case you
are trying to solve where this would be applicable?
>>>
>>>
>>>Thanks,
>>>Matt
>>>
>>>
>>>On Tue, Jan 12, 2016 at 8:11 AM, Matthew Clarke <matt.clarke.138@gmail.com> wrote:
>>>
>>>Sudeep,
>>>>    The DistributedMapCache is typically used to prevent the consumption
of duplicate data by some of the ingest-type processors (GetHBase, ListHDFS, and ListSFTP).
NiFi uses the service to keep a listing of what has been consumed so that the same files are not
consumed multiple times. The service can also be used to detect whether duplicate data already
exists within a NiFi instance or cluster. This would be the scenario where some source is
pushing data to your NiFi and perhaps pushes the same data more than once. You want to
catch these duplicates so you can kick them out of your flow. For this you would use
the PutDistributedMapCache processor to cache all incoming data and then use the DetectDuplicate
processor to find those duplicates.
>>>>
>>>>    Was there a different use case you were looking to solve using the Distributed
cache service?
>>>>
>>>>
>>>>Thanks,
>>>>Matt
>>>>
>>>>
>>>>On Tue, Jan 12, 2016 at 4:36 AM, sudeep mishra <sudeepshekharm@gmail.com> wrote:
>>>>
>>>>Hi,
>>>>>
>>>>>
>>>>>I can cache some data to be used in a NiFi flow. I can see the processor
PutDistributedMapCache in the documentation, which saves key-value pairs in the DistributedMapCache
for NiFi, but I do not see any processor to read this data. How can I read data from the DistributedMapCache
in my data flow?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>Thanks & Regards,
>>>>>
>>>>>
>>>>>Sudeep Shekhar Mishra
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>>--
>>
>>Thanks & Regards,
>>
>>
>>Sudeep Shekhar Mishra
>>
>>
>>+91-9167519029
>>sudeepshekharm@gmail.com
>>
>>


