nifi-dev mailing list archives

From Joe Gresock <jgres...@gmail.com>
Subject Re: DistributedMapCacheServer question
Date Wed, 08 Mar 2017 11:06:23 GMT
Thanks Bryan, I'll start looking through PersistentMapCache.  This
morning I checked back and the snapshot file now has 2.9 million keys in it.

On Tue, Mar 7, 2017 at 4:39 PM, Bryan Bende <bbende@gmail.com> wrote:

> Joe,
>
> I'm not that familiar with the persistence part of the DMCS, although
> I do know that it uses the write-ahead-log that is also used by the
> flow file repo.
>
> The code for PersistentMapCache is here:
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-distributed-cache-services-bundle/nifi-distributed-cache-server/src/main/java/org/apache/nifi/distributed/cache/server/map/PersistentMapCache.java
>
> It looks like the WAL is check-pointed during puts here:
>
> final long modCount = modifications.getAndIncrement();
> if ( modCount > 0 && modCount % 100000 == 0 ) {
>     wali.checkpoint();
> }
>
> And during deletes here:
>
> final long modCount = modifications.getAndIncrement();
> if (modCount > 0 && modCount % 1000 == 0) {
>     wali.checkpoint();
> }
>
> Not sure if it was intentional that put operations checkpoint every
> 100k modifications and deletes checkpoint every 1k.
>
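
To make the two cadences concrete, here is a minimal sketch that models only the two conditions quoted above. The class CheckpointCadence and its helper methods are hypothetical, it assumes puts and deletes share one modification counter (as the identical `modifications.getAndIncrement()` calls suggest), and it just counts where `wali.checkpoint()` would be called rather than touching a real write-ahead log.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of the two quoted checkpoint conditions; not NiFi code. */
final class CheckpointCadence {
    // Assumption: puts and deletes increment the same counter.
    private final AtomicLong modifications = new AtomicLong(0);
    private long checkpoints = 0;

    void put() {
        final long modCount = modifications.getAndIncrement();
        if (modCount > 0 && modCount % 100000 == 0) {
            checkpoints++; // stands in for wali.checkpoint()
        }
    }

    void remove() {
        final long modCount = modifications.getAndIncrement();
        if (modCount > 0 && modCount % 1000 == 0) {
            checkpoints++; // stands in for wali.checkpoint()
        }
    }

    long checkpoints() {
        return checkpoints;
    }

    /** Checkpoints triggered by n puts on a fresh counter. */
    static long checkpointsAfterPuts(long n) {
        CheckpointCadence c = new CheckpointCadence();
        for (long i = 0; i < n; i++) {
            c.put();
        }
        return c.checkpoints();
    }

    /** Checkpoints triggered by n deletes on a fresh counter. */
    static long checkpointsAfterRemoves(long n) {
        CheckpointCadence c = new CheckpointCadence();
        for (long i = 0; i < n; i++) {
            c.remove();
        }
        return c.checkpoints();
    }

    public static void main(String[] args) {
        System.out.println("checkpoints after 250,000 puts:    " + checkpointsAfterPuts(250_000));
        System.out.println("checkpoints after 250,000 deletes: " + checkpointsAfterRemoves(250_000));
    }
}
```

Under these assumptions a put-only workload checkpoints 100 times less often than a delete-only one, so many more updates accumulate between checkpoints. Whether that explains the snapshot growth is still an open question.
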
> Maybe Mark or others could shed some light on why the snapshot is
> reaching 3GB in size.
>
> -Bryan
>
>
> On Tue, Mar 7, 2017 at 7:07 AM, Joe Gresock <jgresock@gmail.com> wrote:
> > Hi folks,
> >
> > Is there a technical description of how the DistributedMapCacheServer
> > (DMCS) persistence works?  I've noticed the following on our cluster:
> >
> > - I have the DMCS configured on port 4557 as FIFO with max 100,000 entries,
> > and have specified a persistence directory
> > - I am using DetectDuplicate with the DMCS, and the individual key length
> > is 80 bytes, with a Description length of 1 byte.  By my count, this should
> > result in a pure data size of 7.7MB.
> > - I notice that the snapshot file in the persistence directory appears to
> > continue growing past the 100,000 limit, though this may be expected
> > depending on the implementation.  Since I know that the key will contain
> > "json" in it, I can run the following command to count the number of
> > possible keys in the snapshot file (though I'm not sure if this is a good
> > way of measuring how many keys are actually cached): grep -oa json snapshot
> > | wc -l
> > - When the snapshot file reaches around 3GB, the DMCS has a hard time
> > staying up, and frequently becomes unreachable (netstat -tulpn | grep 4557
> > shows nothing).  At this point, in order to restore functionality I delete
> > the persistence directory and let it start over.
> >
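
For reference, the 7.7MB figure above is just entries times (key bytes + description bytes), expressed in binary megabytes. A minimal sketch of that arithmetic follows. The class and method names are hypothetical, and it deliberately ignores any per-record overhead in the snapshot format, which I don't know.

```java
/** Hypothetical helper for the raw-payload arithmetic; ignores on-disk overhead. */
final class CacheSizeEstimate {
    /** Raw payload bytes for a number of entries with fixed key and value sizes. */
    static long rawBytes(long entries, int keyBytes, int valueBytes) {
        return entries * (long) (keyBytes + valueBytes);
    }

    /** Converts bytes to binary megabytes (1 MiB = 1024 * 1024 bytes). */
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Configured limit: 100,000 entries of 80 + 1 bytes each.
        long atLimit = rawBytes(100_000, 80, 1);
        // Count reported by the grep command this morning: 2.9 million.
        long observed = rawBytes(2_900_000, 80, 1);
        System.out.printf("at limit: %,d bytes (%.1f MiB)%n", atLimit, toMiB(atLimit));
        System.out.printf("observed: %,d bytes (%.1f MiB)%n", observed, toMiB(observed));
    }
}
```

Even 2.9 million entries at 81 bytes each come to well under 1GB of raw payload, so key and description bytes alone would not account for a 3GB snapshot.
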
> > So my main questions are:
> > - How are the snapshot and partition files structured, and how can I
> > estimate how many keys are actually cached at a given time?
> > - Is the described behavior indicative of the cache exceeding the
> > configured max number of keys?
> >
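
On the first question, here is a rough Java equivalent of the grep pipeline above, in case it is handy on a box without GNU grep. It is only a sketch: PatternCounter is a hypothetical name, and it counts non-overlapping occurrences of a byte pattern, not live cache entries. If the file also holds superseded or deleted records, the count will overstate what is actually cached.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical stand-in for: grep -oa PATTERN FILE | wc -l */
final class PatternCounter {
    /** Builds the Knuth-Morris-Pratt failure table for the pattern. */
    private static int[] failureTable(byte[] p) {
        int[] f = new int[p.length];
        int k = 0;
        for (int i = 1; i < p.length; i++) {
            while (k > 0 && p[i] != p[k]) {
                k = f[k - 1];
            }
            if (p[i] == p[k]) {
                k++;
            }
            f[i] = k;
        }
        return f;
    }

    /** Streams the input once and counts non-overlapping matches of the pattern. */
    static long count(InputStream in, byte[] pattern) {
        if (pattern.length == 0) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        int[] f = failureTable(pattern);
        long matches = 0;
        int k = 0; // number of pattern bytes currently matched
        try {
            int b;
            while ((b = in.read()) != -1) {
                byte cur = (byte) b;
                while (k > 0 && cur != pattern[k]) {
                    k = f[k - 1];
                }
                if (cur == pattern[k]) {
                    k++;
                }
                if (k == pattern.length) {
                    matches++;
                    k = 0; // restart after a match so matches do not overlap
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PatternCounter <file> <pattern>, e.g. snapshot json
        byte[] pattern = args[1].getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new BufferedInputStream(Files.newInputStream(Paths.get(args[0])))) {
            System.out.println(count(in, pattern));
        }
    }
}
```

It reads the file as a stream, so a multi-gigabyte snapshot does not need to fit in memory.
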
> > Thanks,
> > Joe
> >
> > --
> > I know what it is to be in need, and I know what it is to have plenty.  I
> > have learned the secret of being content in any and every situation,
> > whether well fed or hungry, whether living in plenty or in want.  I can do
> > all this through him who gives me strength.    *-Philippians 4:12-13*
>



-- 
I know what it is to be in need, and I know what it is to have plenty.  I
have learned the secret of being content in any and every situation,
whether well fed or hungry, whether living in plenty or in want.  I can do
all this through him who gives me strength.    *-Philippians 4:12-13*
