directory-users mailing list archives

From Kiran Ayyagari <kayyag...@apache.org>
Subject Re: Multi-Master Replication issues - Memory and out of sync
Date Sat, 15 Aug 2015 00:48:36 GMT
On Sat, Aug 15, 2015 at 2:19 AM, Ezsra McDonald <ezsra.mcdonald@gmail.com>
wrote:

> It looks like one instance is out of sync. How do I get it back in sync?
>
You can force a re-sync by removing the partition data *and* the
contextEntryCSN attribute from the partition root, but that will be
slower than the method below.

> I was going to shut down the bad node and one of the good nodes. Then copy
> the partition from the good node. I noticed there is a syncrepl-data folder
> that has journals in it. Do those need to be copied as well?
>
No, exclude the syncrepl-data folder.
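
A minimal sketch of that copy step, assuming both servers are stopped first. The function name and the example paths are hypothetical; the only thing taken from this thread is that the syncrepl-data folder must be left out.

```python
import shutil
from pathlib import Path


def copy_partition(src: Path, dst: Path, excluded=("syncrepl-data",)) -> None:
    """Copy the directory tree at src to dst, skipping excluded names.

    dst must not exist yet; shutil.copytree creates it.
    Any file or directory whose name matches an entry in `excluded`
    is skipped at every level of the tree.
    """
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*excluded))


# Hypothetical usage (paths are assumptions, adjust to your layout):
# copy_partition(Path("/mnt/good-node/partitions"),
#                Path("/var/lib/apacheds/default/partitions"))
```

Move the old partition directory on the bad node out of the way before copying, since the destination must not already exist.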

Any success in getting a memory dump from the node that hit the OOM error,
before restarting it?
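
If it helps, here is a rough sketch of taking the dump with the JDK's jmap tool from a small script. The helper names, the PID and the output path are assumptions for illustration; jmap must come from the same JDK that runs the server, and it should be run as the same OS user.

```python
import subprocess
from typing import List


def heap_dump_command(pid: int, out_file: str) -> List[str]:
    """Build the jmap command line that writes a binary heap dump."""
    return ["jmap", f"-dump:format=b,file={out_file}", str(pid)]


def take_heap_dump(pid: int, out_file: str) -> None:
    """Run jmap against the given JVM process; raises if jmap fails."""
    subprocess.run(heap_dump_command(pid, out_file), check=True)


# Hypothetical usage (PID and path are assumptions):
# take_heap_dump(12345, "/tmp/apacheds-heap.hprof")
```

Another option is to start the JVM with -XX:+HeapDumpOnOutOfMemoryError so that a dump is written automatically the next time the error occurs.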

>
> On Thu, Aug 13, 2015 at 9:41 PM, Kiran Ayyagari <kayyagari@apache.org>
> wrote:
>
> > On Thu, Aug 13, 2015 at 11:11 PM, Ezsra McDonald <ezsra.mcdonald@gmail.com>
> > wrote:
> >
> > > First, how much memory should a Multi-Master node require? The Master pool
> > > is made up of four nodes. I currently have -Xms1024m and -Xmx2048m. I seem
> > > to be running out of memory:
> > >
> > This should be enough; not many entries should be held in memory at once.
> >
> > >          Exception in thread "pool-2-thread-14" java.lang.OutOfMemoryError: GC overhead limit exceeded
> > >
> > Can you please take* a memory dump of the server process and attach it
> > to a JIRA ticket?
> >
> > * please follow this doc if needed
> > http://blogs.atlassian.com/2013/03/so-you-want-your-jvms-heap/
> >
> > >
> > > I have more than 330k entries in my LDAP partition.
> > >
> > >
> > > Next, I collected the contextCsn values over a few seconds. I used iTerm to
> > > execute the commands on all nodes simultaneously. I am confused by what I
> > > am seeing. Do these values make any sense?
> > >
> > Yes, all the nodes appear to be in sync based on the given values.
> >
> > >
> > > Sample 1
> > >   NODE 1A: 20150813111645.934000Z#000000#001#000000
> > >   NODE 2A: 20150813130350.592000Z#000000#001#000000
> > >   NODE 1B: 20150813130350.592000Z#000000#001#000000
> > >   NODE 2B: 20150813111652.523000Z#000000#001#000000
> > >
> > Which partition does the above value belong to? I assume this is not a
> > replicated partition.
> >
> > > Sample 2
> > >   NODE 1A: 20150813111645.934000Z#000000#001#000000
> > >   NODE 2A: 20150813111645.934000Z#000000#001#000000
> > >   NODE 1B: 20150813130350.592000Z#000000#001#000000
> > >   NODE 2B: 20150813111645.934000Z#000000#001#000000
> > >
> > > Sample 3
> > >   NODE 1A: 20150813111645.934000Z#000000#001#000000
> > >   NODE 2A: 20150813111645.934000Z#000000#001#000000
> > >   NODE 1B: 20150813130350.592000Z#000000#001#000000
> > >   NODE 2B: 20150813111645.934000Z#000000#001#000000
> > >
> > > Sample 4
> > >   NODE 1A: 20150813142625.893000Z#000000#001#000000
> > >   NODE 2A: 20150813111645.934000Z#000000#001#000000
> > >   NODE 1B: 20150813130350.920000Z#000000#001#000000
> > >   NODE 2B: 20150813111645.934000Z#000000#001#000000
> > >
> > > Sample 5
> > >   NODE 1A: 20150813130356.420000Z#000000#001#000000
> > >   NODE 2A: 20150813130350.592000Z#000000#001#000000
> > >   NODE 1B: 20150813111645.934000Z#000000#001#000000
> > >   NODE 2B: 20150813111645.934000Z#000000#001#000000
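
For comparing values like these, a small script can be easier than reading them by eye. The sketch below assumes the CSN layout is timestamp#changeCount#replicaId#operationNumber with the last three fields in hexadecimal; the function and field names are my own, not taken from the server code.

```python
from datetime import datetime, timezone
from typing import Dict, NamedTuple


class Csn(NamedTuple):
    timestamp: datetime
    change_count: int
    replica_id: int
    operation_number: int


def parse_csn(text: str) -> Csn:
    """Split a CSN string into its four '#'-separated fields."""
    ts, change_count, replica_id, op_number = text.strip().split("#")
    timestamp = datetime.strptime(ts, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    # Assumption: the numeric fields are hexadecimal.
    return Csn(timestamp, int(change_count, 16), int(replica_id, 16), int(op_number, 16))


def lagging_nodes(samples: Dict[str, str]) -> Dict[str, float]:
    """Return, per node, how many seconds its CSN trails the newest one.

    Nodes that already hold the newest CSN are left out of the result.
    """
    parsed = {node: parse_csn(value) for node, value in samples.items()}
    newest = max(parsed.values())
    return {
        node: (newest.timestamp - csn.timestamp).total_seconds()
        for node, csn in parsed.items()
        if csn != newest
    }
```

A non-zero difference in a single sample does not by itself mean a node is out of sync, since a change may still be propagating; it is the difference that persists across several samples that matters. In a multi-master setup it would also make sense to compare values per replica ID rather than as one flat set.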
> > >
> >
> >
> >
> > --
> > Kiran Ayyagari
> > http://keydap.com
> >
>



-- 
Kiran Ayyagari
http://keydap.com
