esme-dev mailing list archives

From Markus Kohler <markus.koh...@gmail.com>
Subject Re: Further analysis of the GC issue
Date Thu, 26 Nov 2009 13:22:33 GMT
Hi Michael,
No problem :-)



Regards,
Markus

"The best way to predict the future is to invent it" -- Alan Kay


On Thu, Nov 26, 2009 at 2:12 PM, Bechauf, Michael
<michael.bechauf@sap.com>wrote:

> Thanks Markus. That certainly sounds much better. I was confused
> yesterday already because 23 GByte memory would be a little difficult to
> create when not even the operating system can handle such size. I should
> have asked right away. Blame it on jetlag.
>
> -Michael
>
> -----Original Message-----
> From: Markus Kohler [mailto:markus.kohler@gmail.com]
> Sent: Thursday, Nov 26, 2009 1:04 AM
> To: esme-dev@incubator.apache.org
> Subject: Re: Further analysis of the GC issue
>
> Hi Michael,
> Good to see you here!
>
> "Memory Analyzer"? that's me ;-)
>
> The 23 Gbyte are not "retained" at one point in time; they are the sum
> of all temporarily allocated objects. Most of that memory (or all of it,
> since there doesn't seem to be an obvious memory leak) is gone within a
> millisecond.
> I'm confident that this value can be decreased to 90Mbyte and can be
> further improved down to a few MByte (or even less). We already know that
> the 90Mbyte are mostly caused by an inefficient textile parser.
>
> I also used the Memory Analyzer to look at how much memory is retained,
> i.e. still in use/referenced after the user interaction has finished.
> The report is here:
> http://cwiki.apache.org/confluence/display/ESME/Performance+test+-+2009-11-22
> Although there's room for improvement, potentially caused by the same bug
> that turned 90Mbyte into 23Gbyte, I don't see any major issues yet with
> regard to memory usage.
>
> This is also related to the stateless versus stateful discussion. ATM
> the amount of state needed for one user is already very low (a few
> hundred kByte), at least compared to what I'm used to with Enterprise
> Applications.
> It is at least an order of magnitude lower, which can only partially be
> explained by ESME being less complex than the typical Enterprise app.
> So far I don't see any major roadblock from the design perspective that
> would stop us from scaling very well.
>
> In my experience, it's quite normal that as soon as someone with a
> little bit of experience in performance takes a closer look at a piece
> of software, a few dramatic improvements can be made. That makes working
> as a performance analysis expert so gratifying. You suggest a few
> improvements, which have a dramatic impact, and then you walk away
> before it gets too complicated ;-)
> No, that's not my intention here :-)
>
>
> Markus
>
> "The best way to predict the future is to invent it" -- Alan Kay
>
>
> On Thu, Nov 26, 2009 at 6:04 AM, Bechauf, Michael
> <michael.bechauf@sap.com>wrote:
>
> > David,
> >
> > well, "dead wrong" is a strong expression; hopefully I'm still
> > breathing. I don't want to judge without having looked at the code
> > myself, but I have no idea how a massive multi-user system could
> > possibly be designed with state where per-user information is kept in
> > memory for a certain time. I mean, 23 GB allocated - that's tough for
> > an SAP transaction server that is not multithreaded and where the
> > memory management is highly optimized based on shared memory that the
> > work processes can attach to, or rolled out to a file if unused for a
> > while. It is, however, deadly for a VM that was never designed for such
> > memory consumption and where a GC run can halt the server.
> >
> > Anyway, I'll study this a bit more, particularly the Scala
> > architecture. I heard many good things about Scala, but in the end it's
> > all translated to things a VM can understand, and I hope Scala does a
> > good enough job managing this load in a transparent way.
> >
> > -Michael
> >
> >
> > ----- Original Message -----
> > From: David Pollak <feeder.of.the.bears@gmail.com>
> > To: esme-dev@incubator.apache.org <esme-dev@incubator.apache.org>
> > Sent: Wed Nov 25 23:00:20 2009
> > Subject: Re: Further analysis of the GC issue
> >
> > On Wed, Nov 25, 2009 at 7:16 PM, Bechauf, Michael
> > <michael.bechauf@sap.com>wrote:
> >
> > > Wasn't this exactly the kind of stuff that the Eclipse Memory Analyzer -
> > > donated by SAP - was supposed to fix? A heap of that size for a still
> > > moderate number of 300 users is crazy, so either there is stuff like
> > > circular references that hog memory, or the design model is
> > > fundamentally flawed. I don't understand why ESME needs "sessions". How
> > > can a scalable server be created if each user will allocate memory
> > > until some timeout? In a world of stateless browser-based UIs that's
> > > not going to work.
> > >
> >
> > You're actually dead wrong about this.  "Stateless" is not... it's just
> > pushing state and cache someplace else (the RDBMS, memcached, etc.).
> > "Stateless" will lead to radical performance problems.  "Stateless"
> > merely moves the caching decisions into code you don't control.  I dealt
> > with this issue first-hand while helping a popular micro-blogging site
> > migrate from a "stateless" to a Scala-based backend.  I'm dealing with
> > this issue first-hand helping another popular site that's experiencing
> > exponential growth migrate away from "push everything back to the RDBMS
> > and hope for the best."
> >
> > My original design for ESME is stateful.  My original design for ESME
> > is based on lessons learned in this very space and was oriented to have
> > things intelligently cached so that the caching is not based on RDBMS
> > indexes.  I'm not sure what happened to cause the particular issues,
> > but it seems like folks are loading messages from the RDBMS rather than
> > asking the message cache for them.
> >
> >
> > >
> > > Time for me to look at that code ...
> > >
> > > -Michael
> > >
> > >
> > > ----- Original Message -----
> > > From: Markus Kohler <markus.kohler@gmail.com>
> > > To: esme-dev@incubator.apache.org <esme-dev@incubator.apache.org>
> > > Sent: Wed Nov 25 12:14:58 2009
> > > Subject: Further analysis of the GC issue
> > >
> > > Hi all,
> > > the Garbage Collector issue I was talking about is reproducible.
> > > I've uploaded an annotated GC graph to
> > >
> > > > http://picasaweb.google.com/lh/photo/wB-RRtb0wIVfpxJkTJPNuw?authkey=Gv1sRgCOve7LThpfvXsQE&feat=directlink
> > >
> > > > I think the "LOGON" phase where I log on all the 300 users looks ok
> > > > (given that probably textile formatting is involved) but the phase
> > > > where just one user sends one message is certainly not looking good.
> > >
> > > > I took the profiler and the result is a bit shocking. For that one
> > > > message, 881,000,000 objects weighing 23.2 Gbyte were allocated (and
> > > > reclaimed afterwards). My former record was 2Gbyte ;-)
> > >
> > > > Fortunately I have a theory about what happens, without having looked
> > > > into the code yet, so take it with a grain of salt. It seems that the
> > > > public timeline for all users is re-rendered, because 99% of the
> > > > allocations come from org.apache.esme.comet.PublicTimeline.render().
> > > > I guess all the actors for all the users are sitting there, not
> > > > knowing that the user has closed the browser, because the user
> > > > session has not yet expired.
> > >
> > > > I wonder how we get around this with a real "push" model. If the
> > > > browser asked for updates, this rendering could be done lazily. Or
> > > > can we "ping" the browser and check whether it responds?
> > > > On the other hand, it should also not be necessary to re-render the
> > > > message again and again, because the result will be the same.
> > >
> > > > I will send Richard some attachments. Not sure whether you will need
> > > > them, they look very similar to the ones we already have.
> > >
> > > > BTW, we should definitely check the use of
> > > > scala.xml.XML$.loadString(java.lang.String).
> > > > It's creating a new Parser each time, which is a bit costly because
> > > > it allocates a new Buffer each time and also hits the disk when
> > > > searching for the name of the Java class.
> > >
> > > Greetings,
> > > Markus
> > >
> > >
> > >
> > > "The best way to predict the future is to invent it" -- Alan Kay
> > >
> >
> >
> >
> > --
> > Lift, the simply functional web framework http://liftweb.net
> > Beginning Scala http://www.apress.com/book/view/1430219890
> > Follow me: http://twitter.com/dpp
> > Surf the harmonics
> >
>
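
The original message at the bottom of the thread notes that re-rendering
the same message again and again should not be necessary, because the
result is always the same. A minimal sketch of that idea in Java, assuming
rendered output can be keyed by a message id. RenderCache, renderCalls and
the renderer function are hypothetical names for illustration only; this
is not ESME or Lift code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch: render each message once and reuse the result. */
final class RenderCache {
    // Rendered output keyed by message id (assumed to be stable and unique).
    private final Map<Long, String> rendered = new ConcurrentHashMap<>();
    // Stand-in for an expensive renderer such as a textile formatter.
    private final Function<String, String> renderer;
    // Counts real render invocations so the saving is observable.
    private final AtomicInteger renderCalls = new AtomicInteger();

    RenderCache(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    /** Returns the cached rendering, invoking the renderer only on a miss. */
    String render(long messageId, String source) {
        return rendered.computeIfAbsent(messageId, id -> {
            renderCalls.incrementAndGet();
            return renderer.apply(source);
        });
    }

    int renderCalls() {
        return renderCalls.get();
    }
}
```

With a cache like this, delivering one message to 300 timelines would
invoke the renderer once instead of 300 times. A real cache would also
need an eviction policy, which this sketch leaves out.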

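The original message also flags the cost of creating a new parser on every
scala.xml.XML.loadString call. A minimal sketch of the usual remedy, shown
here with the plain JAXP API in Java rather than with scala.xml: keep one
parser per thread and reuse it. It assumes the cost described comes from
the JAXP factory lookup and parser construction. ReusableXmlParser and its
methods are hypothetical names, not part of ESME, Lift or the Scala
library.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch: one DocumentBuilder per thread instead of one per call. */
final class ReusableXmlParser {
    // DocumentBuilder is not thread-safe, so each thread keeps its own instance.
    // The factory lookup and builder construction run once per thread.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
        ThreadLocal.withInitial(() -> {
            try {
                return DocumentBuilderFactory.newInstance().newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                throw new IllegalStateException("no XML parser available", e);
            }
        });

    private ReusableXmlParser() {}

    /** Returns the builder owned by the calling thread. */
    static DocumentBuilder builderForCurrentThread() {
        return BUILDER.get();
    }

    /** Parses a string with the calling thread's builder. */
    static Document parse(String xml) {
        DocumentBuilder builder = builderForCurrentThread();
        builder.reset(); // return the builder to its initial configuration
        try {
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }
}
```

Whether the same saving is available through scala.xml depends on how the
loader obtains its parser, so this is only an illustration of the pattern.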