incubator-esme-dev mailing list archives

From Markus Kohler <markus.koh...@gmail.com>
Subject Re: Memory usage analysis
Date Mon, 23 Nov 2009 20:15:07 GMT
Hi Vassil,
Taking this to the ESME mailing list.
See my comments below...

Markus

"The best way to predict the future is to invent it" -- Alan Kay


On Sun, Nov 22, 2009 at 2:53 PM, Vassil Dichev <vdichev@apache.org> wrote:

> @Markus, thanks for the high-quality analysis, I'm very interested in
> learning more about memory usage. I'm especially interested in seeing
> what typically wastes a lot of memory in a Scala application compared
> to a Java application, since Scala uses quite a few more objects,
> which come from its collections and anonymous function blocks.
>
>
We could schedule a short introduction. The basic Memory Analyzer
functionality is pretty easy to pick up. In principle we could do it as
early as today, or on Wednesday evening. The only technical problem is that
we would need some kind of conferencing solution.



> @Dick: Regarding textile, I actually intended to have a more
> lightweight formatting in ESME along the lines of bold and italics. A
> major factor for adding lift-textile was that it was so easy: just
> one short line of code. Later on I found out that some of the features
> of Textile don't even fit with ESME very well (headings? tables?) and
> some caused conflicts with ESME's own parser (links to images, new
> paragraphs). So I'm all for implementing some really light formatting
> in ESME's own message parser.
>

Agreed. Any preferences regarding the syntax? I like Markdown
(http://daringfireball.net/projects/markdown/), but any other syntax would
probably do as well, because we only need basic formatting to begin with.



>
> @David Regarding your note about inefficient parsing: is this because
> of Scala's combinator library? What would you recommend for fast
> parsing? Waiting for Scala 2.8 packrat parsers? Or writing your own
> lexer with Antlr? Or using your own hand-rolled lexer? I'm interested
> not only in casual lightweight parsing, but also in something like
> parsing of scala syntax.
>
>
Those of you who have access to NW CE 7.1 should find the executable
"jvmprofiler" in the SAP JVM directory of your CE installation.
I may have to redo the profiling because I probably used a newer profiler
version, but otherwise you could read my stored profiling results with the
jvmprofiler from CE.
In the meantime I will export some of the results to HTML.


Markus

> Vassil
>
> P.S. I'm OK with sharing my email with the mailing list, should we
> continue the conversation there?
>
>
> On Sun, Nov 22, 2009 at 8:48 AM, Richard Hirsch <hirsch.dick@gmail.com>
> wrote:
> > @Markus: Awesome work. It is great to find out stuff like this now
> > rather than when ESME is in productive use somewhere and we have
> > performance problems.
> >
> > Like I've said before this is probably of interest to the lift
> > community (and perhaps the scala community as well).  I'd like to move
> > this stuff to the wiki and reference it on the esme-dev list.
> >
> > I'll do this on Monday unless anyone has any problems with making this
> > public.
> >
> > Re textile: If this is such a memory hog, we should make its usage
> > based on a property. Then users can decide whether they want to use it
> > or not.
> >
> > D.
> >
> > On Sun, Nov 22, 2009 at 3:33 AM, David Pollak
> > <feeder.of.the.bears@gmail.com> wrote:
> >> Good stuff.
> >> I wonder if we should attempt to intern some strings.
> >> We might also decide to move from List to Array for certain data
> >> structures (this will reduce the cons [ :: ] objects).
> >> I'm all for this stuff on the public esme dev list if you're okay with
> >> it.
> >> On Sat, Nov 21, 2009 at 3:09 PM, Markus Kohler <markus.kohler@gmail.com>
> >> wrote:
> >>>
> >>> Hi all,
> >>> I did a few more investigations regarding memory allocation using the
> >>> SAP JVM profiler and memory usage using the Eclipse Memory Analyzer
> >>> (MAT).
> >>> I already mentioned that I found a memory allocation issue with the
> >>> textile formatter, but I just saw another issue when checking the
> >>> memory usage with MAT, which I think might be more severe.
> >>> So let's start with this.
> >>> I reran the test to create the 300+x users using Jetty 7 on my local
> >>> machine. This worked without any problems, and it seems the number of
> >>> threads would not increase to more than 100. I then logged on all
> >>> users and afterwards triggered a heap dump.
> >>> Overall memory usage was relatively low, at about 42 MB.
> >>> I applied my favorite trick
> >>> (http://kohlerm.blogspot.com/2008/05/analyzing-memory-consumption-of-eclipse.html),
> >>> searching for duplicates of Strings.
> >>> See the attachment duplicatedStrings.zip for the result.
> >>> How to interpret this report?
> >>> Let's check the first line:
> >>> String Value  | Objects | Shallow Heap | Retained Heap
> >>> ------------------------------------------------------
> >>> Europe/Berlin |  11.729 |      281.496 |    >= 750.656
> >>> ------------------------------------------------------
> >>>
> >>> This means that "Europe/Berlin" appears 11729 times in memory at the
> >>> point in time the heap dump was taken. If those Strings could be
> >>> removed, 750.656 or more bytes could be reclaimed. These copies seem
> >>> to come from net.liftweb.http.RenderOut, btw.
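
As a rough cross-check of that row, dividing the two heap columns by the
object count gives the cost per duplicate. The sketch below only redoes that
arithmetic in Java. The class and method names are made up for illustration,
and the figures are copied from the row above.

```java
// Per-object sizes derived from the "Europe/Berlin" row of the duplicate-String report.
final class DuplicateRowMath {
    // Figures copied from the report row (the report prints "." as the thousands separator).
    static final long OBJECTS = 11_729L;
    static final long SHALLOW_HEAP_BYTES = 281_496L;
    static final long RETAINED_HEAP_BYTES = 750_656L; // reported as a lower bound (">=")

    // Shallow heap per String object, in bytes.
    static long shallowPerObject() {
        return SHALLOW_HEAP_BYTES / OBJECTS;
    }

    // Retained heap per String object, in bytes (lower bound).
    static long retainedPerObject() {
        return RETAINED_HEAP_BYTES / OBJECTS;
    }

    public static void main(String[] args) {
        System.out.println("shallow bytes per String:  " + shallowPerObject());
        System.out.println("retained bytes per String: " + retainedPerObject());
        System.out.println("retained beyond the String itself: "
                + (retainedPerObject() - shallowPerObject()));
    }
}
```

Both divisions come out even: 24 bytes of shallow heap and at least 64 bytes
of retained heap per copy. The 40-byte difference would be consistent with
each String owning its own character array, though the exact layout depends
on the JVM and was not checked here.
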
> >>> For shallow versus retained size/heap, check
> >>> http://kohlerm.blogspot.com/2009/02/how-to-really-measure-memory-usage-of.html
> >>> If you look further down in the table, you can see that each message
> >>> seems to be there 323 times, which is equal to the number of users.
> >>> Note that I manually created only 2 messages, "Test123" and "Test1234";
> >>> the others are just status updates.
> >>> This means that n users, potentially producing n times as many
> >>> messages as a single user, will need not just n times more memory
> >>> but n times n, i.e. O(n^2).
> >>> This will kill us as the number of users goes up. I understand that in
> >>> Scala a lot of objects can be immutable, but I still think it doesn't
> >>> make sense to hold copies of those Strings.
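
One way to avoid holding those copies is to route equal Strings through a
shared pool, so that every timeline references the same instance. This is a
minimal Java sketch of the idea, not ESME or Lift code: the SharedStrings
class, its pool and the timelines helper are hypothetical, and the 323 users
and two message texts are simply the figures from this heap dump. The JDK's
String.intern() would be the built-in alternative to the hand-rolled pool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SharedStrings {
    // Hypothetical canonicalising pool: keeps one shared instance per distinct value.
    private static final ConcurrentHashMap<String, String> POOL = new ConcurrentHashMap<>();

    // Returns the pooled instance equal to s, adding s to the pool if it is new.
    static String canonical(String s) {
        String existing = POOL.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    // Builds one timeline per user; every timeline holds every message text.
    static List<List<String>> timelines(int users, List<String> messages, boolean share) {
        List<List<String>> all = new ArrayList<>();
        for (int u = 0; u < users; u++) {
            List<String> timeline = new ArrayList<>();
            for (String m : messages) {
                String copy = new String(m); // simulates a freshly built String per user
                timeline.add(share ? canonical(copy) : copy);
            }
            all.add(timeline);
        }
        return all;
    }

    // Counts distinct String objects by identity, which is what a heap dump would see.
    static int distinctObjects(List<List<String>> timelines) {
        Map<String, Boolean> seen = new IdentityHashMap<>();
        for (List<String> timeline : timelines) {
            for (String s : timeline) {
                seen.put(s, Boolean.TRUE);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList("Test123", "Test1234");
        System.out.println("copies:  " + distinctObjects(timelines(323, messages, false)));
        System.out.println("shared:  " + distinctObjects(timelines(323, messages, true)));
    }
}
```

Without sharing, the number of String objects grows with users times
messages; with sharing it stays at the number of distinct texts. Whether a
pool like this is the right fix for ESME depends on where the copies are
actually created, which the sketch does not address.
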
> >>> There are long chains of scala.$colon$colon everywhere, and this class
> >>> seems to cause quite some overhead (linked lists, I guess).
> >>> Check histogram.zip. Its shallow size is already 10% of the heap.
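
To make the linked-list overhead concrete: a singly linked list allocates one
cell object per element, while an array is a single object however many
elements it holds. The Java sketch below uses a hand-written Cons class as a
stand-in for scala.$colon$colon. It is an illustration under that assumption,
not the Scala library's implementation.

```java
final class ConsVsArray {
    // Minimal immutable cons cell: one object per list element (illustrative stand-in).
    static final class Cons {
        final String head;
        final Cons tail; // null marks the end of the list

        Cons(String head, Cons tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    // Builds the list back to front so that element order matches the array.
    static Cons fromArray(String[] items) {
        Cons list = null;
        for (int i = items.length - 1; i >= 0; i--) {
            list = new Cons(items[i], list);
        }
        return list;
    }

    // Number of cell objects the list keeps alive.
    static int cellCount(Cons list) {
        int n = 0;
        for (Cons c = list; c != null; c = c.tail) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String[] items = {"a", "b", "c", "d", "e"};
        System.out.println("container objects as list:  " + cellCount(fromArray(items)));
        System.out.println("container objects as array: 1");
    }
}
```

Each cell carries its own object header plus two references, so the
per-element cost of the list is a multiple of the single reference an array
slot needs.
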
> >>> The dominator tree
> >>> (http://kohlerm.blogspot.com/2009/02/memory-leaks-are-easy-to-find.html)
> >>> grouped by class (dominator_tree.zip) shows that 63% of the memory is
> >>> spent in org.apache.esme.comet.PublicTimeline. I haven't looked into
> >>> the details, but I guess that overhead is caused by those String
> >>> duplicates.
> >>>
> >>> I can make the heap dump available. If someone tells me where I can
> >>> upload a 40 MB file, I will do so.
> >>> There's certainly more to find in this heap dump alone. For example, I
> >>> haven't executed some of my special MAT commands yet.
> >>> If you have questions, feel free to ask. The best time would be Monday
> >>> night (WDF time), because that is ESME time for me.
> >>> I will also document the textile issue then.
> >>> Regards,
> >>> Markus
> >>>
> >>
> >>
> >>
> >> --
> >> Lift, the simply functional web framework http://liftweb.net
> >> Beginning Scala http://www.apress.com/book/view/1430219890
> >> Follow me: http://twitter.com/dpp
> >> Surf the harmonics
> >>
> >
>
