esme-dev mailing list archives

From Markus Kohler <markus.koh...@gmail.com>
Subject Re: We are down to 4.6 Mbyte
Date Mon, 30 Nov 2009 09:40:51 GMT
Hi Vassil,

Thanks a lot for sharing your analysis!
Bugs happen, and especially in the area of memory allocation I have seen
bugs stay undetected for quite some time in the past. It seems to me that
this is caused by today's Java implementations being freaking fast at
allocating memory (around 10 CPU cycles per allocation) and often very
fast at reclaiming it, thanks to generational garbage collectors.
The SAP JVM has APIs for cheaply getting memory allocation information
from within a JUnit test; guess who asked for that feature? ;)
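
For JVMs without such an API, a crude stand-in is to compare used heap
around the code under test. This is only a sketch: it measures retained
heap growth after a forced GC, not true allocation, and the buildMessages
call in the last comment is made up:

    import java.lang.management.ManagementFactory

    // Rough substitute for a dedicated allocation API: force a GC and read
    // the used heap before and after a block. This only bounds retained
    // heap growth, not total allocation, so treat it as a coarse check.
    def usedHeap(): Long = {
      System.gc()
      Thread.sleep(100)  // give the collector a moment to settle
      ManagementFactory.getMemoryMXBean.getHeapMemoryUsage.getUsed
    }

    def retainedHeapDelta(block: => Unit): Long = {
      val before = usedHeap()
      block
      usedHeap() - before
    }

    // e.g. in a test:
    // assert(retainedHeapDelta { buildMessages(1000) } < 1024 * 1024)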

Regards,
Markus
"The best way to predict the future is to invent it" -- Alan Kay


On Sun, Nov 29, 2009 at 2:03 PM, Vassil Dichev <vdichev@apache.org> wrote:

> > I haven't looked in detail at last night's profiling results, but it
> > seems we are down to 4.6 Mbyte! That's a 5000x improvement!
> > I plan to document the details on Monday night. If there's time left I
> > will also start drafting a longer blog post. It would be great if Vassil
> > would provide a short description/explanation of his changes.
>
> OK, this is going to be embarrassing for me, but this is not actually
> an improvement, but a return to the performance ESME was capable of
> several months ago.
>
> I'm not surprised that it was 5000x worse, because every time the
> public or friends' timeline was displayed for any user, every message
> was fetched from the database, converted to XML, and transformed into
> XHTML and JSON... Not only that, but every time a new message was
> received, it would force the timelines of all users receiving the
> message to be re-rendered, which meant reloading from the DB again and
> the same XML acrobatics for all 20 messages of the 2 timelines, i.e.
> 40 messages processed for each user.
>
> To top it off, when the Textile parser was activated, its overhead was
> multiplied 40 times per user, which for 300 users meant 12000 messages
> re-rendered, just because one user decided to send a message! Yes, this
> sounds horrible. David was indeed correct that the Textile parser
> itself was not the main culprit; it was just magnifying the effects of
> a more serious bug.
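>
> To make the multiplication concrete, here is a back-of-envelope sketch
> (the numbers are the ones above; the value names are made up):
>
>     val messagesPerTimeline = 20   // messages shown per timeline
>     val timelinesPerUser    = 2    // public + friends' timeline
>     val activeUsers         = 300
>
>     val rendersPerUser = messagesPerTimeline * timelinesPerUser  // 40
>     val rendersPerPost = rendersPerUser * activeUsers            // 12000
>     println("one new message triggers " + rendersPerPost + " renders")
>
> Each of those renders paid the DB round trip, the XML conversion and,
> when enabled, the Textile parsing.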
>
> The problem was the Message.findMessages method. It is supposed to
> cache messages based on an LRU strategy. When I introduced access
> pools, access to messages had to be controlled not only when loading
> them from the DB, but also when serving them from the cache. So I
> discarded such messages from the temporary structure that was to be
> returned to the user. The discarded messages would then go through the
> finder method, where the constructed query would make sure only
> messages from valid pools were returned (inefficiency one).
> Furthermore, I introduced a bug where messages from the public pool
> would also always be discarded from the cache and fetched from the DB
> (inefficiency two). So everything still worked, but the cache wasn't
> used in practice.
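>
> To illustrate the shape of the bug, here is a much-simplified sketch
> (hypothetical names and types, not the actual findMessages code):
>
>     import scala.collection.mutable
>
>     case class Message(id: Long, pool: Option[Long], text: String)
>
>     val cache = mutable.Map[Long, Message]()  // stands in for the LRU cache
>
>     // stand-in for the real query, which filters by valid pools in SQL
>     def loadFromDb(ids: List[Long], allowedPools: Set[Long]): List[Message] =
>       Nil
>
>     // broken: every cached hit is discarded so the pool check (including,
>     // wrongly, for public-pool messages) is redone by the SQL query, so
>     // the cache is effectively never used
>     def findMessagesBroken(ids: List[Long], allowedPools: Set[Long]) = {
>       val cached  = ids.flatMap(id => cache.get(id).toList)
>       val kept    = List.empty[Message]   // cached results thrown away
>       val missing = ids                   // everything is re-fetched
>       kept ++ loadFromDb(missing, allowedPools)
>     }
>
>     // fixed: apply the pool filter to the cached copies and only hit the
>     // DB for genuine cache misses
>     def findMessagesFixed(ids: List[Long], allowedPools: Set[Long]) = {
>       val cached  = ids.flatMap(id => cache.get(id).toList)
>       val visible = cached.filter(m =>
>         m.pool.map(p => allowedPools.contains(p)).getOrElse(true))
>       val missing = ids.filterNot(id => cached.exists(_.id == id))
>       val fromDb  = loadFromDb(missing, allowedPools)
>       fromDb.foreach(m => cache.update(m.id, m))
>       visible ++ fromDb
>     }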
>
> In conclusion, this is just one more argument for keeping messages in
> memory, instead of fetching them from the DB.
>
> Another important conclusion is that performance tests are just as
> important as unit and integration tests and can uncover functional
> problems too, especially with caches.
>
