esme-dev mailing list archives

From Richard Hirsch
Subject Re: Performance update: Message size in memory
Date Tue, 08 Dec 2009 10:30:38 GMT
org.tartarus.snowball.ext.PorterStemmer comes from the Compass search.
Maybe we can configure it so that it is not retained after use.
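
As a rough illustration of "not retained after usage", here is a minimal
sketch in Java. It assumes the stemmer can be used as a local variable while
the message is built, so that only the resulting stems are stored. Stemmer,
ToyStemmer and StoredMessage are hypothetical names made up for this sketch;
they are neither the ESME classes nor the real Snowball/Compass API, and
whether Compass can actually be configured this way is still open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stemmer such as
// org.tartarus.snowball.ext.PorterStemmer; not the real Snowball API.
interface Stemmer {
    String stem(String word);
}

// Toy implementation, only so the sketch runs without Compass on the classpath.
final class ToyStemmer implements Stemmer {
    @Override
    public String stem(String word) {
        String w = word.toLowerCase();
        if (w.endsWith("ing") && w.length() > 5) return w.substring(0, w.length() - 3);
        if (w.endsWith("s") && w.length() > 3) return w.substring(0, w.length() - 1);
        return w;
    }
}

// Hypothetical message class: keeps the text and the stems, but has no stemmer field.
final class StoredMessage {
    private final String text;
    private final List<String> stems;

    private StoredMessage(String text, List<String> stems) {
        this.text = text;
        this.stems = stems;
    }

    static StoredMessage create(String text) {
        // The stemmer is only a local variable, so nothing references it
        // once this method returns and it can be garbage collected.
        Stemmer stemmer = new ToyStemmer();
        List<String> stems = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) stems.add(stemmer.stem(word));
        }
        return new StoredMessage(text, List.copyOf(stems));
    }

    String text() { return text; }
    List<String> stems() { return stems; }
}

public class StemmerSketch {
    public static void main(String[] args) {
        StoredMessage m = StoredMessage.create("Sending messages");
        System.out.println(m.stems()); // prints [send, message]
    }
}
```

The point is only that the message object holds plain strings; the 2 Kbyte
stemmer instance would then no longer show up in each message's retained size.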


On Tue, Dec 8, 2009 at 1:03 AM, Markus Kohler wrote:
> Hi all,
> I've been busy with other things, and therefore didn't find much time for
> ESME last week.
> I tried a few things with regard to performance.
> As you all noticed, the performance on the performance instance is currently
> excellent.
> I tried various approaches to measure it, but most failed due to the comet
> requests, which the tools I usually use don't like.
> The best I could get are some numbers from the Firebug plugin for Firefox. It
> seems that the response time from entering a message until it appears in the
> user's timeline is around 350 ms, which is really excellent. It will be even
> harder to measure (using the browser) how long it takes for a message from
> one user to reach another user. I'm not sure how to do that yet. I tested
> manually, sending messages from Chrome to Firefox, and it's really fast.
> I also let one of the 300+x users send 1000 messages and took some heap
> dumps.
> I'm not yet fully through them, but it's already clear that messages take up
> too much space.
> Around 1400 messages need 9.3 million bytes, which means that on
> average one message needs more than 6 Kbyte!
> OK, there were probably also a lot of relatively long update status messages,
> but I still think this is too much.
> The reason seems to be that the messages still retain a reference to the
> Stemmer (org.tartarus.snowball.ext.PorterStemmer), which alone takes 2 Kbyte.
> Do we really need this Stemmer after we have run it?
> Another reason is that scala.xml.Elem is referenced in the toXML field. I
> guess this is the result of parsing XML. I'm not sure whether it is still
> needed after parsing is done, but storing DOM-like structures is certainly
> not memory efficient. originialXML looks similar.
> It would be important to get these numbers down, otherwise we will be killed
> by memory usage as soon as a lot of messages get sent.
> I also asked on the Scala list about the loadXML function accessing the
> filesystem, but someone claimed this would not be the case in trunk and
> asked for the version. So maybe they can backport a fix for this.
> I seem to remember from some profiling that this function is still used.
> I haven't had any time to draft a blog post, but I hope I can start with that
> on Wednesday or Thursday.
> Regards,
> Markus
> "The best way to predict the future is to invent it" -- Alan Kay
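
For reference, the per-message figure in the quoted message can be checked
with a few lines of Java. The sketch uses only the numbers given there (about
1400 messages, 9.3 million bytes, 2 Kbyte for the stemmer); reading "2 Kbyte"
as 2048 bytes is an assumption, and MessageSizeEstimate is a made-up name.

```java
public class MessageSizeEstimate {
    // Average retained bytes per message.
    static double bytesPerMessage(long totalBytes, long messageCount) {
        return (double) totalBytes / messageCount;
    }

    public static void main(String[] args) {
        long totalBytes = 9_300_000L;   // 9.3 million bytes from the heap dump
        long messages = 1_400L;         // around 1400 messages
        long stemmerBytes = 2 * 1024L;  // assumption: "2 Kbyte" read as 2048 bytes

        double perMessage = bytesPerMessage(totalBytes, messages);
        double stemmerShare = stemmerBytes / perMessage;

        System.out.printf("per message: %.0f bytes%n", perMessage);
        System.out.printf("stemmer share: %.0f%%%n", stemmerShare * 100);
    }
}
```

That works out to roughly 6.6 Kbyte per message, with the retained stemmer
accounting for something like 30% of it under the assumption above.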
