esme-dev mailing list archives

From Vassil Dichev <vdic...@apache.org>
Subject Re: Performance update: Message size in memory
Date Wed, 09 Dec 2009 07:46:56 GMT
Markus,

First of all, a note about the KB/message statistics: this is only
valid as long as you get messages from the cache! Currently the cache
size is set to 10,000, so you will see memory usage drop once the
number of messages exceeds this size. Processing messages would also
necessarily become slower.

The simplest strategies for the stemmer would be:
1. Move the stemmer to the companion object
2. Create a new stemmer every time it's needed
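
To make the two options concrete, here is a rough sketch. ToyStemmer
is a made-up stand-in for the real stemmer, and the Message class and
method names are only illustrative, not the actual ESME code. I am
assuming the stemmer keeps mutable state between calls, which is why
the shared one is guarded with synchronized.

```scala
// Hypothetical stand-in for the real stemmer; only the shape matters.
class ToyStemmer {
  // Mutable scratch state, so a single instance is not thread-safe.
  private var current: String = ""

  // Lowercases the word and strips one trailing "s" from longer words.
  def stem(word: String): String = {
    current = word.toLowerCase
    if (current.length > 3 && current.endsWith("s"))
      current = current.dropRight(1)
    current
  }
}

class Message(val text: String) {
  // Option 2: create a new stemmer every time it's needed.
  def stemsFresh: Seq[String] = {
    val stemmer = new ToyStemmer
    text.split("\\s+").toSeq.map(w => stemmer.stem(w))
  }
}

object Message {
  // Option 1: one stemmer in the companion object, shared by all
  // messages and locked while in use.
  private val stemmer = new ToyStemmer

  def stems(text: String): Seq[String] = stemmer.synchronized {
    text.split("\\s+").toSeq.map(w => stemmer.stem(w))
  }
}
```

With option 1 the per-message memory cost of the stemmer disappears,
at the price of a lock around each use.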

In a naive test comparing 100,000 invocations of stem on the same
stemmer against creating 100,000 stemmer objects, instantiation seems
to take almost twice as long. So I'm not sure contention would be much
of an issue. Besides, the only time a stemmer is needed is for search
and the word frequency cloud. These are not specific to a particular
message, so they can be (and should be) moved to the companion object,
too. Furthermore, search is done in a Compass transaction anyway.
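
For reference, a naive timing test of this kind could look roughly
like the sketch below. It is not the exact code I ran: ToyStemmer is
again a made-up stand-in, and there is no JIT warm-up, so the numbers
are only a rough indication.

```scala
object StemmerBench {
  // Hypothetical stand-in for the real stemmer.
  class ToyStemmer {
    def stem(word: String): String =
      if (word.length > 3 && word.endsWith("s")) word.dropRight(1) else word
  }

  // Runs body once and returns the elapsed time in nanoseconds.
  def time(body: => Unit): Long = {
    val start = System.nanoTime()
    body
    System.nanoTime() - start
  }

  def main(args: Array[String]): Unit = {
    val n = 100000
    val shared = new ToyStemmer
    var sink = 0 // consume results so the loops are not optimized away

    // Case 1: n invocations of stem on the same stemmer.
    val reuse = time {
      var i = 0
      while (i < n) { sink += shared.stem("messages").length; i += 1 }
    }

    // Case 2: a new stemmer object for each of the n invocations.
    val fresh = time {
      var i = 0
      while (i < n) { sink += new ToyStemmer().stem("messages").length; i += 1 }
    }

    println(s"reuse: ${reuse / 1e6} ms, fresh: ${fresh / 1e6} ms (sink=$sink)")
  }
}
```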

We could also have some type of pooling, but I'm not sure how
efficient it would be. This definitely needs some benchmarks before we
try to optimize too much.
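
Just to illustrate what I mean by pooling, a minimal sketch built on
java.util.concurrent.ArrayBlockingQueue. StemmerPool, withStemmer and
ToyStemmer are hypothetical names, and the pool size would have to
come out of the benchmarks.

```scala
import java.util.concurrent.ArrayBlockingQueue

// Hypothetical stand-in for the real stemmer.
class ToyStemmer {
  def stem(word: String): String =
    if (word.length > 3 && word.endsWith("s")) word.dropRight(1) else word
}

// Fixed-size pool: callers borrow a stemmer and hand it back when done.
class StemmerPool(size: Int) {
  private val pool = new ArrayBlockingQueue[ToyStemmer](size)
  (1 to size).foreach(_ => pool.put(new ToyStemmer))

  // Blocks if every stemmer is in use; always returns the borrowed one,
  // even if f throws.
  def withStemmer[A](f: ToyStemmer => A): A = {
    val stemmer = pool.take()
    try f(stemmer) finally pool.put(stemmer)
  }

  // Number of stemmers currently idle in the pool.
  def available: Int = pool.size()
}
```

Usage would be something like
new StemmerPool(4).withStemmer(_.stem("clouds")). Compared to a
single synchronized stemmer this allows several threads to stem at
once, but the queue operations have their own cost.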

What do you think?
