activemq-users mailing list archives

From Tim Bain <tb...@alumni.duke.edu>
Subject Re: Storing message off heap, message compression, and storing the whole message as headers.
Date Mon, 20 Apr 2015 13:24:44 GMT
I'm confused about what would drive the need for this.

Is it the ability to hold more messages than your JVM size allows?  If so,
we already have both KahaDB and LevelDB; what does Chronicle offer that
those other two don't?

Is it because you see some kind of inefficiency in how ActiveMQ uses memory
or how the JVM's GC strategies work?  If so, can you elaborate on what
you're concerned about?  (You made a statement that sounds like "the JVM
can only use half its memory, because the other half has to be kept free
for GCing", which doesn't match my experience at all.  I've observed G1GC
to successfully GC when the heap was nearly 100% full.  I'm certain it's not
a problem for CMS, because CMS is a non-compacting Old Gen GC strategy,
which is why it's subject to fragmentation.  And I believe that ParallelGC
does in-place compaction, so it wouldn't require additional memory, though I
haven't directly observed it during a GC.  Please either correct my
interpretation of your statement or provide the data you're basing it
on.)

One difference in GC behavior with what you're proposing is that under your
algorithm you'd GC each message at least twice (once when it's received and
put into Chronicle, and once when it's pulled from Chronicle and sent
onward, plus any additional reads needed to operate on the message such as
if a new subscriber with a non-matching selector connected to the broker)
instead of just once under the current algorithm.  On the other hand, your
GCs should all be from Young Gen (and cheap) whereas the current algorithm
would likely push many of its messages to Old Gen.  Old Gen GCs are more
expensive under ParallelGC, though they're no worse under G1GC and CMS.  So
it's a trade-off under ParallelGC (maybe better, maybe worse) and a loss
under the other two.

One other thing: this would give compression at rest, but not in motion,
and it comes at the expense of two serialization/deserialization and
compression/decompression operations per broker traversed.  Maybe being
able to store more messages in a given amount of memory is worth it to you
(your volumes seem a lot higher than ours, and than most installations'),
but latency and throughput matter more to us than memory usage so we'd live
with using more memory to avoid the extra operations.
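
To make the "extra operations" concrete, here is a minimal sketch of one
store/load round trip for a message that has already been serialized to
bytes.  It is not how ActiveMQ or Chronicle does it: the class and method
names are made up for illustration, and it uses a plain direct ByteBuffer
with java.util.zip in place of Chronicle's own storage and compression.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration only; not an ActiveMQ or Chronicle API.
public class OffHeapRoundTrip {

    /** Compress the serialized message and copy it into a direct (off-heap) buffer. */
    public static ByteBuffer store(byte[] serialized) {
        Deflater deflater = new Deflater();
        deflater.setInput(serialized);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        byte[] compressed = out.toByteArray();

        // A 4-byte prefix records the uncompressed length for the reader.
        ByteBuffer direct = ByteBuffer.allocateDirect(4 + compressed.length);
        direct.putInt(serialized.length);
        direct.put(compressed);
        direct.flip();
        return direct;
    }

    /** Copy the bytes back onto the heap and decompress them. */
    public static byte[] load(ByteBuffer direct) {
        ByteBuffer view = direct.duplicate(); // leave the caller's position untouched
        int originalLength = view.getInt();
        byte[] compressed = new byte[view.remaining()];
        view.get(compressed);

        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] restored = new byte[originalLength];
        int off = 0;
        try {
            while (off < originalLength && !inflater.finished()) {
                int n = inflater.inflate(restored, off, originalLength - off);
                if (n == 0) {
                    break; // no progress: input is truncated or corrupt
                }
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed message", e);
        } finally {
            inflater.end();
        }
        if (off != originalLength) {
            throw new IllegalStateException("truncated compressed message");
        }
        return restored;
    }

    public static void main(String[] args) {
        byte[] message = "JMSType=example;priority=4;body=hello hello hello hello"
                .getBytes(StandardCharsets.UTF_8);
        ByteBuffer stored = store(message);   // once when the message arrives
        byte[] restored = load(stored);       // once when it is dispatched
        System.out.println("heap bytes: " + message.length
                + ", off-heap bytes: " + stored.remaining());
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Each call allocates short-lived heap arrays (the compressed copy on the way
in, the restored copy on the way out), which is the Young Gen garbage I
described above, and the broker would pay for both calls on every hop.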

The question about why to use message bodies at all is an interesting one,
though the ability to compress the body once and have it stay compressed
through multiple network writes is a compelling reason in the near term.
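
For reference, the foo.bar.cat.dog naming idea from the quoted message
could look roughly like the sketch below.  The class name and the choice
to store every leaf as a String are my assumptions for illustration, not
anything the broker does today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: flattens a nested structure into dotted header names.
public class HeaderFlattener {

    public static Map<String, String> flatten(Map<String, ?> nested) {
        Map<String, String> headers = new LinkedHashMap<>();
        flattenInto("", nested, headers);
        return headers;
    }

    private static void flattenInto(String prefix, Map<String, ?> node,
                                    Map<String, String> out) {
        for (Map.Entry<String, ?> e : node.entrySet()) {
            String name = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                // Nested level: recurse with the extended prefix.
                @SuppressWarnings("unchecked")
                Map<String, ?> child = (Map<String, ?>) value;
                flattenInto(name, child, out);
            } else {
                // Leaf: store as a String header value (an assumption of this sketch).
                out.put(name, String.valueOf(value));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> cat = new LinkedHashMap<>();
        cat.put("dog", "value");
        Map<String, Object> bar = new LinkedHashMap<>();
        bar.put("cat", cat);
        Map<String, Object> foo = new LinkedHashMap<>();
        foo.put("bar", bar);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("foo", foo);
        System.out.println(flatten(root));
    }
}
```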

Tim
On Apr 19, 2015 6:06 PM, "Kevin Burton" <burton@spinn3r.com> wrote:

> I’ve been thinking about how messages are stored in the broker and ways to
> improve the storage in memory.
>
> First, right now, messages are stored in the same heap, and if you’re using
> the memory store, that’s going to add up.  This will increase GC
> latency, and you actually need 2x more memory because you have to have
> temp memory set aside for GCs.
>
> I was thinking about using Chronicle to store the messages off heap using
> direct buffers.  The downside to this is that the messages need to be
> serialized/deserialized with each access. But realistically that’s probably
> acceptable because you can do something like 1M message deserializations
> per second.  Which is normally more than the throughput of the broker.
>
> Additionally, Chronicle supports zlib or snappy compression on the message
> bodies.  So, while the broker supports message compression now, it doesn’t
> support this feature on headers.
>
> This would give us header compression!
>
> The broker would transparently decompress the headers when reading the
> message.
>
> This then begs the question, why use message bodies at all?  Why not just
> store an entire message as a set of headers?
>
> If you need hierarchy you can do foo.bar.cat.dog style header names.
>
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
> <http://spinn3r.com>
>
