hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject [DISCUSS/VOTE] Refactor of message queue .
Date Fri, 29 Aug 2014 00:09:03 GMT
First of all, our main problem is that the current system requires a lot
of memory, especially the graph module. As you might already know,
the main memory consumer is the message queue.

To solve this problem, we considered using local disk space, e.g.,
DiskQueue and SpillingQueue. However, those queues are basically not
able to bundle and group messages by destination server in a
memory-efficient way. So, I don't think this approach is the right way.

My solution for reducing both memory usage and performance
degradation is to store serializable message objects as a byte array
in the queue. In the graph case, I expect 3X ~ 6X better memory
efficiency than before (GraphJobMessage consists of multiple objects:
the destination vertex ID and the message value).
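
To illustrate the byte-array idea, here is a minimal sketch. It is not
Hama's actual queue code: the class name ByteArrayMessageQueue, its
methods, and the (String vertex ID, double value) message layout are all
assumptions made up for this example. Messages are appended to one
growing byte buffer and only turned back into objects when they are read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.BiConsumer;

/** Hypothetical sketch, not a Hama class: a queue that keeps messages as raw bytes. */
final class ByteArrayMessageQueue {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(buffer);
    private int count = 0;

    /** Serializes one message into the shared buffer instead of retaining an object. */
    void add(String destinationVertexId, double value) {
        try {
            out.writeUTF(destinationVertexId); // 2-byte length prefix + encoded chars
            out.writeDouble(value);            // 8 bytes
            count++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Number of messages stored. */
    int size() {
        return count;
    }

    /** Number of bytes currently held by the buffer. */
    int byteLength() {
        return buffer.size();
    }

    /** Deserializes the messages one at a time, in insertion order. */
    void forEach(BiConsumer<String, Double> consumer) {
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
        try {
            for (int i = 0; i < count; i++) {
                String vertexId = in.readUTF();
                double value = in.readDouble();
                consumer.accept(vertexId, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch each message costs only its serialized size, with no
per-object header or reference overhead. The actual saving in Hama would
depend on the real message types.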

In 0.6.4, the outgoing queue was replaced with an outgoing bundles
manager, and it showed a nice memory improvement. Now I want to start
refactoring the incoming queue.

My plan is to add an incoming bundles manager. Bundles can also
simply be written to local disk when memory is not enough.
So, the incoming bundles manager can play a role similar to
DiskQueue and SpillingQueue in the future.
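
A rough sketch of how such a manager might spill to disk follows. This is
only an illustration under my own assumptions: the class name
IncomingBundleManager, the fixed memory budget, and the one-temp-file-per-
bundle layout are hypothetical and not the planned Hama design. Bundles
are treated as opaque byte arrays.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not a Hama class: keeps bundles in memory, spills the rest. */
final class IncomingBundleManager {
    private final long memoryBudgetBytes;
    private long bytesInMemory = 0;
    private final List<byte[]> inMemory = new ArrayList<>();
    private final List<Path> spilled = new ArrayList<>();

    IncomingBundleManager(long memoryBudgetBytes) {
        this.memoryBudgetBytes = memoryBudgetBytes;
    }

    /** Keeps the bundle in memory if it fits the budget, otherwise writes it to disk. */
    void addBundle(byte[] bundle) {
        if (bytesInMemory + bundle.length <= memoryBudgetBytes) {
            inMemory.add(bundle);
            bytesInMemory += bundle.length;
            return;
        }
        try {
            Path file = Files.createTempFile("incoming-bundle-", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, bundle);
            spilled.add(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int inMemoryCount() {
        return inMemory.size();
    }

    int spilledCount() {
        return spilled.size();
    }

    /**
     * Returns every bundle, in-memory ones first and then spilled ones, and
     * deletes the spill files. Arrival order across the two groups is not
     * preserved. A real implementation would stream spilled bundles one at
     * a time instead of loading them all back at once.
     */
    List<byte[]> drain() {
        List<byte[]> all = new ArrayList<>(inMemory);
        inMemory.clear();
        bytesInMemory = 0;
        try {
            for (Path file : spilled) {
                all.add(Files.readAllBytes(file));
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        spilled.clear();
        return all;
    }
}
```

Because each bundle is already a byte array, spilling is a plain file
write with no extra serialization step.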

If you have any other opinions, please let me know. If there are no
objections, I'll do it.

Best Regards, Edward J. Yoon
CEO at DataSayer Co., Ltd.
