hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Remove Spilling Queue and rewrite checkpoint/recovery
Date Mon, 18 Aug 2014 09:40:33 GMT
Do you have any plan for merging them?

As a side note: if we want to move to Git, I'm now +1.

On Sat, Aug 16, 2014 at 12:00 AM, Chia-Hung Lin <clin4j@googlemail.com> wrote:
> Code right now is at https://github.com/chlin501/hama.git
>
> Maven and a JDK are required to build the project.
>
> Command for a clean build:
> mvn clean install -DskipTests=true -Dmaven.javadoc.skip=true
>
> To test a specific test case:
> mvn -DskipTests=false -Dtest=<TestCaseName> test
>
>
> On 15 August 2014 18:21, Suraj Menon <menonsuraj5@gmail.com> wrote:
>> Hi Edward, sorry to enter the discussion so late.
>>
>> Bundling and unbundling of the message queue is not the Spilling Queue's
>> responsibility; it ended up there to stay compatible with the existing
>> implementation of BSP peer communication. Remember that the Spilling Queue
>> was implemented to quickly fix some OutOfMemory issues on the sender side
>> first. The Spilling Queue gives you a byte array (ByteBuffer) with a batch
>> of serialized messages. This effectively bundles the messages in a byte
>> array (hence the ByteArrayMessageBundle) and sends them for processing.
>> The SpilledDataProcessors are implemented as a pipeline of processing
>> steps built using inheritance, something like what we might use traits
>> for in Scala. So if we have a SpilledDataProcessor that sends this bundled
>> message via RPC to the peer, there is no need to write the messages to a
>> file and read them back. As I mentioned previously, this was done to be
>> compatible with the existing implementation of peer.send.
>>
>> Also, the async checkpoint recovery code was written before the Spilling
>> Queue. Today we can remove the per-message write and instead write the
>> whole file to HDFS in the "before peer sync" phase.
>>
>> I would say performance numbers and maintainability come first, and if you
>> think removing the Spilling Queue is a solution, go for it. As far as async
>> checkpointing is concerned, that was a first proof of concept, and it is
>> high time we moved on from there.
>>
>> Chiahung, do you have instructions on where and how I can build the
>> Scala version of your code?
>>
>> I am really finding it hard to dedicate time for Hama these days.
>>
>> - Suraj
>>
>>
>> On Tue, Aug 12, 2014 at 7:15 AM, Edward J. Yoon <edwardyoon@apache.org>
>> wrote:
>>
>>> ChiaHung,
>>>
>>> Yes, I'm thinking along similar lines.
>>>
>>> On Tue, Aug 12, 2014 at 4:11 PM, Chia-Hung Lin <clin4j@googlemail.com>
>>> wrote:
>>> > I am currently working on this part based on the superstep api,
>>> > similar to the Superstep.java in the trunk.
>>> >
>>> > The checkpointer [1] saves bundled messages instead of single messages.
>>> > I'm not sure whether this is what you are looking for.
>>> >
>>> > [1] https://github.com/chlin501/hama/blob/peer-comm-mech-changed/core/src/main/scala/org/apache/hama/monitor/Checkpointer.scala
>>> >
>>> >
>>> >
>>> >
>>> > On 12 August 2014 15:04, Edward J. Yoon <edwardyoon@apache.org> wrote:
>>> >> I think that transferring a single message at a time is not wise.
>>> >> Bundles are used to avoid network overhead and contention. So, if we
>>> >> use bundles, each processor always sends/receives bundles.
>>> >>
>>> >> BSPMessageBundle is Writable (and Iterable), and it manages the
>>> >> serialized messages as a byte array. If we write bundles when
>>> >> checkpointing or using the disk queue, it'll be simpler and faster.
>>> >>
>>> >> With the Spilling Queue, we always have to unbundle the messages and
>>> >> put them into the queue.
>>> >>
>>> >>
>>> >> On Tue, Aug 12, 2014 at 2:41 PM, Tommaso Teofili
>>> >> <tommaso.teofili@gmail.com> wrote:
>>> >>> -1, can't we first discuss? Also it'd be helpful to be more specific
>>> >>> on the problems.
>>> >>> Tommaso
>>> >>>
>>> >>>
>>> >>>
>>> >>> 2014-08-12 4:25 GMT+02:00 Edward J. Yoon <edwardyoon@apache.org>:
>>> >>>
>>> >>>> All,
>>> >>>>
>>> >>>> I'll delete the Spilling Queue and rewrite the checkpoint/recovery
>>> >>>> implementation (checkpointing bundles is better than checkpointing
>>> >>>> all messages). The current implementation is quite a mess :/ there
>>> >>>> are huge deserialization/serialization overheads.
>>> >>>>
>>> >>>> --
>>> >>>> Best Regards, Edward J. Yoon
>>> >>>> CEO at DataSayer Co., Ltd.
>>> >>>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Best Regards, Edward J. Yoon
>>> >> CEO at DataSayer Co., Ltd.
>>>
>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> CEO at DataSayer Co., Ltd.
>>>



-- 
Best Regards, Edward J. Yoon
CEO at DataSayer Co., Ltd.
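
The point argued in the thread, that checkpointing one serialized bundle is cheaper than checkpointing messages one by one, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class and method names are hypothetical and are not part of the Hama code base, messages are plain strings rather than Writables, and a local temporary file stands in for HDFS.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from Hama itself.
public class BundleCheckpointSketch {

    // Serialize every message once into a single byte array:
    // [count][len][payload][len][payload]...
    static byte[] bundle(List<String> messages) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(messages.size());
            for (String message : messages) {
                byte[] payload = message.getBytes(StandardCharsets.UTF_8);
                out.writeInt(payload.length);
                out.write(payload);
            }
        }
        return bytes.toByteArray();
    }

    // Reverse of bundle(): read the count, then each length-prefixed payload.
    static List<String> unbundle(byte[] bundle) throws IOException {
        List<String> messages = new ArrayList<>();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bundle))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                byte[] payload = new byte[in.readInt()];
                in.readFully(payload);
                messages.add(new String(payload, StandardCharsets.UTF_8));
            }
        }
        return messages;
    }

    // Checkpoint with a single write of the already-serialized bundle,
    // so no message is deserialized and re-serialized on the way to disk.
    static void checkpoint(Path file, byte[] bundle) throws IOException {
        Files.write(file, bundle);
    }

    // Recovery reads the bundle back and only then unbundles it.
    static List<String> recover(Path file) throws IOException {
        return unbundle(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        byte[] bundle = bundle(Arrays.asList("vertex-1:0.25", "vertex-2:0.75"));
        Path file = Files.createTempFile("superstep-", ".ckpt"); // stand-in for HDFS
        checkpoint(file, bundle);
        System.out.println(recover(file));
        Files.delete(file);
    }
}
```

In this sketch the same byte array could be handed both to the network send and to the checkpoint write, which is the saving the thread is after; whether that holds for the real BSPMessageBundle depends on details not shown here.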
