hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-742) Implement of Hama RPC
Date Thu, 07 Mar 2013 17:37:22 GMT

    [ https://issues.apache.org/jira/browse/HAMA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596098#comment-13596098 ]

Thomas Jungblut commented on HAMA-742:
--------------------------------------

Is there a rationale for keeping the BSPMessageBundle (besides backward compatibility for
combiners)?

Not that I think bundles are evil or bad, but if we can keep the messaging as raw as
possible, we would benefit a lot in performance.
For example, once our serialization buffer exceeds X bytes (let's say 16 MB, which should be
large enough to make good use of the network bandwidth and make connection overhead
negligible), we send the number of messages contained in that buffer first, followed by the
raw bytes of each of those messages.
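
To make the framing idea concrete, here is a minimal sketch of a buffer that emits one frame
(message count first, raw message bytes after) whenever a size threshold is crossed. All
names (FrameBufferSketch, add, flush, readCount) are hypothetical and not part of Hama. The
per-message length prefix is an assumption added here so that arbitrary byte arrays can be
told apart, and the list of frames merely stands in for the network.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hama code: collects raw messages and cuts a frame
// once the buffered payload reaches a size threshold.
final class FrameBufferSketch {
  private final int thresholdBytes;
  private final ByteArrayOutputStream payload = new ByteArrayOutputStream();
  private final List<byte[]> frames = new ArrayList<>(); // stands in for the network
  private int messageCount;

  FrameBufferSketch(int thresholdBytes) {
    this.thresholdBytes = thresholdBytes;
  }

  // Appends one message (length prefix + raw bytes) and flushes if the threshold is reached.
  void add(byte[] rawMessage) {
    writeInt(payload, rawMessage.length); // assumption: length prefix per message
    payload.write(rawMessage, 0, rawMessage.length);
    messageCount++;
    if (payload.size() >= thresholdBytes) {
      flush();
    }
  }

  // Emits a frame: the number of messages, then the buffered message bytes.
  void flush() {
    if (messageCount == 0) {
      return;
    }
    ByteArrayOutputStream frame = new ByteArrayOutputStream();
    writeInt(frame, messageCount);
    byte[] body = payload.toByteArray();
    frame.write(body, 0, body.length);
    frames.add(frame.toByteArray());
    payload.reset();
    messageCount = 0;
  }

  List<byte[]> frames() {
    return frames;
  }

  // Reads the message count stored at the front of a frame.
  static int readCount(byte[] frame) {
    return ByteBuffer.wrap(frame).getInt();
  }

  private static void writeInt(ByteArrayOutputStream out, int value) {
    byte[] b = ByteBuffer.allocate(4).putInt(value).array();
    out.write(b, 0, b.length);
  }
}
```

In a real setup the threshold would be something like 16 * 1024 * 1024 and flush() would
write to the peer's connection instead of a list.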

Also, we should stop sending class names around. I'm not sure whether we should restrict
messaging to a single class or not, but doing so would definitely reduce complexity and
improve messaging performance. That way you can serialize the messages one after another
without taking care of boundaries, and keep just a single message in memory that gets
filled from disk in case of a spill.
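
A rough sketch of what the single-class restriction buys: with one known message type, no
class name and no boundary marker is written, and the reader refills one reusable instance
per message. IntPairMessage and the helper methods are hypothetical names for illustration
only; the write/readFields pair just mimics the Writable style and is not Hama's actual API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch, not Hama code.
final class SingleClassStreamSketch {

  // The one message class both sides agree on; it knows its own layout.
  static final class IntPairMessage {
    int vertex;
    int value;

    void write(DataOutputStream out) throws IOException {
      out.writeInt(vertex);
      out.writeInt(value);
    }

    void readFields(DataInputStream in) throws IOException {
      vertex = in.readInt();
      value = in.readInt();
    }
  }

  // Serializes messages back to back: no class names, no delimiters.
  static byte[] serialize(int[][] pairs) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      IntPairMessage m = new IntPairMessage();
      for (int[] p : pairs) {
        m.vertex = p[0];
        m.value = p[1];
        m.write(out);
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Reads `count` messages while keeping only one instance in memory.
  static long sumValues(byte[] data, int count) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
      IntPairMessage reusable = new IntPairMessage();
      long sum = 0;
      for (int i = 0; i < count; i++) {
        reusable.readFields(in); // overwrites the previous message's fields
        sum += reusable.value;
      }
      return sum;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The same loop works unchanged if the DataInputStream wraps a spill file instead of an
in-memory array.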

So in the end, we are working with a plain Iterable<Message> instead of a bundle. The
usage should be transparent to the outside world as well, since we are just exposing simple
queue semantics (poll/size). size would be a constant-time operation, obtained by summing
the message counts of all received "bundles", and poll is just next() on the underlying
Iterator.
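
The queue semantics could look roughly like the sketch below. It assumes each received batch
arrives with its message count (as in the count-prefixed framing described earlier), so the
total is kept up to date on arrival and size() never has to iterate. ReceivedMessagesSketch
and addBatch are hypothetical names, not existing Hama classes.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Iterator;
import java.util.Queue;

// Hypothetical sketch, not Hama code: poll/size over a sequence of received batches.
final class ReceivedMessagesSketch<M> {
  private final Queue<Iterator<M>> batches = new ArrayDeque<>();
  private Iterator<M> current = Collections.emptyIterator();
  private int remaining;

  // Registers a received batch together with the count sent in front of it.
  void addBatch(int count, Iterator<M> messages) {
    batches.add(messages);
    remaining += count;
  }

  // Constant time: the counts were summed as the batches arrived.
  int size() {
    return remaining;
  }

  // Returns the next message, or null when every batch is exhausted.
  M poll() {
    while (!current.hasNext()) {
      Iterator<M> next = batches.poll();
      if (next == null) {
        return null;
      }
      current = next;
    }
    remaining--;
    return current.next();
  }
}
```

The iterator passed to addBatch can deserialize lazily from a network buffer or a spill
file, so callers only ever see poll and size.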
                
> Implement of Hama RPC 
> ----------------------
>
>                 Key: HAMA-742
>                 URL: https://issues.apache.org/jira/browse/HAMA-742
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Edward J. Yoon
>             Fix For: 0.6.1
>
>
> To solve HDFS 2.0 compatibility issue, we have to change a lot of codes for Hadoop 2.0
> RPC, moreover, yarn RPC doesn't support asynchronous call directly.
> Ultimately, we can pursue the performance and integrate more easily with hadoop multi-versions
> by having our own RPC.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
