incubator-hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: Messaging Interface
Date Sat, 04 Feb 2012 18:28:36 GMT
Hi,

We sorted out on chat why KEYIN and KEYOUT are there.
However I really like your two ideas:
- generic message types (to prevent casting, at least on the user's side)
- optional sorting of incoming messages

I see them integrated in two ways:
First, the "generic messages":
HAMA-503 <https://issues.apache.org/jira/browse/HAMA-503> is going to give
us a new way of writing a BSP. I'm pretty sure that this will sit on top of
the BSP class.
I can imagine that a computation unit will have this kind of <MESSAGEIN,
MESSAGEOUT> interface that will guarantee type safety at the user API
level.
Internally this can be accomplished by casting the underlying Writable.
Maybe I can get a prototype over the next week, so you can have a look and
tell me what you think.
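
A minimal sketch of the idea above, not the HAMA-503 design: a typed
facade keeps the single cast in one place so user code never casts. All
names here (RawPeer, TypedPeer, CostMessage) are hypothetical, the
Writable interface is a local stand-in for Hadoop's so the sketch
compiles on its own, and send() simply loops back to the local inbox.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Stand-in for org.apache.hadoop.io.Writable (assumption: kept empty here).
interface Writable {}

// Hypothetical user message type.
final class CostMessage implements Writable {
    final int cost;
    CostMessage(int cost) { this.cost = cost; }
}

// Hypothetical untyped layer: everything is just a Writable.
final class RawPeer {
    private final Queue<Writable> inbox = new ArrayDeque<>();

    // Loopback delivery, only for the sketch; a real peer would route by name.
    void send(Writable msg) { inbox.add(msg); }

    // Returns null when no message is left.
    Writable getCurrentMessage() { return inbox.poll(); }
}

// Hypothetical typed facade: the cast lives here, not in user code.
final class TypedPeer<MESSAGEIN extends Writable, MESSAGEOUT extends Writable> {
    private final RawPeer raw;
    private final Class<MESSAGEIN> inType;

    TypedPeer(RawPeer raw, Class<MESSAGEIN> inType) {
        this.raw = raw;
        this.inType = inType;
    }

    void send(MESSAGEOUT msg) { raw.send(msg); }

    MESSAGEIN getCurrentMessage() {
        Writable w = raw.getCurrentMessage();
        // Class.cast throws ClassCastException on a mismatched message type.
        return w == null ? null : inType.cast(w);
    }
}

final class TypedMessagingSketch {
    public static void main(String[] args) {
        TypedPeer<CostMessage, CostMessage> peer =
            new TypedPeer<>(new RawPeer(), CostMessage.class);
        peer.send(new CostMessage(7));
        CostMessage m = peer.getCurrentMessage(); // no cast at the call site
        System.out.println(m.cost);
    }
}
```

The cast still happens at runtime, so this only moves the failure point
from user code into the framework layer; it does not remove it.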

Second, the "sorting of messages":
This is a fancy feature: consider SSSP, where you would just get the
messages in ascending order by cost. This would save a lot of looping ;)
However, this has overhead and needs the Comparable interface. I see this
integrated in the MessageService.
In my opinion, especially when we start adding more RPC protocols, we have
to make an abstract subclass and a more pluggable solution to support these
kinds of mechanisms.
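
A rough sketch of what sorted delivery could look like, assuming the
incoming queue is backed by a java.util.PriorityQueue. SortedInbox and
DistanceMessage are hypothetical names for illustration and are not part
of Hama's MessageService.

```java
import java.util.PriorityQueue;

// Hypothetical SSSP message: a target vertex and the tentative path cost.
final class DistanceMessage implements Comparable<DistanceMessage> {
    final String vertex;
    final int cost;

    DistanceMessage(String vertex, int cost) {
        this.vertex = vertex;
        this.cost = cost;
    }

    @Override
    public int compareTo(DistanceMessage other) {
        // Order messages by ascending cost.
        return Integer.compare(cost, other.cost);
    }
}

// Hypothetical inbox that hands messages out in their natural order.
final class SortedInbox<M extends Comparable<M>> {
    private final PriorityQueue<M> queue = new PriorityQueue<>();

    // Each insert costs O(log n); this is the overhead of sorting.
    void deliver(M msg) { queue.add(msg); }

    // Returns the smallest remaining message, or null when empty.
    M getCurrentMessage() { return queue.poll(); }

    int size() { return queue.size(); }
}

final class SortedInboxSketch {
    public static void main(String[] args) {
        SortedInbox<DistanceMessage> inbox = new SortedInbox<>();
        inbox.deliver(new DistanceMessage("c", 5));
        inbox.deliver(new DistanceMessage("a", 1));
        inbox.deliver(new DistanceMessage("b", 3));

        DistanceMessage m;
        while ((m = inbox.getCurrentMessage()) != null) {
            System.out.println(m.vertex + " " + m.cost);
        }
    }
}
```

With this ordering an SSSP vertex could read only the first message and
skip the rest, which is the looping the paragraph above refers to.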

I'll keep both things in the back of my mind. Suraj, if you'd like the second
idea, please file a new Jira. I think it is a great idea.

2012/2/4 Suraj Menon <menonsuraj5@gmail.com>

> Hi, I like this idea. But I want to explore one more step backwards on this
> :). I want to know what purpose the restriction of having KEYIN and
> KEYOUT serves for the user here. I feel they could be a field in the user’s
> message class rather than something Hama prescribes. We already have
> KeyValuePair defined. But the user’s bsp module should be able to send a
> message with no keys.
>
> In Map-Reduce, this serves as a part of the programming model, where the
> entries read are aggregated based on keys and then in reduce again we
> process sorted records per key. But, in BSP model, it is the destination of
> each message that regulates where a particular piece of data is processed
> in the next superstep. Hence logically, the key on the message is the
> identity of the destination peer (or group of peers). So why do we need
> KEYIN and KEYOUT? How is it different when the input and output format is
> expressed as KeyValuePair?
>
> Let’s consider an example where a user has written a BSP class named
> MyCoolClass that passes messages of type MyCoolMessage (extends
> BSPMessage).
>
> Today he would have to write the bsp function as:
> MyCoolClass extends BSP<in_tag_type, in_msg_type, out_msg_tag,
> out_msg_type> {
>
>   bsp(peer<in_tag_type, in_msg_type, out_key_type, out_msg_type>) {
>
>   }
> }
>
>
> MyCoolClass extends BSP<in_msg_type, out_msg_type> {
>
>   bsp(peer<? super Writable in_msg_type, ? extends Writable out_msg_type>) {
>
>   }
> }
>
> There are other scenarios to consider too. What if a user wants the
> messages sent to his BSPPeer sorted? I think we should provide this flavor.
>
> bsp(peer<? super WritableComparable in_msg_type, ? extends
> WritableComparable out_msg_type>)
>
>
> and Hama framework should support this.
>
> If the aforesaid doesn’t make sense please help in getting correct
> understanding.  :)
> Thanks,
> Suraj
>
> On Fri, Feb 3, 2012 at 8:52 AM, Tommaso Teofili
> <tommaso.teofili@gmail.com> wrote:
>
> > +1, nice API improvement.
> > Tommaso
> >
> > 2012/2/3 Thomas Jungblut <thomas.jungblut@googlemail.com>
> >
> > > Yes, this sounds to me reasonable as well.
> > > Other opinions? Otherwise I am filing a jira for that.
> > >
> > > 2012/2/3 Edward J. Yoon <edwardyoon@apache.org>
> > >
> > > > I think, we may want to change like <? extends Writable, ? extends
> > > > Writable>.
> > > >
> > > > On Fri, Feb 3, 2012 at 9:45 AM, Edward J. Yoon <edwardyoon@apache.org>
> > > > wrote:
> > > > > I prefer the Writable.
> > > > >
> > > > > On Thu, Feb 2, 2012 at 8:49 PM, Thomas Jungblut
> > > > > <thomas.jungblut@googlemail.com> wrote:
> > > > >> Hi all,
> > > > >>
> > > > > >> I refactored the messaging in 0.3.0 and changed this from an
> > > > > >> interface to an abstract base class.
> > > > > >> Currently it is fine, but I feel that the user is too restricted in
> > > > > >> using messages.
> > > > > >> You have this strict structure of tag and data. I think we should
> > > > > >> widen the messages to just Messagable.
> > > > > >> If we want to have the freedom to add additional things, we should
> > > > > >> extend Messagable from Writable and use this for it.
> > > > > >>
> > > > > >> So send may look like this:
> > > > > >>
> > > > > >> public final void send(String peerName, Messagable msg)
> > > > > >>
> > > > > >>
> > > > > >> and getCurrentMessage:
> > > > > >>
> > > > > >>  public final Messagable getCurrentMessage()
> > > > > >>
> > > > > >>
> > > > > >> However, I am not really happy that we return Messagable (requires
> > > > > >> casting and stuff).
> > > > > >> For the use cases of specific tagging we can add the getTag() method
> > > > > >> to the Messagable interface.
> > > > > >> What type should this be then? I mean, String would be quite a large
> > > > > >> overhead. Integer might not be useful.
> > > > > >>
> > > > > >> Or should we widen this to Writable instead? So you can send things
> > > > > >> you've read from sequencefiles directly to other tasks.
> > > > > >>
> > > > > >> What do you think? I am still not sure what it should look like. Or
> > > > > >> are you satisfied with the current messaging?
> > > > >>
> > > > >> --
> > > > >> Thomas Jungblut
> > > > >> Berlin <thomas.jungblut@gmail.com>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best Regards, Edward J. Yoon
> > > > > @eddieyoon
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards, Edward J. Yoon
> > > > @eddieyoon
> > > >
> > >
> > >
> > >
> > > --
> > > Thomas Jungblut
> > > Berlin <thomas.jungblut@gmail.com>
> > >
> >
>



-- 
Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
