hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: Please review new APIs.
Date Wed, 02 Nov 2011 14:36:32 GMT
And what is the reason to implement our own input/output formats if you
stick with key/value pairs?
Let's be compatible with Hadoop and use theirs.

And we should really stop copying Hadoop stuff around. It is already there.

2011/11/2 Thomas Jungblut <thomas.jungblut@googlemail.com>

> Great :)
>
> Do you have plans to integrate a partitioning? Currently this is just a
> block assignment partitioning, hardcoded in the client.
> This won't be useful for PageRank and SSSP.
> This would help us in Graph package as well for the next release.
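
A minimal sketch of what a pluggable partitioner could look like, so that vertex-keyed jobs such as PageRank or SSSP route records by key instead of by block position. `KeyPartitioner` and `HashKeyPartitioner` are hypothetical names used only for illustration; they are not existing Hama classes.

```java
// Hypothetical interface for illustration; not an existing Hama class.
interface KeyPartitioner<K> {
    /** Returns the index of the task (0 .. numTasks-1) that should own the key. */
    int partitionFor(K key, int numTasks);
}

// Routes equal keys to the same task, regardless of where they sit in the input.
final class HashKeyPartitioner<K> implements KeyPartitioner<K> {
    @Override
    public int partitionFor(K key, int numTasks) {
        if (numTasks <= 0) {
            throw new IllegalArgumentException("numTasks must be positive");
        }
        // Math.floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numTasks);
    }
}
```

With an interface like this, the client could keep block assignment as one implementation and let graph jobs plug in a key-based one.
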
>
> 2011/11/2 Edward J. Yoon <edwardyoon@apache.org>
>
>> > For sure I agree we should allow the former programming model with no input
>> > without explicitly instantiating dummy inputs/splits. What about providing
>> > two basic (different) implementations?
>>
>> +1
>>
>> I was about to.
>> On Wed, Nov 2, 2011 at 9:23 PM, Tommaso Teofili
>> <tommaso.teofili@gmail.com> wrote:
>> > 2011/11/2 Thomas Jungblut <thomas.jungblut@googlemail.com>
>> >
>> >> Another point while fixing the local runner:
>> >>
>> >> Are we now input driven?
>> >> I see in the code that the user-defined task number is overridden by the
>> >> number of splits.
>> >> Was this your intention? This will actually make realtime processing with
>> >> no static input a real pain.
>> >> For example, if you want similar behaviour in Hadoop M/R you'll need to
>> >> create dummy splits, and this is not what we should aim at.
>> >>
>> >> We could simply check whether the user defined the NullInputFormat or
>> >> nothing, and then use the number of tasks the user has configured.
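
The check proposed above could be sketched roughly as follows. `TaskCountResolver`, its parameters, and the string-based test for the input format are assumptions made for illustration; the real client code may be organized quite differently.

```java
// Hypothetical helper for illustration; not part of the Hama code base.
final class TaskCountResolver {
    private TaskCountResolver() { }

    /**
     * @param inputFormatName configured input format class name, or null if none was set
     * @param configuredTasks number of tasks the user asked for
     * @param numSplits       number of splits computed from the input
     * @return the task count the job should run with
     */
    static int resolve(String inputFormatName, int configuredTasks, int numSplits) {
        // No input format, or the null input format: the job is not input driven,
        // so the user's configured task count is kept.
        boolean noInput = inputFormatName == null
                || inputFormatName.endsWith("NullInputFormat");
        return noInput ? configuredTasks : numSplits;
    }
}
```

This keeps input-driven jobs on one task per split while jobs without static input need no dummy splits.
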
>> >>
>> >
>> > For sure I agree we should allow the former programming model with no input
>> > without explicitly instantiating dummy inputs/splits. What about providing
>> > two basic (different) implementations?
>> > Tommaso
>> >
>> >
>> >>
>> >> 2011/11/2 Tommaso Teofili <tommaso.teofili@gmail.com>
>> >>
>> >> > 2011/11/2 Edward J. Yoon <edwardyoon@apache.org>
>> >> >
>> >> > > > I'm sure that not every job actually needs a cleanup or a setup.
>> >> > >
>> >> > > You're right. Almost all BSP applications should override the bsp() method,
>> >> > > but the setup() and cleaner() methods need not be, as you said. Let's fix
>> >> > > them.
>> >> > >
>> >> >
>> >> > Agreed +1
>> >> >
>> >> >
>> >> > >
>> >> > > > Generally I would suggest to integrate the OutputCollector and the
>> >> > > > RecordReader into the BSPPeerImpl.
>> >> > > > So our peer is like the context in Hadoop.
>> >> > >
>> >> > > Good idea.
>> >> > >
>> >> >
>> >> > +1 here too
>> >> >
>> >> > Tommaso
>> >> >
>> >> >
>> >> > >
>> >> > > On Wed, Nov 2, 2011 at 9:03 PM, Thomas Jungblut
>> >> > > <thomas.jungblut@googlemail.com> wrote:
>> >> > > > Yes. When I reworked that API, I made a default implementation in our
>> >> > > > abstract BSP class.
>> >> > > > So the user only has to override the methods if he needs to.
>> >> > > > I'm sure that not every job actually needs a cleanup or a setup.
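
A minimal sketch of the default-implementation idea, assuming a simplified base class. `SketchBsp`, `MinimalJob`, and the `List<String>` parameter are hypothetical stand-ins and do not mirror the real Hama signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class for illustration; not the real Hama BSP class.
abstract class SketchBsp {
    /** Optional hook; does nothing unless a job overrides it. */
    void setup(List<String> log) { }

    /** The only method every job has to implement. */
    abstract void bsp(List<String> log);

    /** Optional hook; does nothing unless a job overrides it. */
    void cleanup(List<String> log) { }

    /** Runs the three phases in order; cleanup runs even if bsp() throws. */
    final void run(List<String> log) {
        setup(log);
        try {
            bsp(log);
        } finally {
            cleanup(log);
        }
    }
}

// A job that needs neither setup nor cleanup overrides only bsp().
final class MinimalJob extends SketchBsp {
    @Override
    void bsp(List<String> log) {
        log.add("bsp");
    }
}
```

With abstract hooks, `MinimalJob` would need two empty overrides; with defaults it stays at one method.
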
>> >> > > >
>> >> > > > Generally I would suggest to integrate the OutputCollector and the
>> >> > > > RecordReader into the BSPPeerImpl.
>> >> > > > So our peer is like the context in Hadoop.
>> >> > > > But that is just a minor thing. It is a great improvement ;)
>> >> > > >
>> >> > > > 2011/11/2 Edward J. Yoon <edwardyoon@apache.org>
>> >> > > >
>> >> > > >> There're bsp(), setup() and cleaner() methods.
>> >> > > >>
>> >> > > >> What is your suggestion?
>> >> > > >>
>> >> > > >> On Wed, Nov 2, 2011 at 8:47 PM, Thomas Jungblut
>> >> > > >> <thomas.jungblut@googlemail.com> wrote:
>> >> > > >> > Have a look at the combiner class. I know that this is just a "test", but
>> >> > > >> > it is really messy if the user does not use the methods, but is forced to
>> >> > > >> > override them.
>> >> > > >> >
>> >> > > >> > 2011/11/2 Edward J. Yoon <edwardyoon@apache.org>
>> >> > > >> >
>> >> > > >> >> Why?
>> >> > > >> >>
>> >> > > >> >> On Wed, Nov 2, 2011 at 8:21 PM, Thomas Jungblut
>> >> > > >> >> <thomas.jungblut@googlemail.com> wrote:
>> >> > > >> >> > I totally dislike that BSP class now has abstract methods instead of
>> >> > > >> >> > default implementations.
>> >> > > >> >> >
>> >> > > >> >> > 2011/11/2 Edward J. Yoon <edwardyoon@apache.org>
>> >> > > >> >> >
>> >> > > >> >> >> Hi all,
>> >> > > >> >> >>
>> >> > > >> >> >> As you know, combiners and IO were recently added.
>> >> > > >> >> >>
>> >> > > >> >> >> Please review them from user viewpoint.
>> >> > > >> >> >>
>> >> > > >> >> >>
>> >> > > >> >> >>
>> >> > > >> >> >> http://svn.apache.org/repos/asf/incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/PiEstimator.java
>> >> > > >> >> >>
>> >> > > >> >> >> I'm testing multiple tasks and IO features on a 100-node cluster using
>> >> > > >> >> >> 10 tasks per node. If there's no issue, I'll close HAMA-258.
>> >> > > >> >> >>
>> >> > > >> >> >> Thanks.
>> >> > > >> >> >>
>> >> > > >> >> >> --
>> >> > > >> >> >> Best Regards, Edward J. Yoon
>> >> > > >> >> >> @eddieyoon
>> >> > > >> >> >>
>> >> > > >> >> >
>> >> > > >> >> >
>> >> > > >> >> >
>> >> > > >> >> > --
>> >> > > >> >> > Thomas Jungblut
>> >> > > >> >> > Berlin <thomas.jungblut@gmail.com>
>> >> > > >> >> >
>> >> > > >> >>
>> >> > > >> >>
>> >> > > >> >>
>> >> > > >> >> --
>> >> > > >> >> Best Regards, Edward J. Yoon
>> >> > > >> >> @eddieyoon
>> >> > > >> >>
>> >> > > >> >
>> >> > > >> >
>> >> > > >> >
>> >> > > >> > --
>> >> > > >> > Thomas Jungblut
>> >> > > >> > Berlin <thomas.jungblut@gmail.com>
>> >> > > >> >
>> >> > > >>
>> >> > > >>
>> >> > > >>
>> >> > > >> --
>> >> > > >> Best Regards, Edward J. Yoon
>> >> > > >> @eddieyoon
>> >> > > >>
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > --
>> >> > > > Thomas Jungblut
>> >> > > > Berlin <thomas.jungblut@gmail.com>
>> >> > > >
>> >> > >
>> >> > >
>> >> > >
>> >> > > --
>> >> > > Best Regards, Edward J. Yoon
>> >> > > @eddieyoon
>> >> > >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Thomas Jungblut
>> >> Berlin <thomas.jungblut@gmail.com>
>> >>
>> >
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>
>
>
>
> --
> Thomas Jungblut
> Berlin <thomas.jungblut@gmail.com>
>



-- 
Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
