hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Current status and problems of BSP core
Date Mon, 13 May 2013 08:23:21 GMT
I have nothing against creating new branches, but I'm wary because I
have already seen some emotional comments like "Don't ruin the current
trunk. Someone's working on there". If we agree on the roadmap
described in the Wiki, I hope you can focus more on discussing and
reviewing its items than on working on your own parts. If we decide to
use netty or something similar instead of Hadoop RPC (HAMA-742 in the
0.6.3 roadmap), it's not a big task, but I guess many core classes
will change a little.

I tested 0.6.1 many times on my cluster, but bugs were still reported
by MRQL developers or users. I think this is a good sign. We might be
able to create a continuous cycle of feedback and improvement in the
near future.


On Mon, May 13, 2013 at 3:25 PM, Tommaso Teofili
<tommaso.teofili@gmail.com> wrote:
> about the roadmap: from what I can see we usually agree on the roadmap, or
> at least when there's no clear statement to the contrary you can assume
> lazy consensus on your proposal if no one says anything else in 2-3 days.
>
> about branching: as far as I know we're all happy to work on trunk (my
> former branching for testing with Apache DirectMemory was created just to
> avoid blocking others as I was unsure about how much time I could spend on
> it).
>
> about tech / partitioning stuff: I understood Ed has some ideas in mind
> about how to improve partitioning but Suraj was concerned about the types
> of input that could be supported then; I may be wrong but that was my
> understanding. Can we try to share a patch to review and agree on in the
> specific Jira issue?
>
> My 2 cents,
> Regards,
> Tommaso
>
>
>
>
> 2013/5/13 Edward J. Yoon <edwardyoon@apache.org>
>
>> The blocker is a disagreement among a small number of PMC members. I
>> have never seen a productive discussion about input partitioning
>> itself: whenever we discussed it, VertexInputReader, DiskVerticesInfo,
>> and SpillingQueue were always mixed in. Hence, I still don't know
>> whether you understood or not.
>>
>> To be blunt, you have offered no opinion on the plans in the 0.6.1 and
>> 0.6.2 roadmaps, you didn't vote on 0.6.1, and furthermore I felt that
>> you want to create your own branch. Is this a tacit objection, a
>> misunderstanding, or a gesture of defiance?
>>
>> On Sun, May 12, 2013 at 10:47 PM, Suraj Menon <menonsuraj5@gmail.com>
>> wrote:
>> > We've had discussions on the same many times.
>> >
>> > "But please don't block other developments" - I want to understand where
>> > the development is blocked especially for partitioning.
>> >
>> > -Suraj
>> >
>> >
>> > On Sun, May 12, 2013 at 6:54 AM, Edward J. Yoon <edwardyoon@apache.org>
>> > wrote:
>> >
>> >> Hi dev (especially BSP core committers and PMCers),
>> >>
>> >> First of all, input re-partitioning is a very important and
>> >> unavoidable part of Apache Hama. Since there are still people who say
>> >> "as if everything can be settled by Spilling Queue with something" or
>> >> "It should be also able to solve for the large input without large
>> >> cluster", let me explain again.
>> >>
>> >> Restricting the number of task processors to the number of block files
>> >> of the input means that both of the situations below are problematic:
>> >>
>> >> Case 1. A user wants to process 1GB input with 1,000 tasks on a large
>> >> cluster.
>> >> Case 2. A user wants to process 10GB input with 3 tasks on a small
>> >> cluster.
>> >>
>> >> I believe this part has higher priority than other issues, such as
>> >> VertexInputReader and Spilling Queue. Hence, please don't mix
>> >> everything in here when we talk about this in the future. To
>> >> re-partition raw data and create partitions as desired, we currently
>> >> have a PartitioningJobRunner. So, before working on future projects,
>> >> please test it with various scenarios, for example, whether it works
>> >> well with compressed files, the latest Hadoop (HDFS 2.0), or on a
>> >> large cluster.
>> >>
>> >> Second, there is a lack of active discussion on the roadmap, and a
>> >> difference of opinion on releases. There's a limit as to what we can
>> >> do. Moreover, as I mentioned above, there are many high-priority
>> >> issues. I don't understand why you need to develop BSP core or create
>> >> separate branches without working together on the basic issues.
>> >>
>> >> Of course, research tasks are fine. If you want to work on them in
>> >> your free time, then feel free to do so. But please don't block other
>> >> developments.
>> >>
>> >> I hope you understand my meaning.
>> >> Thanks.
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon
>> >> @eddieyoon
>> >>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
