hama-dev mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: Current status and problems of BSP core
Date Mon, 13 May 2013 06:25:50 GMT
About the roadmap: from what I can see, we usually agree on the roadmap; at
the least, when there's no clear statement to the contrary, you can assume
lazy consensus on your proposal if no one objects within 2-3 days.

About branching: as far as I know, we're all happy to work on trunk (my
earlier branch for testing with Apache DirectMemory was created just to
avoid blocking others, as I was unsure how much time I could spend on
it).

About tech / partitioning stuff: I understood that Ed has some ideas in mind
about how to improve partitioning, but Suraj was concerned about the types
of input that could then be supported; I may be wrong, but that was my
understanding. Can we try to share a patch to review and agree on the
specific Jira issue?

My 2 cents,
Regards,
Tommaso




2013/5/13 Edward J. Yoon <edwardyoon@apache.org>

> The blocker is a disagreement among a few PMC members. I have never seen
> a productive discussion about input partitioning whenever we discuss
> input partitioning. VertexInputReader, DiskVerticesInfo, and
> SpillingQueue were always in there. Hence, I still don't know whether
> you understood or not.
>
> To be blunt, you have no opinion on the plans for the 0.6.1 and 0.6.2
> roadmap, you didn't vote on 0.6.1, and furthermore I felt that you want
> to create your own branch. Is this a tacit objection, a
> misunderstanding, or a gesture of defiance?
>
> On Sun, May 12, 2013 at 10:47 PM, Suraj Menon <menonsuraj5@gmail.com>
> wrote:
> > We've had discussions on the same many times.
> >
> > "But please don't block other developments" - I want to understand where
> > the development is blocked especially for partitioning.
> >
> > -Suraj
> >
> >
> > On Sun, May 12, 2013 at 6:54 AM, Edward J. Yoon <edwardyoon@apache.org>
> > wrote:
> >
> >> Hi dev (especially BSP core committers and PMCers),
> >>
> >> First of all, input re-partitioning is a very important and
> >> unavoidable part of Apache Hama. Since there are still people who say
> >> "as if everything can be settled by Spilling Queue with something" or
> >> "It should be also able to solve for the large input without large
> >> cluster", let me explain again.
> >>
> >> Restricting the number of task processors to the number of input block
> >> files means that both of the situations below are problematic:
> >>
> >> Case 1. A user wants to process 1GB of input with 1,000 tasks on a
> >> large cluster.
> >> Case 2. A user wants to process 10GB of input with 3 tasks on a small
> >> cluster.
> >>
> >> I believe this part has higher priority than other issues, such as
> >> VertexInputReader and Spilling Queue. Hence, please don't mix everything
> >> together when we talk about this in the future. To re-partition raw
> >> data and create partitions as desired, we currently have a
> >> PartitioningJobRunner. So, before working on future projects, please
> >> test it with various scenarios, for example, whether it works well with
> >> compressed files, the latest Hadoop (HDFS 2.0), or on a large cluster.
> >>
> >> Second, there is a lack of active discussion on the roadmap, and a
> >> difference of opinion on releases. There's a limit to what we can do.
> >> Moreover, as I mentioned above, there are many high-priority issues. I
> >> don't understand why you need to develop BSP core or create separate
> >> branches without working together on the basic issues.
> >>
> >> Of course, research tasks are fine. If you want to work on them in
> >> your free time, then feel free to do so. But please don't block other
> >> developments.
> >>
> >> I hope you understand my meaning.
> >> Thanks.
> >>
> >> --
> >> Best Regards, Edward J. Yoon
> >> @eddieyoon
> >>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>
