bigtop-user mailing list archives

From RJ Nowling <rnowl...@gmail.com>
Subject Re: Gearing up for 0.9
Date Thu, 12 Mar 2015 18:31:50 GMT
LDA will be available in Spark 1.3.0, which should be released in a few
days (according to the Spark mailing list).

https://issues.apache.org/jira/browse/SPARK-1405

This looks to be a long list of potential improvements coming along in time:
https://issues.apache.org/jira/browse/SPARK-5572

What sort of hardware do you have available for your work?

On Thu, Mar 12, 2015 at 12:39 PM, David Starina <david.starina@gmail.com>
wrote:

> Thank you for your suggestions; I am also considering Spark. Actually, I
> was hoping to be able to compare the speed of Mahout's (MapReduce)
> and MLlib's (Spark) implementations of the LDA algorithm, but I am not sure
> whether MLlib's implementation is already available in the current
> version. I hope I will at least be able to try one of the implementations.
> Anyway, I don't want to spam your developer mailing list with this :-)
>
> --David
>
> On Thu, Mar 12, 2015 at 6:21 PM, Konstantin Boudnik <cos@apache.org>
> wrote:
>
>> And speaking from my former academic background - it never hurts if your
>> thesis is sexy. And Spark is quite hot at the moment ;)
>>
>> Cos
>>
>> On Thu, Mar 12, 2015 at 01:15PM, jay vyas wrote:
>> > @David I like RJ's idea of considering MLlib, which is guaranteed to be
>> > Bigtop supported! Possibly consider that as an option if you want to
>> > build your thesis on Bigtop.
>> >
>> > On Thu, Mar 12, 2015 at 12:52 PM, Konstantin Boudnik <cos@apache.org> wrote:
>> >
>> > > On Thu, Mar 12, 2015 at 06:04AM, jay vyas wrote:
>> > > > Hi David!
>> > > >
>> > > > We found that Mahout 0.9, IIRC, was released incompatible with YARN at
>> > > > the time, and there wasn't any command-line option that you could run
>> > > > when compiling which fixed that issue. So that really made us realize
>> > > > we needed the community to participate with us.
>> > > >
>> > > > 1) I've reached out to the Mahout community, and maybe they will join
>> > > > forces with us before it is dropped, but for us, we simply have too many
>> > > > other priorities, and nobody from the Mahout community was interested in
>> > > > collaborating with us on package testing in Bigtop... So, much like Fedora,
>> > > > Debian, and so on, once the curators have no interest in packaging it,
>> > > > it becomes hard to keep in the distro.
>> > > >
>> > > > 2) Are you interested in maintaining Mahout packaging in Bigtop? That
>> > > > might be a nice addition to your thesis. It would also give you some
>> > > > interesting insight into the libraries that Mahout uses, and how it uses
>> > > > Hadoop APIs, etc. I'd be able to help you get up to speed with the basics
>> > > > of building Bigtop if you have that interest.
>> > > >
>> > > > 3) RE: W/o Bigtop, you can always build/compile/install Mahout from source
>> > > > or from tarballs if need be. However, this tends to be an annoying thing to
>> > > > maintain and manually make sure it interoperates with your YARN distro,
>> > > > etc.
>> > >
>> > > Not to mention that the same compatibility issue between Hadoop 2.x and Mahout
>> > > will still be there when you build it yourself.
>> > >
>> > > Cos
>> > >
>> > > > On Thu, Mar 12, 2015 at 1:57 AM, David Starina <david.starina@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Hi guys,
>> > > > >
>> > > > > I'm just an observer, a passer-by you might say (for now), of this mailing
>> > > > > list, so I hope you won't mind me commenting on this. I was planning to use
>> > > > > Hadoop with Mahout in my thesis, so this thread kind of freaked me out.
>> > > > > Since you mention that the two pieces of software are incompatible - does
>> > > > > that mean it is not possible to get them to work together, or just that it
>> > > > > requires some extra effort? Also, there are some algorithms that work with
>> > > > > Spark - do you know whether those still work with recent versions of Spark?
>> > > > > Is it a lot of work to manually install Mahout without Bigtop?
>> > > > >
>> > > > > Anyhow, hope the Mahout guys find their focus again.
>> > > > >
>> > > > > Best regards,
>> > > > > David
>> > > > >
>> > > > >
>> > > > > On Thursday, March 12, 2015, jay vyas <jayunit100.apache@gmail.com> wrote:
>> > > > >
>> > > > >> Okay, let's drop it... I'm fine with that.
>> > > > >>
>> > > > >> On Wed, Mar 11, 2015 at 7:49 PM, Konstantin Boudnik <cos@apache.org>
>> > > > >> wrote:
>> > > > >>
>> > > > >>> But the last time, back in 0.8, we found that runtime is pretty broken.
>> > > > >>> So, is there any real reason to keep on pushing an incompatible piece of
>> > > > >>> software?
>> > > > >>>
>> > > > >>> Cos
>> > > > >>>
>> > > > >>> On Tue, Mar 10, 2015 at 09:42AM, jay vyas wrote:
>> > > > >>> >    At this point we can just keep packaging as is, but if bugs crop up, drop
>> > > > >>> >    it unless we can get help
>> > > > >>> >    On Mon, Mar 9, 2015 at 11:49 PM, Konstantin Boudnik <cos@apache.org> wrote:
>> > > > >>> >
>> > > > >>> >      Should read
>> > > > >>> >
>> > > > >>> >      So, anyone is interested to maintain Mahout OR a thing of similar
>> > > > >>> >      nature?
>> > > > >>> >
>> > > > >>> >      Sorry
>> > > > >>> >      On Mon, Mar 09, 2015 at 08:45PM, Konstantin Boudnik wrote:
>> > > > >>> >      > So, anyone is interested to maintain Mahout and a thing of similar
>> > > > >>> >      > nature?
>> > > > >>> >
>> > > > >>> >      >
>> > > > >>> >      > Cos
>> > > > >>> >      >
>> > > > >>> >      > On Sat, Mar 07, 2015 at 02:13AM, Konstantin Boudnik wrote:
>> > > > >>> >      > > I think it eventually boils down to who will be maintaining the
>> > > > >>> >      > > component.
>> > > > >>> >      > >
>> > > > >>> >      > > As Jay said - there's no maintainer for the component, and if it
>> > > > >>> >      > > continues like this we might have no choice but to delete it: I think
>> > > > >>> >      > > right now it blocks the release.
>> > > > >>> >      > >
>> > > > >>> >      > > Cos
>> > > > >>> >      > >
>> > > > >>> >      > > On Fri, Mar 06, 2015 at 02:29PM, Ed - 0x1b wrote:
>> > > > >>> >      > > > some links to some of Mahout's replacements - not all Apache projects.
>> > > > >>> >      > > >
>> > > > >>> >      > > > https://gigaom.com/2014/03/27/apache-mahout-hadoops-original-machine-learning-project-is-moving-on-from-mapreduce/
>> > > > >>> >      > > > http://0xdata.com/
>> > > > >>> >      > > > https://spark.apache.org/mllib/
>> > > > >>> >      > > > https://databricks.com/blog/2014/06/30/sparkling-water-h20-spark.html
>> > > > >>> >      > > > https://github.com/apache/mahout/tree/master/h2o
>> > > > >>> >      > > >
>> > > > >>> >      > > > and
>> > > > >>> >      > > >
>> > > > >>> >      > > > https://gigaom.com/2014/02/28/cloudera-is-rebuilding-machine-learning-for-hadoop-with-oryx/
>> > > > >>> >      > > >
>> > > > >>> >      > > > On Fri, Mar 6, 2015 at 12:47 PM, Konstantin Boudnik <cos@apache.org> wrote:
>> > > > >>> >      > > > > Thanks man! I've heard that there's a new project that picks up where
>> > > > >>> >      > > > > Mahout left off wrt Hadoop 2.x support. But it might be that I am just
>> > > > >>> >      > > > > delusional from hunger...?
>> > > > >>> >      > > > >
>> > > > >>> >      > > > > On Fri, Mar 06, 2015 at 02:32PM, jay vyas wrote:
>> > > > >>> >      > > > >>    I sent an email to mahout-dev... maybe someone will ping back :)
>> > > > >>> >      > > > >>    On Fri, Mar 6, 2015 at 2:25 PM, Jay Vyas <jayunit100.apache@gmail.com>
>> > > > >>> >      > > > >>    wrote:
>> > > > >>> >      > > > >>
>> > > > >>> >      > > > >>      IIRC we don't have any maintainers for it.
>> > > > >>> >      > > > >>      Is anyone interested in maintaining it?
>> > > > >>> >      > > > >>      > On Mar 6, 2015, at 2:23 PM, Konstantin Boudnik <cos@apache.org> wrote:
>> > > > >>> >      > > > >>      >
>> > > > >>> >      > > > >>      > Does anyone know what's the story with Mahout? Has it been fixed to be
>> > > > >>> >      > > > >>      > working with Hadoop2 or shall we remove it from the BOM?
>> > > > >>> >      > > > >>      >
>> > > > >>> >      > > > >>      > Cos
>> > > > >>> >      > > > >>      >
>> > > > >>> >      > > > >>      >> On Sat, Feb 28, 2015 at 06:56PM, Konstantin Boudnik wrote:
>> > > > >>> >      > > > >>      >> Guys,
>> > > > >>> >      > > > >>      >>
>> > > > >>> >      > > > >>      >> It'd be great if we can have the next release ready by ApacheCon in
>> > > > >>> >      > > > >>      >> April. Think about all the PR and publicity we can get without any
>> > > > >>> >      > > > >>      >> effort of our own. And perhaps from the tactical standpoint we shall
>> > > > >>> >      > > > >>      >> call this release 1.0?
>> > > > >>> >      > > > >>      >>
>> > > > >>> >      > > > >>      >> I believe the only major hurdle between us and the release is CI.
>> > > > >>> >      > > > >>      >> Roman, I understand you're busy elsewhere, but could you please let us
>> > > > >>> >      > > > >>      >> know what else needs to be done before we can start doing the regular
>> > > > >>> >      > > > >>      >> builds and how the community can help? That's the highest priority, IMO.
>> > > > >>> >      > > > >>      >>
>> > > > >>> >      > > > >>      >> There are a couple of tickets left unfixed/unassigned on BIGTOP-1480,
>> > > > >>> >      > > > >>      >> and if they aren't resolved on time we can move them further. There are
>> > > > >>> >      > > > >>      >> fewer than a half-dozen blockers and none of them look too big,
>> > > > >>> >      > > > >>      >> honestly. And we have a whole lot of active committers and contributors
>> > > > >>> >      > > > >>      >> to wrap up the release in a couple of weeks.
>> > > > >>> >      > > > >>      >>
>> > > > >>> >      > > > >>      >> Do we want to try to upgrade to HBase 1.x for this release or might it
>> > > > >>> >      > > > >>      >> be too big of a distortion? Andrew, what do you think and do you have
>> > > > >>> >      > > > >>      >> cycles to do that?
>> > > > >>> >      > > > >>      >>
>> > > > >>> >      > > > >>      >> What else do we need to get done for this release? Suggestions?
>> > > > >>> >      > > > >>      >>
>> > > > >>> >      > > > >>      >> Is there anyone who wants to step up as the RM this time around? RM
>> > > > >>> >      > > > >>      >> doesn't mean that you have to do all the job, but rather be an
>> > > > >>> >      > > > >>      >> efficient with a stick ;)
>> > > > >>> >      > > > >>      >>
>> > > > >>> >      > > > >>      >> Thoughts?
>> > > > >>> >      > > > >>      >>   Cos
>> > > > >>> >      > > > >>
>> > > > >>> >      > > > >>    --
>> > > > >>> >      > > > >>    jay vyas
>> > > > >>> >
>> > > > >>> >    --
>> > > > >>> >    jay vyas
>> > > > >>>
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> --
>> > > > >> jay vyas
>> > > > >>
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > jay vyas
>> > >
>> > >
>> >
>> >
>> > --
>> > jay vyas
>>
>>
>
