mahout-dev mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: A theme to work
Date Wed, 27 Nov 2013 18:17:40 GMT
On Wed, Nov 27, 2013 at 9:09 AM, Oleksandr Olgashko <
alexandrolgash@gmail.com> wrote:

> Could you please formalize reqs for ICA? I mean, what actually should be
> done.
> Parallelization strategy is a bit general concept.
>

No, not really. It is not so general that you couldn't work it out on your
own.

You can think of it as a fairly free-style TDD for how to do things on MR
or Pregel, written so that the majority of reviewers here can understand it.

Not an ideal example, but I hope it helps: look at the attachment to
https://issues.apache.org/jira/browse/MAHOUT-1365

-d


>
> 2013/11/26 Dmitriy Lyubimov <dlieu.7@gmail.com>
>
> > On Tue, Nov 26, 2013 at 1:11 PM, Олександр Ольгашко <
> > alexandrolgash@gmail.com> wrote:
> >
> > > I may need an unknown amount of time to get familiar with the Mahout
> > > project structure.
> > > I'd like to do some research on ICA's parallelization strategy; it is
> > > quite interesting.
> > > I'm not sure I can help with MAHOUT-1346; I have never worked with such
> > > things before.
> > >
> > > Should I use the mailing lists or something else for questions and
> > > other communication?
> > >
> > Yes. There's probably no better place as far as Mahout is concerned.
> >
> > >
> > >
> > > 2013/11/26 Dmitriy Lyubimov <dlieu.7@gmail.com>
> > >
> > > > Dimension reduction is addressed with PCA, which is an option of the
> > > > SSVD method.
> > > > However, if you can research/offer a parallelization strategy for ICA,
> > > > I'd be all ears.
> > > >
> > > > There's also an ongoing push to create a DSL environment for Mahout
> > > > distributed matrices on Spark, which I personally think is one of the
> > > > most promising recent developments. It is also a treasure chest (or a
> > > > can of worms, depending on how you view it) for new people to chime in.
> > > > The DSL environment issue is MAHOUT-1346, with pretty much everything
> > > > else dependent on it.
> > > >
> > > > -d
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Nov 26, 2013 at 9:19 AM, Олександр Ольгашко <
> > > > alexandrolgash@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I am a student interested in data analysis, and I have chosen this
> > > > > theme for my diploma work. As mentioned here
> > > > > https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms, there
> > > > > are some open algorithms, for example, in the Dimension reduction
> > > > > section.
> > > > >
> > > > > So, how can I start developing them? I have some theoretical
> > > > > background, but I think there may be some unknown problems. Maybe
> > > > > somebody is already working on these algorithms. Can you give some
> > > > > tips for getting started?
> > > > >
> > > > > I searched JIRA for Independent Component Analysis and found nothing.
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > >
> > >
> >
>
