mahout-dev mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: stochastic nature
Date Tue, 03 May 2016 00:59:58 GMT
also, mahout does have an optimizer that simply decides on the degree of
parallelism of the _product_. I.e., if it computes C=A'B then it figures
that the final result should be split N ways. but it doesn't apply a custom
partition function -- it just uses the usual hash partitioner to forward
the keys; i don't think we ever override that.
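
A minimal sketch of what "split N ways with the usual hash partitioner" amounts to, in Python rather than Mahout's Scala. It assumes integer row keys whose hash code is the key itself; `hash_partition` and `partition_rows` are hypothetical names for illustration, not Mahout or Spark API.

```python
def hash_partition(key: int, num_partitions: int) -> int:
    """Plain hash partitioning: hash the key, then take a non-negative modulo."""
    # Assumption of this sketch: an integer key is its own hash code.
    # Python's % already yields a non-negative result for a positive modulus.
    return key % num_partitions


def partition_rows(row_keys, num_partitions):
    """Group the row keys of a product C by the partition they are forwarded to."""
    parts = {p: [] for p in range(num_partitions)}
    for key in row_keys:
        parts[hash_partition(key, num_partitions)].append(key)
    return parts


# Hypothetical example: C = A'B has 10 rows and the optimizer chose N = 4.
parts = partition_rows(range(10), 4)
sizes = {p: len(keys) for p, keys in parts.items()}
```

Nothing here looks at the data itself: the only decision is N, and keys land wherever their hash sends them, so the split is only as even as the keys are.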

On Mon, May 2, 2016 at 9:39 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> by probabilistic algorithms i mostly mean inference involving monte carlo
> type mechanisms (Gibbs sampling LDA, which i think might still be part of
> our MR collection, might be an example, as well as its faster counterpart,
> variational Bayes inference).
>
> the parallelization strategies are just standard spark mechanisms (in
> case of spark), mostly using their standard hash samplers (which in
> math speak are really uniform multinomial samplers).
>
> On Mon, May 2, 2016 at 9:25 AM, Khurrum Nasim <khurrum.nasim@useitc.com>
> wrote:
>
>> Hey Dimitri -
>>
>> Yes, I meant probabilistic algorithms. If mahout doesn’t use
>> probabilistic algos then how does it accomplish a degree of optimal
>> parallelization? Wouldn’t you need randomization to spread out the
>> processing of tasks?
>>
>> > On May 2, 2016, at 12:13 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>> >
>> > yes, mahout has stochastic svd and pca, which are described at length in
>> > the samsara book. The book examples in Andrew Palumbo's github also
>> > contain an example of computing a k-means|| sketch.
>> >
>> > if you mean _probabilistic_ algorithms, although i have done some things
>> > outside the public domain, nothing has been contributed.
>> >
>> > You are very welcome to try something if you don't have big constraints
>> > on oss contribution.
>> >
>> > -d
>> >
>> > On Mon, May 2, 2016 at 7:49 AM, Khurrum Nasim <khurrum.nasim@useitc.com>
>> > wrote:
>> >
>> >> Hey All,
>> >>
>> >> I’d like to know if Mahout uses any randomized algorithms. I’m
>> >> thinking it probably does. Can somebody point me to the packages that
>> >> use randomized algos?
>> >>
>> >> Thanks,
>> >>
>> >> Khurrum
>> >>
>> >>
>>
>>
>
