mahout-user mailing list archives

From Prakash Poudyal <prakashpoud...@gmail.com>
Subject Re: About reuters-fkmeans-centroids
Date Thu, 28 Apr 2016 21:43:32 GMT
Dear Dmitriy,

I really appreciate that you took the time to write at such length to
clarify my confusion. Thank you so much :)

Regards
Prakash Poudyal

On Thu, Apr 28, 2016 at 10:13 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
wrote:

> Prakash,
>
> (1) To be clear, the ASF trademark and branding policy is not to endorse
> the views of 3rd party publications, and to ask 3rd party writers to
> disclose that their views are not endorsed by the ASF project. To that end,
> an ASF project can't really tell you that some publication is
> "(in)appropriate". 3rd party publications speak for themselves and
> cannot by default be tied to the ASF's views. That said, committers have
> their opinions, which of course vary, and some things do get linked on the
> site or mentioned on Twitter via the Mahout account. But some do not. Best
> practice is always to ask for pointers on the list first.
>
> (2) I am not sure what your definition of "appropriate" is, but on a
> personal note, most of these links were quite "appropriate" at the time in
> the sense that they were published in 2/2014 or earlier, before release
> 0.10, and therefore described what was in the project at that time. Thus,
> the MIA fuzzy k-means example in your very link dates back to June 2011
> and is relevant to releases circa 0.6 or 0.7. So if you mean whether those
> algorithms were "in the fold" back then, the answer is yes, they were. I
> see no contradiction between these publications and the current reality.
>
> (3) If something deprecated reasonably works for a particular purpose, I
> think there's no reason not to use/write about it.
>
> *However, I just don't think most of these particular deprecated
> Java-based MR algorithms work for the purposes of an established benchmark
> or a standard in research -- modern edgy ML is usually much faster
> (and often more convenient, too).*
>
> I don't mean to come across as preachy, but research is usually held to
> quite a different standard when it comes to claims than an ad-hoc
> industrial application or a blog entry. I simply can't see how any of the
> MR stuff can work for that purpose today.
>
> (4) If your "appropriate"-ness question is really about why they were
> deprecated, well, there are two main reasons for that. First, it seems that
> the realization of MR limitations w.r.t. iterative applications quickly
> caught up with both users and contributors, and, second, most contributors
> abandoned their MR contributions (most likely for the same reason). I
> contributed a couple of MR algorithms back in 2010-2011, but I am absolutely
> fine with them being deprecated and written off the books. If something is
> not being used, or people (exactly as your case has demonstrated) don't get
> answers to their questions, or bugs are not being fixed, it is difficult to
> justify keeping the code. It is much easier to focus on what is actually
> being used and maintained instead. That is the very banal and boring reason
> for the deprecations.
>
> (5) Finally, if your goal is simply to learn "how the project works", just
> like Suneel said, I'd suggest following the release notes and the project
> site (news and howtos) -- your last link in fact should perhaps be your
> first. And the list, of course.
>
> As you can probably tell from the release notes, the last two years were
> practically exclusively about multiplatform Mahout involvement with Spark,
> Flink and H2O backends, as well as the Samsara environment for general
> numeric analysis (but no MR stuff beyond very nominal fixes).
>
> I also agree that the Mahout site should perhaps be clearer about the
> status of MR algorithms (it used to be clearer, I think, but all news
> eventually becomes old news).
>
> Hope this clarifies.
>
> -d
>
> On Thu, Apr 28, 2016 at 12:02 PM, Prakash Poudyal <
> prakashpoudyal@gmail.com> wrote:
>
>> Hi!
>>
>> Thank you for your emails!
>>
>> Actually, I need to use fuzzy clustering to cluster sentences in my
>> research. This is my goal.
>>
>> I started using Mahout's fuzzy k-means clustering last week! I found
>> several blog links and many other helpful documents. Being new, I went
>> through them because I thought this was the best, easiest and fastest way
>> to learn how Mahout works. In my opinion, many newcomers do the same. Only
>> after getting used to the tools do people focus on the work itself and go
>> deeper.
>>
>> I have gone through many blogs and sites to learn about Mahout; some of
>> them are listed below:
>>
>> http://technobium.com/introduction-to-clustering-using-apache-mahout/
>>
>> http://tuxdna.github.io/pages/mahout.html
>>
>>
>> https://github.com/tdunning/MiA/blob/master/src/main/java/mia/clustering/ch09/FuzzyKMeansExample.java
>>
>> http://www.programering.com/a/MDNwgTMwATI.html
>>
>>
>> https://www.safaribooksonline.com/library/view/apache-mahout-clustering/9781783284436/ch04.html
>>
>> https://ymnliu.wordpress.com/2015/11/05/install-apache-mahout-in-eclipse/
>>
>> https://mahout.apache.org/
>>
>> What do you say about these sites? Are these sites not appropriate?
>>
>> I raised my problem several times, on the mailing list and even on IRC,
>> but I got a response only today :(
>>
>> So finally, it would be great if you could answer my following
>> questions.
>>
>> Is Apache Mahout an appropriate tool for clustering sentences with
>> fuzzy clustering?
>>
>> If the answer is "YES":
>>
>>     Which version of Mahout?
>>
>>     Can you write out the steps that I need to follow, or point me to
>>     appropriate documentation (links)?
>>
>>
>> Thanks
>> Prakash Poudyal
>> Portugal
>>
>>
>>


-- 

Regards
Prakash Poudyal
