hadoop-common-user mailing list archives

From Guillaume Polaert <gpola...@cyres.fr>
Subject RE: Text Analysis
Date Thu, 26 Apr 2012 08:02:40 GMT

Yesterday I discovered the RHive project. It uses an R server on each datanode.
Has anybody tried it?


-----Original Message-----
From: Devi Kumarappan [mailto:kpalania@att.net]
Sent: Wednesday, April 25, 2012 22:56
To: common-user@hadoop.apache.org
Subject: Re: Text Analysis

The RHadoop package allows you to do statistical analysis. We were able to build a word cloud
from the text files using the rmr and rhdfs packages.
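
The word-frequency step behind a word cloud can be sketched as a map and reduce pair. This is only an illustration in plain Python, not the rmr or rhdfs API; the function names (map_words, reduce_counts, word_frequencies) are hypothetical, and an in-memory sort stands in for Hadoop's shuffle phase.

```python
import re
from itertools import groupby


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    for word in re.findall(r"[a-z']+", line.lower()):
        yield word, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts for each word.

    Sorting groups identical keys together, which is the job the
    shuffle phase does on a real cluster (assumption: data fits in memory).
    """
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def word_frequencies(lines):
    """Run the map step over every line, then reduce to word -> count."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(pairs))


if __name__ == "__main__":
    sample = ["Hadoop and R", "R and Mahout"]
    for word, count in sorted(word_frequencies(sample).items()):
        print(word, count)
```

The resulting word-to-count table is the input a word-cloud renderer would need; on a cluster the same two functions would be split into separate mapper and reducer tasks.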

Installation details for these packages are available at the following link.



From: Charles Earl <charles.cearl@gmail.com>
To: common-user@hadoop.apache.org
Sent: Wed, April 25, 2012 12:20:36 PM
Subject: Re: Text Analysis

If you've got existing R code, you might want to look at this Quora posting,
http://www.quora.com/How-can-R-and-Hadoop-be-used-together , also by Cloudera, or the RHIPE
R Hadoop package, https://github.com/saptarshiguha/RHIPE/wiki .
Mahout and Lucene/Solr offer some level of text analysis, although I would not call these
complete text analysis packages.
What I've found are specific algorithms as opposed to a complete package: for example LDA
for topic discovery -- Mahout and Yahoo Research
(https://github.com/shravanmn/Yahoo_LDA) have Hadoop based implementations -- in the case
of Yahoo_LDA the data is stored in HDFS, while the computation is essentially MPI based. Whether
an algorithm reads its data from an HDFS store while using an approach other than MapReduce is
another question.

On Apr 25, 2012, at 12:47 PM, Jagat wrote:

> There are APIs which you can use; of course, they are third party.
> -----------
> Sent from Mobile , short and crisp.
> On 25-Apr-2012 8:57 PM, "Robert Evans" <evans@yahoo-inc.com> wrote:
>> Hadoop itself is the core Map/Reduce and HDFS functionality.  The 
>> higher level algorithms like sentiment analysis are often done by others.
>> Cloudera has a video from HadoopWorld 2010 about it
>> And there are likely to be other tools like R that can help you out 
>> with it.  I am not really sure if Mahout offers sentiment analysis or 
>> not, but you might want to look there too http://mahout.apache.org/
>> --Bobby Evans
>> On 4/25/12 7:50 AM, "karanveer.singh@barclays.com" < 
>> karanveer.singh@barclays.com> wrote:
>> Hi,
>> I wanted to know if there are any existing APIs within Hadoop for us 
>> to do some text analysis like sentiment analysis, etc. OR are we to 
>> rely on tools like R, etc. for this.
>> Regards,
>> Karanveer
