mahout-user mailing list archives

From Manoj Kumar <manoj1...@gmail.com>
Subject Re: LDA Mahout
Date Tue, 01 Mar 2011 06:51:11 GMT
Hi Jeff Eastman,
Are there any options for removing stopwords while running LDA in Mahout,
or while creating sequence files from the corpus?
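To make the question concrete, here is a minimal sketch of the kind of
preprocessing I mean, in plain Java with no Mahout or Hadoop calls. The class
name, the method name, and the short stopword list are placeholders made up for
illustration; I am not claiming this is how Mahout handles stopwords.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopwordFilter {
    // Placeholder stopword list (an assumption); a real list would be much longer.
    private static final Set<String> STOPWORDS =
        new HashSet<>(Arrays.asList("a", "an", "and", "in", "is", "of", "the", "to"));

    // Lowercases the text, splits on runs of non-letter characters,
    // and keeps only the tokens that are not in the stopword list.
    public static List<String> filter(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !STOPWORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints: topics corpus documents
        System.out.println(String.join(" ", filter("The topics in a corpus of documents")));
    }
}
```

I could run something like this over each document before writing the sequence
files, but I would rather use a built-in option if one exists.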
Kindly reply.

Thanks & Regards,
Manoj Kumar.R.K
Graduate Student, MS Computer Science
University at Buffalo
Buffalo, New York
(413) 461-8938|www.rkmanojkumar.co.nr



On Mon, Feb 28, 2011 at 1:06 PM, Manoj Kumar <manoj1987@gmail.com> wrote:

> Hi Jeff Eastman,
>
> Thanks a lot. I'll look into it and will contact you if I need any help.
>
> Thanks & Regards,
> Manoj Kumar.R.K
> Graduate Student, MS Computer Science
> University at Buffalo
> Buffalo, New York
> (413) 461-8938|www.rkmanojkumar.co.nr
>
>
>
> On Mon, Feb 28, 2011 at 12:48 PM, Jeff Eastman <jeastman@narus.com> wrote:
>
>> Look at examples/bin/build-reuters.sh for some examples. They are all from
>> the command line but illustrate the best way to do what you are attempting.
>> https://cwiki.apache.org/confluence/display/MAHOUT/K-Means+Clustering also has some
>> example code for doing text processing.
>>
>> -----Original Message-----
>> From: Manoj Kumar [mailto:manoj1987@gmail.com]
>> Sent: Monday, February 28, 2011 9:28 AM
>> To: user@mahout.apache.org
>> Subject: Re: LDA Mahout
>>
>> Hi Jeff Eastman,
>> Thanks for your reply. I looked into the LDADriver class, but I am not sure
>> how to convert my text documents to sequence files and then to SparseVectors
>> to use as input to LDADriver. Can you please help me with this conversion?
>> Also, is it enough to just call the run method in the LDADriver class with
>> appropriate inputs to model the topics?
>>
>> Thanks & Regards,
>> Manoj Kumar.R.K
>> Graduate Student, MS Computer Science
>> University at Buffalo
>> Buffalo, New York
>> (413) 461-8938|www.rkmanojkumar.co.nr
>>
>>
>>
>> On Mon, Feb 28, 2011 at 12:23 PM, Jeff Eastman <jeastman@narus.com>
>> wrote:
>>
>> > Have you looked at the Java classes that implement LDA? The private
>> > LDADriver.run() method should be made public, but this can be called from
>> > Java in Eclipse (if that is what you mean by "using Eclipse"). You could
>> > also look at the wiki for information on running LDA
>> > (https://cwiki.apache.org/confluence/display/MAHOUT/Latent+Dirichlet+Allocation).
>> >
>> > -----Original Message-----
>> > From: Manoj Kumar [mailto:manoj1987@gmail.com]
>> > Sent: Monday, February 28, 2011 9:09 AM
>> > To: user@mahout.apache.org
>> > Subject: LDA Mahout
>> >
>> > Hi,
>> >
>> > I am doing a project that requires topic modeling of documents using LDA,
>> > and I am planning to implement it with Mahout LDA. I have not been able to
>> > find any sample code for doing this from Eclipse; only command-line options
>> > were available. Kindly suggest a tutorial or provide some basic code for
>> > implementing LDA. Kindly reply.
>> >
>> > Thanks & Regards,
>> > Manoj Kumar.R.K
>> > Graduate Student, MS Computer Science
>> > University at Buffalo
>> > Buffalo, New York
>> > (413) 461-8938|www.rkmanojkumar.co.nr
>> >
>>
>
>
