mahout-user mailing list archives

From praveenesh kumar <praveen...@gmail.com>
Subject Re: How to run Mahout java code from commandline ?
Date Sat, 24 Sep 2011 09:42:57 GMT
Which Mahout jars are required to run this code, and where can I find them?
I have downloaded the source, but there are no jars in it.
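
A rough sketch of what the classpath advice quoted below amounts to on a Unix shell. The source release contains no prebuilt jars; they are normally produced by building it with Maven (mvn install in the source root), which leaves jars under each module's target/ directory. The function names build_classpath and compile_and_run are made up for this example, and the directories passed to them are placeholders for wherever the Mahout and Hadoop jars actually live on your machine.

```shell
#!/bin/sh
# Sketch only: the directory arguments are placeholders, not paths from the thread.

# Print a classpath made of "." plus every .jar found directly in the
# directories given as arguments.
build_classpath() {
    result="."
    for dir in "$@"; do
        for jar in "$dir"/*.jar; do
            # Skip the unexpanded pattern when a directory holds no jars.
            if [ -f "$jar" ]; then
                result="$result:$jar"
            fi
        done
    done
    printf '%s\n' "$result"
}

# Compile DellFPGrowth.java in the current directory, then run it,
# using the same classpath for both steps.
compile_and_run() {
    classpath=$(build_classpath "$@")
    javac -cp "$classpath" DellFPGrowth.java &&
        java -cp "$classpath" DellFPGrowth
}
```

Usage would look something like compile_and_run "$MAHOUT_HOME/core/target" "$HADOOP_HOME" "$HADOOP_HOME/lib", where MAHOUT_HOME and HADOOP_HOME are assumed variables pointing at your own installs. If javac reports a missing class, the jar that provides it is not in any of the listed directories.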


On Sat, Sep 24, 2011 at 2:35 AM, Paritosh Ranjan <pranjan@xebia.com> wrote:

> Just add the Mahout jars to the classpath when compiling and executing.
> Search for "java jar in classpath" on Google.
>
>
> On 24-09-2011 15:01, praveenesh kumar wrote:
>
>> I mean to say..
>>
>> I have this code ..
>>
>>  import java.io.File;
>>  import java.io.IOException;
>>  import java.nio.charset.Charset;
>>  import java.util.ArrayList;
>>  import java.util.Arrays;
>>  import java.util.Collection;
>>  import java.util.HashSet;
>>  import java.util.Map;
>>  import java.util.Set;
>>  import java.util.List;
>>
>>  import org.apache.hadoop.conf.Configuration;
>>  import org.apache.hadoop.fs.FileSystem;
>>  import org.apache.hadoop.fs.Path;
>>  import org.apache.hadoop.io.SequenceFile;
>>  import org.apache.hadoop.io.Text;
>>  //import org.apache.lucene.util.Attribute;
>>  import org.apache.mahout.common.FileLineIterable;
>>  import org.apache.mahout.common.StringRecordIterator;
>>
>>  import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
>>  import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
>>  import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
>>  import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
>>  import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
>>  //import org.apache.mahout.math.map.OpenLongObjectHashMap;
>>
>>  import org.apache.mahout.common.Pair;
>>
>>  public class DellFPGrowth {
>>
>>     public static void main(String[] args) throws IOException {
>>
>>         Set<String> features = new HashSet<String>();
>>         String input = "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
>>         int minSupport = 1;
>>         int maxHeapSize = 50; // top-k
>>         String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
>>         Charset encoding = Charset.forName("UTF-8");
>>         FPGrowth<String> fp = new FPGrowth<String>();
>>         String output = "/tmp/output.txt";
>>         Path path = new Path(output);
>>         Configuration conf = new Configuration();
>>         FileSystem fs = FileSystem.get(conf);
>>
>>
>>         SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
>>                 Text.class, TopKStringPatterns.class);
>>
>>
>>         fp.generateTopKFrequentPatterns(
>>                 new StringRecordIterator(
>>                     new FileLineIterable(new File(input), encoding, false), pattern),
>>                 fp.generateFList(
>>                     new StringRecordIterator(
>>                         new FileLineIterable(new File(input), encoding, false), pattern),
>>                     minSupport),
>>                 minSupport,
>>                 maxHeapSize,
>>                 features,
>>                 new StringOutputConverter(
>>                     new SequenceFileOutputCollector<Text,TopKStringPatterns>(writer)),
>>                 new ContextStatusUpdater(null));
>>
>>         writer.close();
>>
>>         List<Pair<String,TopKStringPatterns>> frequentPatterns =
>>                 FPGrowth.readFrequentPattern(fs, conf, path);
>>         for (Pair<String,TopKStringPatterns> entry : frequentPatterns) {
>>               System.out.println(entry.getSecond());
>>         }
>>         System.out.print("\nthe end! ");
>>     }
>>
>> }
>>
>>
>> How should I compile and run this from the command line?
>> I don't have Eclipse on my system. How can I run this code?
>>
>> Thanks,
>> Praveenesh
>>
>> On Sat, Sep 24, 2011 at 12:40 PM, Danny Bickson <danny.bickson@gmail.com> wrote:
>>
>>> It is very simple: in the root folder you run (for example, for k-means):
>>> ./bin/mahout kmeans -i ~/usr7/small_netflix_mahout/ \
>>>     -o ~/usr7/small_netflix_mahout_output/ --numClusters 10 \
>>>     -c ~/usr7/small_netflix_mahout/ -x 10
>>>
>>> where ./bin/mahout is used for any Mahout application, and the next keyword
>>> (kmeans in this case) defines the algorithm type.
>>> The rest of the inputs are algorithm-specific.
>>>
>>> If you want to add a new application to the existing ones, you need to edit
>>> the conf/driver.classes.props file and point it to your main class.
>>>
>>> Best,
>>>
>>> - Danny Bickson
>>>
>>> On Sat, Sep 24, 2011 at 9:59 AM, praveenesh kumar <praveenesh@gmail.com> wrote:
>>>
>>>> Hey,
>>>> I have this code written using the Mahout libraries. I am able to run the
>>>> code from Eclipse.
>>>> How can I run code written with Mahout from the command line?
>>>>
>>>> My question is: do I have to make a jar file and run it as
>>>> hadoop jar jarfilename.jar class
>>>> or should I run it using the plain java command?
>>>>
>>>> Can anyone clear up my confusion?
>>>> I am not able to run this code.
>>>>
>>>> Thanks,
>>>> Praveenesh
>>>>
>>>>
>>
>>
>
>
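
On the original question in the thread (hadoop jar versus plain java): both can work for the code quoted above, since it reads its input through java.io.File and only needs the Hadoop and Mahout classes on the classpath. Below is a minimal sketch of the hadoop jar route, assuming DellFPGrowth.class has already been compiled and that the hadoop launcher honours HADOOP_CLASSPATH. The helper names run and package_and_run, the jar name dellfpgrowth.jar, and the Mahout jar path are all invented for the example.

```shell
#!/bin/sh
# Sketch only: jar name and Mahout jar path are placeholders.

# Echo a command, then execute it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Package the compiled class into a jar and launch it through the hadoop
# script, passing the Mahout jar via HADOOP_CLASSPATH.
# $1: path to a Mahout jar that contains the classes the program imports.
package_and_run() {
    mahout_jar=$1
    run jar cf dellfpgrowth.jar DellFPGrowth.class &&
        run env HADOOP_CLASSPATH="$mahout_jar" hadoop jar dellfpgrowth.jar DellFPGrowth
}
```

Setting DRY_RUN=1 before calling package_and_run prints the two commands without executing them, which is a cheap way to check the paths first. The plain java route needs the same jars, only supplied through -cp instead.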
