mahout-user mailing list archives

From Mike Roberts <supreme.dev....@gmail.com>
Subject Re: Error Running Frequent Itemset Mining Example
Date Wed, 19 May 2010 02:11:45 GMT
Ah, nice!  That's new, er, very recently updated.  Cool.  Thanks.

On Tue, May 18, 2010 at 6:50 PM, Jeff Eastman <jdog@windwardsolutions.com> wrote:

> I'm running on a Cloudera Ubuntu based AMI that I subsequently configured
> as in https://cwiki.apache.org/confluence/display/MAHOUT/MahoutEC2
>
> Jeff
>
>
>
> On 5/18/10 6:37 PM, Mike Roberts wrote:
>
>> Nuts, and I was just about to finish my
>>
>> *"A Complete Newb’s Guide to (Installing on EC2) and Actually Running
>> Mahout from the Command Line"* wiki post.
>>
>> Now, I'll have to see where I went wrong.  Which distro are you running?
>> I started with an Alestic Ubuntu 10.4 AMI (ami-cb97c68e).
>>
>> On Tue, May 18, 2010 at 5:34 PM, Jeff Eastman <jdog@windwardsolutions.com> wrote:
>>
>>
>>
>>> I also brought up a single instance at
>>> http://ec2-184-73-30-93.compute-1.amazonaws.com:50030/jobtracker.jsp and
>>> that ran fine too. It looks to me like the problem, whatever it is, is in
>>> your AMI or its configuration.
>>>
>>> Jeff
>>>
>>>
>>>
>>> On 5/18/10 5:15 PM, Jeff Eastman wrote:
>>>
>>>
>>>
>>>> Well, I just brought up a 2 node cluster at
>>>> http://ec2-174-129-148-227.compute-1.amazonaws.com:50030/jobtracker.jsp and
>>>> it ran fine.
>>>>
>>>>
>>>>
>>>> On 5/18/10 4:56 PM, Mike Roberts wrote:
>>>>
>>>>
>>>>
>>>>> Single instance.  Thx.
>>>>>
>>>>> On Tue, May 18, 2010 at 4:49 PM, Jeff Eastman <jdog@windwardsolutions.com> wrote:
>>>>>
>>>>>> Hi Mike,
>>>>>>
>>>>>> Shouldn't happen. You running this on a single instance or on a hadoop
>>>>>> cluster? I will see if I can duplicate.
>>>>>>
>>>>>> Jeff
>>>>>>
>>>>>>
>>>>>> On 5/18/10 4:27 PM, Mike Roberts wrote:
>>>>>>
>>>>>>> Hey Guys,
>>>>>>>
>>>>>>> Just trying to get the example mentioned here working:
>>>>>>> https://cwiki.apache.org/MAHOUT/parallelfrequentpatternmining.html.
>>>>>>>
>>>>>>> I downloaded the accidents.dat file and placed it in
>>>>>>> /home/ubuntu/mahout-in/fpm-input.
>>>>>>> I created a directory for the output as
>>>>>>> /home/ubuntu/mahout-in/fpm-out.
>>>>>>> Then, I ran the following command:
>>>>>>> ./bin/mahout fpg --input /home/ubuntu/mahout-in/fpm-input --output
>>>>>>> /home/ubuntu/mahout-in/fpm-out --method mapreduce
>>>>>>>
>>>>>>> It runs for a bit, and after the first step I get the following error:
>>>>>>>
>>>>>>> java.io.IOException: java.lang.ClassNotFoundException: org.apache.mahout.common.Pair
>>>>>>>         at org.apache.hadoop.io.serializer.JavaSerialization$JavaSerializationDeserializer.deserialize(JavaSerialization.java:55)
>>>>>>>         at org.apache.hadoop.io.serializer.JavaSerialization$JavaSerializationDeserializer.deserialize(JavaSerialization.java:36)
>>>>>>>         at org.apache.hadoop.io.DefaultStringifier.fromString(DefaultStringifier.java:75)
>>>>>>>         at org.apache.mahout.fpm.pfpgrowth.PFPGrowth.deserializeList(PFPGrowth.java:84)
>>>>>>>         at org.apache.mahout.fpm.pfpgrowth.TransactionSortingMapper.setup(TransactionSortingMapper.java:77)
>>>>>>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>>>>>>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>>>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The step that it was running:
>>>>>>> 10/05/18 23:10:18 INFO pfpgrowth.PFPGrowth: No of Features: 30
>>>>>>> 10/05/18 23:10:18 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>>>>>> 10/05/18 23:10:18 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>> 10/05/18 23:10:19 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>>> 10/05/18 23:10:19 INFO mapred.JobClient: Running job: job_local_0002
>>>>>>> 10/05/18 23:10:19 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>>> 10/05/18 23:10:19 INFO mapred.MapTask: io.sort.mb = 100
>>>>>>> 10/05/18 23:10:19 INFO mapred.MapTask: data buffer = 79691776/99614720
>>>>>>> 10/05/18 23:10:19 INFO mapred.MapTask: record buffer = 262144/327680
>>>>>>> 10/05/18 23:10:19 WARN mapred.LocalJobRunner: job_local_0002
>>>>>>>
>>>>>>> Anyone know what's going on here, or have a solution?  I verified that
>>>>>>> the source file (Pair.java) exists in
>>>>>>> /trunk/core/src/main/java/org/apache/mahout/common.  I did an mvn install
>>>>>>> in core just to be sure.  I'm running Hadoop 20.2 on Ubuntu 10.4 on EC2.
>>>>>>> BTW, if it's not obvious, I'm a total Mahout n00b.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Mike
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
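
The thread only narrows the ClassNotFoundException for org.apache.mahout.common.Pair down to the AMI or its configuration and never names a root cause. One generic first check, offered here as a minimal sketch and not as anything the participants ran, is whether the class is visible on a given classpath at all, and from which jar or directory it would be loaded. The names ClassLocator and locate are hypothetical. The lookup uses the application class loader, which may differ from the loader Hadoop's Java serialization uses inside a task, so a positive result here does not by itself rule out a classpath problem in the job.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Mahout or Hadoop): reports where a class
// would be loaded from on the classpath this JVM was started with.
public class ClassLocator {

    // Returns the jar/directory location of the class, a placeholder string
    // for classes without a code source (e.g. JDK core classes), or null if
    // the class cannot be found.
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> c = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(no code source, likely a JDK class)" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace from the original report.
        String name = args.length > 0 ? args[0] : "org.apache.mahout.common.Pair";
        String where = locate(name);
        if (where == null) {
            System.out.println(name + " NOT found on this classpath");
        } else {
            System.out.println(name + " found in " + where);
        }
    }
}
```

To use it, compile the class and run it with the same classpath the failing job sees, passing the class name as the only argument. If it reports the class as not found, the built Mahout jar is probably missing from that classpath.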
