mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: Error while trying to use mahout/examples/bin/build-reuters.sh
Date Mon, 10 May 2010 20:28:18 GMT
Hi Florent,

I successfully ran the new build-reuters.sh before I committed it this 
morning, so I suspect you must have some other problem in your system. 
Have you tried deleting your Maven repository (.m2) and doing a full mvn 
clean install?
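
A minimal sketch of that reset, assuming the local repository sits at the default ~/.m2/repository and that mvn is on the PATH. The helper name reset_maven_repo is hypothetical, and it moves the repository aside rather than deleting it, so the old state can be restored if needed:

```shell
#!/bin/sh
# Move a Maven local repository aside so the next build re-downloads
# every dependency. Hypothetical helper, not part of Mahout or Maven.
# Assumption: the repository is at ~/.m2/repository unless a path is given.
reset_maven_repo() {
    repo="${1:-$HOME/.m2/repository}"
    if [ -d "$repo" ]; then
        # Suffix the backup with the shell's PID to avoid clobbering an older backup.
        backup="$repo.bak.$$"
        mv "$repo" "$backup" && echo "moved $repo to $backup"
    else
        echo "no repository at $repo, nothing to do"
    fi
}

# Usage, run from the top of the Mahout checkout:
#   reset_maven_repo
#   mvn clean install
```

Once the rebuild succeeds, the backup directory can be removed by hand.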

Jeff

On 5/10/10 12:50 PM, Florent Empis wrote:
> Hi,
>
> I've seen the commit from Robin this afternoon so I gave it another try.
> Using the new shell script I still run into a few problems.
> First, to satisfy a dependency on slf4j I had to add the
> following to examples/pom.xml (once again, I'm not a Maven expert, so this
> may not be the correct way to do it):
>
> <dependency>
>    <groupId>org.slf4j</groupId>
>    <artifactId>slf4j-nop</artifactId>
>    <version>1.5.8</version>
>    <classifier>sources</classifier>
> </dependency>
>
> Then, after a successful mvn -B,
> I launched the script:
> florent@florent-laptop:~/workspace/mahout$ ./examples/bin/build-reuters.sh
>
> It fails with the following error:
> 10/05/10 21:28:06 WARN mapred.LocalJobRunner: job_local_0001
> java.io.IOException: The temporary job-output directory file:/tokenized-documents/_temporary doesn't exist!
> at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:204)
> at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:234)
> at org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:48)
> at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:662)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 10/05/10 21:28:07 INFO mapred.JobClient:  map 0% reduce 0%
> 10/05/10 21:28:07 INFO mapred.JobClient: Job complete: job_local_0001
> 10/05/10 21:28:07 INFO mapred.JobClient: Counters: 0
> 10/05/10 21:28:07 ERROR driver.MahoutDriver: MahoutDriver failed with args: [-i, ./examples/bin/work/reuters-out-seqdir/, -o, ./examples/bin/work/reuters-out-seqdir-sparse, null]
> Job failed!
> Exception in thread "main" java.io.IOException: Job failed!
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
> at org.apache.mahout.utils.vectors.text.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:97)
> at org.apache.mahout.text.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:215)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)
>
> A find makes me think that the issue is
> in /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java
> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java:
>   public static final String TOKENIZED_DOCUMENT_OUTPUT_FOLDER = "/tokenized-documents";
>
> I tried changing this value, but it did not solve my problem, even though I
> ran mvn -B on utils afterwards. It looks like the mahout-utils used by the
> test comes from somewhere else; I guess there's something I'm missing.
>
> 2010/5/10 Jeff Eastman <jdog@windwardsolutions.com>
>
>> I will commit once I verify it completes.  It's running now...
>> Jeff
>>
>> On 5/10/10 7:50 AM, Robin Anil wrote:
>>
>>> +1. Should be using bin/mahout script for all these.
>>>
>>> Robin
>>>
>>> On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman <jdog@windwardsolutions.com> wrote:
>>>
>>>> Well, thanks for the info. Perhaps we should replace the script then.
>>>> Leaving time bombs around like this is not good.
>>>> Jeff
>>>>
>>>> On 5/10/10 7:32 AM, Robin Anil wrote:
>>>>
>>>>> That's been broken for a long time. It was used by David while he
>>>>> developed LDA, and it didn't get updated to work post-0.2. Use Sisir's
>>>>> script to convert Reuters to vectors; it's up on the wiki.
>>>>>
>>>>> Robin

