mahout-user mailing list archives

From vineet yadav <vineet.yadav.i...@gmail.com>
Subject Re: Hadoop error running Wikipedia exercise
Date Wed, 02 Feb 2011 06:15:01 GMT
Hi Lance,
Hadoop is reading from the local file system, not from HDFS, so please
check your Hadoop configuration. Also, since the error occurs while you are
creating the Wikipedia dataset, make sure you have enough disk space
available on your system; the Wikipedia dataset is huge.
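For example, a quick sanity check could look like the sketch below (it
assumes a standard Hadoop 0.20-style layout under $HADOOP_HOME and the
stock /tmp default; those details are assumptions about your setup, not
something taken from your logs):

  # Verify the default filesystem points at HDFS rather than file:///
  grep -A 1 'fs.default.name' $HADOOP_HOME/conf/core-site.xml

  # Verify MapReduce is not falling back to the LocalJobRunner
  grep -A 1 'mapred.job.tracker' $HADOOP_HOME/conf/mapred-site.xml

  # Check free space where intermediate map output is written
  # (hadoop.tmp.dir defaults to /tmp/hadoop-${user.name})
  df -h /tmp

If fs.default.name is unset or file:///, the job runs against the local
file system, which is exactly what your mapred.LocalJobRunner log lines show.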
Thanks
Vineet Yadav
On Wed, Feb 2, 2011 at 10:21 AM, Lance Norskog <goksron@gmail.com> wrote:

> Running the dataset creator on the full Wikipedia set:
>
>  bin/mahout wikipediaDataSetCreator -i wiki -o ../datasets/wikipediainput -c examples/src/test/resources/country.txt
>
> After some time I got this error and the job quit. It left no output
> files.
>
> Is this a hiccup, a Hadoop error, or something wrong in Mahout?
>
> ----------------------------
>
> 11/02/01 01:44:52 INFO bayes.WikipediaDatasetCreatorMapper: Configure: Input Categories size: 229 Exact Match: false Analyzer: org.apache.mahout.analysis.WikipediaAnalyzer
> 11/02/01 01:44:52 INFO mapred.MapTask: Starting flush of map output
> 11/02/01 01:44:52 INFO mapred.MapTask: Finished spill 0
> 11/02/01 01:44:52 INFO mapred.TaskRunner: Task:attempt_local_0001_m_028511_0 is done. And is in the process of commiting
> 11/02/01 01:44:52 INFO mapred.LocalJobRunner:
> 11/02/01 01:44:52 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_028511_0' done.
> 11/02/01 01:45:18 WARN mapred.LocalJobRunner: job_local_0001
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/file.out in any of the configured local directories
>        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:389)
>        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
>        at org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:50)
>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:193)
> 11/02/01 01:45:19 INFO mapred.JobClient: Job complete: job_local_0001
> 11/02/01 01:45:19 INFO mapred.JobClient: Counters: 8
> 11/02/01 01:45:19 INFO mapred.JobClient:   FileSystemCounters
> 11/02/01 01:45:19 INFO mapred.JobClient:     FILE_BYTES_READ=435709583455348
> 11/02/01 01:45:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=72839164345155
> 11/02/01 01:45:19 INFO mapred.JobClient:   Map-Reduce Framework
> 11/02/01 01:45:19 INFO mapred.JobClient:     Combine output records=0
> 11/02/01 01:45:19 INFO mapred.JobClient:     Map input records=10860674
> 11/02/01 01:45:19 INFO mapred.JobClient:     Spilled Records=1164848
> 11/02/01 01:45:19 INFO mapred.JobClient:     Map output bytes=4282654947
> 11/02/01 01:45:19 INFO mapred.JobClient:     Combine input records=0
> 11/02/01 01:45:19 INFO mapred.JobClient:     Map output records=1164848
> 11/02/01 01:45:19 INFO driver.MahoutDriver: Program took 12692646 ms
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
