mahout-dev mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Release tomorrow?
Date Fri, 15 Oct 2010 03:53:46 GMT
There is often a small delay before files appear in HDFS after they are
created.  This has buggered many a work-flow.
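
One common workaround for this kind of visibility delay is to poll until the
expected file exists and is non-empty before opening it. The sketch below is
only an illustration, not code from Mahout or Hadoop: the class and method
names (WaitForFile, waitForNonEmptyFile) are hypothetical, and java.nio.file
stands in for the Hadoop FileSystem API so that the example is self-contained.
Against HDFS the analogous checks would presumably be FileSystem.exists(path)
and FileSystem.getFileStatus(path).getLen(). Whether such a delay is actually
the cause of the EOFException quoted below is not established in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: waits for a file to become visible and non-empty.
public final class WaitForFile {

    private WaitForFile() {}

    /**
     * Polls until the file at path exists and has a size greater than zero,
     * or until timeoutMillis has elapsed.
     *
     * @return true if the file became visible and non-empty in time, false otherwise
     */
    public static boolean waitForNonEmptyFile(Path path, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            // Check existence first; an empty file is treated as "not ready yet".
            if (Files.exists(path) && Files.size(path) > 0) {
                return true;
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

A caller would invoke waitForNonEmptyFile on the part file before constructing
the reader, and fail with a clear message if it returns false rather than
letting the read hit an EOFException.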

On Thu, Oct 14, 2010 at 8:40 PM, Jeff Eastman <jdog@windwardsolutions.com> wrote:

>  On 10/14/10 7:47 PM, Jeff Eastman wrote:
>
>>  The recent commit to the POM fixed my build problem on my clean RedHat
>> box. Currently, build-reuters.sh is failing to run the k-means step on
>> Hadoop on that box and it looks like it is the same problem we've been
>> seeing with others running the Cloudera CDH3: hadoop is running under a
>> different user and the local file references don't resolve correctly when
>> the job is run under mine. I haven't yet figured out the best way to fix
>> this or why the other build-reuters job steps don't have this problem (they
>> all use ./examples... file paths too).
>>
> It looks like the RandomSeedGenerator.buildRandom() is somehow seeing an
> empty input directory when it really has an 11.6 MB part file in it. The
> EOFException occurs when executing: SequenceFile.Reader reader = new
> SequenceFile.Reader(fs, fileStatus.getPath(), conf); on line 84. There are
> hdfs and mapred PIDs associated with the hadoop daemons, but why would that
> matter? The files in hdfs are all under /users/dev/examples... and my jobs
> are running as dev so I don't get why this is happening.
>
