mahout-dev mailing list archives

From Dan Filimon <dangeorge.fili...@gmail.com>
Subject Re: Accessing the local filesystem from AbstractJob
Date Wed, 13 Feb 2013 11:12:34 GMT
I see. Well, my use case was wanting to run the job on one machine,
being lazy and not wanting to put the files on HDFS. :)
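
For the single-machine case, the "local first, then HDFS" lookup from the
original question could be done in the driver before the input path is set.
What follows is only a rough sketch using plain Java, not anything Mahout or
AbstractJob provides: the class name InputPathResolver, the resolve() helper
and the hdfs://namenode:8020 base are all hypothetical. It assumes Hadoop
honors the scheme of a fully qualified path (file:// vs. hdfs://), and it only
makes sense when the tasks run on the same machine as the driver, for the
reason Sean gives below.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class InputPathResolver {

    /**
     * Returns a fully qualified URI string for rawPath.
     * If the path exists on the local file system, a file: URI is returned;
     * otherwise the path is appended to hdfsBase (hypothetical, e.g.
     * "hdfs://namenode:8020"). Simplification: a relative rawPath is treated
     * as if it were rooted at "/" on HDFS, and no check is made that the file
     * actually exists there.
     */
    static String resolve(String rawPath, String hdfsBase) {
        Path local = Paths.get(rawPath).toAbsolutePath().normalize();
        if (Files.exists(local)) {
            // Explicit file: scheme, so the path is not resolved against the default FS.
            return local.toUri().toString();
        }
        // Avoid a double slash when joining the base and the path.
        String base = hdfsBase.endsWith("/")
                ? hdfsBase.substring(0, hdfsBase.length() - 1)
                : hdfsBase;
        String rest = rawPath.startsWith("/") ? rawPath : "/" + rawPath;
        return base + rest;
    }

    public static void main(String[] args) {
        String raw = args.length > 0 ? args[0] : "input/points.csv";
        System.out.println(resolve(raw, "hdfs://namenode:8020"));
    }
}
```

The returned string would then be passed as the job's input path. On a real
cluster the file: branch should not be used, since the tasktrackers will not
see the driver's local disk.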

On Tue, Feb 12, 2013 at 8:27 PM, Sean Owen <srowen@gmail.com> wrote:
> Yes because the input path is something processed by the jobtracker and
> later the tasktrackers themselves, which won't be on your machine
> (necessarily).
>
> Mappers can read the local file system but it's not clear what may or may
> not be there. Consider the distributed cache for smallish data.
>
>
> On Tue, Feb 12, 2013 at 7:05 PM, Dan Filimon <dangeorge.filimon@gmail.com> wrote:
>
>> When creating my own job driver, I'm unable to give it any inputs from
>> the local file system. An exception gets thrown when starting the job
>> (and trying to get the splits).
>> Apparently the files have to be on HDFS.
>>
>> Is there any way around this (ideally, I'd like it to first look for
>> the file on the local file system and if no file is found, look at
>> HDFS)?
>>
