hadoop-general mailing list archives

From Chandan Tamrakar <chandan.tamra...@nepasoft.com>
Subject Re: Reading files from local file system
Date Wed, 14 Oct 2009 06:17:39 GMT
Jason, I think it worked after changing these two parameters in
hadoop-site.xml:

  1. fs.default.name
     used the local file system as the default: file:///
  2. mapred.job.tracker
     we were previously using an HDFS cluster location; it is removed now

Thanks,
chandan
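The two changes above amount to the following hadoop-site.xml fragment (a sketch, assuming a 0.x-era Hadoop; setting mapred.job.tracker to "local" is equivalent to removing the property, since "local" is the default in-process runner):

```xml
<configuration>
  <!-- Use the local file system instead of HDFS as the default. -->
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
  <!-- "local" runs map/reduce in-process; removing the property has the same effect. -->
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
</configuration>
```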
On Wed, Oct 14, 2009 at 11:49 AM, Jason Venner <jason.hadoop@gmail.com> wrote:

> If you want to open a local file in hadoop you have 3 simple ways
>
> 1: use file:///path
> 2: get a LocalFileSystem object from the FileSystem
> /**
>   * Get the local file system
>   * @param conf the configuration to configure the file system with
>   * @return a LocalFileSystem
>   */
>  public static LocalFileSystem getLocal(Configuration conf)
>    throws IOException {
>    return (LocalFileSystem)get(LocalFileSystem.NAME, conf);
>  }
>
> 3: use the java.io File* classes.
>
> On Tue, Oct 13, 2009 at 9:05 AM, Chandan Tamrakar <
> chandan.tamrakar@nepasoft.com> wrote:
>
> > Do I need to change any configuration besides changing the default file
> > system to the local file system?
> > I am trying, for example, to pass input.txt to the map job
> >
> > input.txt will contain file location as following
> >
> > file://path/abc1.doc
> > file://path/abc2.doc
> > ..
> > ...
> >
> > map program will read each line from input.txt and process them
> >
> > Do I need to change any configuration? This is similar to how Nutch
> > crawls.
> >
> > Any feedback would be appreciated
> >
> > thanks
> >
> >
> >
> > On Tue, Oct 13, 2009 at 6:49 AM, Jeff Zhang <zjffdu@gmail.com> wrote:
> >
> > > Maybe you could debug your MapReduce job in Eclipse, since you run it
> > > in local mode.
> > >
> > >
> > >
> > > On Tue, Oct 13, 2009 at 5:56 AM, Chandan Tamrakar <
> > > chandan.tamrakar@nepasoft.com> wrote:
> > >
> > > >
> > > >
> > > > We are trying to read files from the local file system, but when
> > > > running the map reduce job it is not able to read files from the
> > > > input location (the input location is also a local file system
> > > > location).
> > > >
> > > > For this we changed the configuration in hadoop-site.xml as shown
> > > > below:
> > > >
> > > > /etc/conf/hadoop/hadoop-site.xml
> > > >
> > > > <property>
> > > >    <name>fs.default.name</name>
> > > >    <value>file:///</value>
> > > >  </property>
> > > >
> > > >
> > > > [admin@localhost ~]$ hadoop jar Test.jar /home/admin/input/test.txt output1
> > > >
> > > > Suppose test.txt is a plain text file that contains
> > > > Test1
> > > > Test2
> > > > Test3
> > > >
> > > >
> > > > While running a simple MapReduce job we get the following exception,
> > > > "File not found exception"; we are using TextInputFormat in our job
> > > > configuration:
> > > >
> > > >
> > > > 09/10/13 17:26:35 WARN mapred.JobClient: Use GenericOptionsParser for
> > > > parsing the arguments. Applications should implement Tool for the same.
> > > > 09/10/13 17:26:35 INFO mapred.FileInputFormat: Total input paths to process : 1
> > > > 09/10/13 17:26:35 INFO mapred.FileInputFormat: Total input paths to process : 1
> > > > 09/10/13 17:26:37 INFO mapred.JobClient: Running job: job_200910131447_0033
> > > > 09/10/13 17:26:38 INFO mapred.JobClient:  map 0% reduce 0%
> > > > 09/10/13 17:27:00 INFO mapred.JobClient: Task Id :
> > > > attempt_200910131447_0033_m_000000_0, Status : FAILED
> > > > java.io.FileNotFoundException: File file:/home/admin/Desktop/input/test.txt does not exist.
> > > >        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
> > > >        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:259)
> > > >        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:117)
> > > >        at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:275)
> > > >        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:364)
> > > >        at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:206)
> > > >        at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:50)
> > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
> > > >        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2210)
> > > >
> > > > However, running the code as a separate main method does work well:
> > > >
> > > > public static void main(String[] args) throws IOException {
> > > >
> > > >     Configuration conf = new Configuration();
> > > >     FileSystem fs = FileSystem.get(conf);
> > > >
> > > >     FSDataOutputStream out = fs.create(new Path("abc.txt"));
> > > >     out.writeUTF("abc");
> > > >     out.close();
> > > >
> > > > }
> > > >
> > > > The above code works fine when run as a jar in Hadoop; it
> > > > successfully creates the file /home/admin/abc.txt when run as the
> > > > admin user.
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Chandan Tamrakar
> >
>
>
>
> --
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com a community for Hadoop Professionals
>
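Jason's option 3 (plain JDK I/O, no Hadoop involved) can be sketched as follows. This is a self-contained sketch, not code from the thread: the class name `LocalFileRead` and the temp-file setup are illustrative, and a `file:///` URI is shown resolving to a local path as in his option 1. (His option 2, `FileSystem.getLocal(conf)`, needs the Hadoop classpath, so it is not reproduced here.)

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalFileRead {
    public static void main(String[] args) throws IOException {
        // Self-contained setup: create a small local file to read back.
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "Test1\n".getBytes("UTF-8"));

        // Option 1 from the thread: a file:/// URI names the local file system.
        URI uri = tmp.toUri();
        Path viaUri = Paths.get(uri);

        // Option 3: plain JDK I/O, no Hadoop classes involved.
        String contents = new String(Files.readAllBytes(viaUri), "UTF-8");
        System.out.print(contents);

        Files.delete(tmp);
    }
}
```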



-- 
Chandan Tamrakar
