hadoop-common-user mailing list archives

From Jason Venner <jason.had...@gmail.com>
Subject Re: Re: Re: Help in Hadoop
Date Mon, 23 Nov 2009 02:28:20 GMT
Set the number of reduce tasks to 1, so that the job writes a single part-00000 file.
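
For the "simple program" route asked about in the quoted message below, here is a minimal sketch rather than a definitive implementation. It assumes the job output has first been copied out of HDFS to a local directory (for example with bin/hadoop dfs -get), and that each output directory holds part-NNNNN files as in the listing below. It uses only the Java standard library (Java 8 or later), not the Hadoop FileSystem API; the class name MergeParts and its merge method are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MergeParts {

    // Concatenate every regular file named part-* found anywhere under root
    // into one file, in sorted path order. Returns the number of files merged.
    // Assumption: root is a LOCAL copy of the job output, not an HDFS path.
    static int merge(Path root, Path merged) throws IOException {
        List<Path> parts;
        try (Stream<Path> walk = Files.walk(root)) {
            parts = walk.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().startsWith("part-"))
                        .sorted()
                        .collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(merged)) {
            for (Path part : parts) {
                // Append the bytes of this part file to the merged output.
                Files.copy(part, out);
            }
        }
        return parts.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: MergeParts <local-output-dir> <merged-file>");
            return;
        }
        int n = merge(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("merged " + n + " part files");
    }
}
```

Run it as: java MergeParts <local-output-dir> <merged-file>. Files are concatenated in sorted path order, so MatrixA-Row1/part-00000 is written before MatrixA-Row2/part-00000.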

2009/11/22 <aa225@buffalo.edu>

> Hi everybody,
> The 10 different map-reduce jobs store their respective outputs in 10
> different files. This is the snapshot:
>
> hadoop@zeus:~/hadoop-0.19.1$ bin/hadoop dfs -ls output5
> Found 2 items
> drwxr-xr-x   - hadoop supergroup          0 2003-05-16 02:16 /user/hadoop/output5/MatrixA-Row1
> drwxr-xr-x   - hadoop supergroup          0 2003-05-16 02:16 /user/hadoop/output5/MatrixA-Row2
>
> Now when I try to open any of these files I get an error message:
> hadoop@zeus:~/hadoop-0.19.1$ bin/hadoop dfs -cat output5/MatrixA-Row1
> cat: Source must be a file.
> hadoop@zeus:~/hadoop-0.19.1$
>
> But if I run
> hadoop@zeus:~/hadoop-0.19.1$ bin/hadoop dfs -cat output5/MatrixA-Row1/part-00000
>
> I get the correct output. I do not understand why I have to give this extra
> "part-00000". Now when I run a map reduce task to merge the outputs of all
> the files, I give the name of the directory output5 as the input path. But I
> get an error saying
>
> java.io.IOException: Not a file: hdfs://zeus:18004/user/hadoop/output5/MatrixA-Row1
>
> I cannot understand how to make the framework read my files.
>
> Alternatively, I tried to avoid the map reduce approach for combining files
> and do it via a simple program, but I am unable to start. Can someone give me
> a sample implementation or something similar?
>
> Any help is appreciated
>
> Thank You
>
> Abhishek Agrawal
>
> SUNY- Buffalo
> (716-435-7122)
>
> On Sun 11/22/09  5:48 PM , aa225@buffalo.edu sent:
> > Hello,
> > If I write the output of the 10 tasks to 10 different files, then how do
> > I go about merging the output? Is there some built-in functionality, or do
> > I have to write some code for that?
> >
> > Thank You
> >
> > Abhishek Agrawal
> >
> > SUNY- Buffalo
> > (716-435-7122)
> >
> > On Sun 11/22/09  5:40 PM , Gang Luo lgpublic@yahoo.com.cn sent:
> > > Hi. If the output path already exists, it seems you cannot execute any
> > > task with the same output path. I think you can output the results of the
> > > 10 tasks to 10 different paths, and then do something more (in an 11th
> > > task, for example) to merge the 10 results into 1 file.
> > >
> > > Gang Luo
> > > ---------
> > > Department of Computer Science
> > > Duke University
> > > (919)316-0993
> > > gang.luo@duke.edu
> > >
> > >
> > > ----- Original Message ----
> > > From: "aa225@buffalo.edu" <aa225@buffalo.edu>
> > > To: common-user@hadoop.apache.org
> > > Sent: 2009/11/22 (Sun) 5:25:55 PM
> > > Subject: Help in Hadoop
> > >
> > > Hello Everybody,
> > > I have a question about a map reduce program and I would appreciate any
> > > help. I run the program using the command
> > > bin/hadoop jar HomeWork.jar prg1 input output. Ideally, from within prg1,
> > > I want to sequentially launch 10 map-reduce tasks. I want to store the
> > > output of all these map reduce tasks in some file. Currently I have kept
> > > the input format and output format of the jobs as TextInputFormat and
> > > TextOutputFormat respectively. Now I have the following questions.
> > >
> > > 1. When I run more than 1 task from the same program, the output file of
> > > all the tasks is the same. The framework does not allow the 2nd map
> > > reduce task to have the same output file as task 1.
> > >
> > > 2. Before the 2nd task launches I also get this error:
> > >
> > > Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> > >
> > > 3. When the 2nd map reduce task writes its output to the file "output",
> > > won't the previous content of this file get overwritten?
> > >
> > > Thank You
> > >
> > > Abhishek Agrawal
> > >
> > > SUNY- Buffalo
> > > (716-435-7122)
> > >


-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
