hadoop-common-user mailing list archives

From aa...@buffalo.edu
Subject Re: Re: Re: Help in Hadoop
Date Mon, 23 Nov 2009 01:41:55 GMT
Hi everybody,
The 10 different map-reduce jobs store their respective outputs in 10
different files. This is a snapshot:

hadoop@zeus:~/hadoop-0.19.1$ bin/hadoop dfs -ls output5
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2003-05-16 02:16 /user/hadoop/output5/MatrixA-Row1
drwxr-xr-x   - hadoop supergroup          0 2003-05-16 02:16 /user/hadoop/output5/MatrixA-Row2

Now when I try to open any of these files I get an error message:
hadoop@zeus:~/hadoop-0.19.1$ bin/hadoop dfs -cat output5/MatrixA-Row1
cat: Source must be a file.
hadoop@zeus:~/hadoop-0.19.1$

But if I run 
hadoop@zeus:~/hadoop-0.19.1$ bin/hadoop dfs -cat output5/MatrixA-Row1/part-00000

I get the correct output. I do not understand why I have to give this extra
"part-00000". Now when I run a map reduce task to merge the outputs of all the
files, I give the name of the directory output5 as the input path. But I get an
error saying

java.io.IOException: Not a file: hdfs://zeus:18004/user/hadoop/output5/MatrixA-Row1

I cannot understand how to make the framework read my files.

Alternatively, I tried to avoid the map reduce approach for combining the files and
do it via a simple program instead, but I am unable to get started. Can someone give
me a sample implementation or something similar?
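
For reference, each reducer writes its own part-NNNNN file inside the job's output directory, which is why output5/MatrixA-Row1 is a directory and the data lives in part-00000 beneath it. A simple combining program could look like the minimal sketch below. It is only a sketch under stated assumptions: it uses plain java.nio rather than the Hadoop API, it assumes output5 has first been copied to local disk (for example with bin/hadoop dfs -get), and the class name MergeParts and its merge method are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: merges local copies of reducer output files.
public class MergeParts {

    // Concatenate every file named part-* found anywhere under root
    // into target, visiting the files in sorted path order.
    public static void merge(Path root, Path target) throws IOException {
        List<Path> parts;
        try (Stream<Path> walk = Files.walk(root)) {
            parts = walk.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().startsWith("part-"))
                        .sorted()
                        .collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                // Append the raw bytes of this part file to the merged output.
                Files.copy(part, out);
            }
        }
    }

    // Assumed usage: java MergeParts <local copy of output5> <merged file>
    public static void main(String[] args) throws IOException {
        merge(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Run as, say, java MergeParts output5 merged.txt. Matching on the part- prefix skips anything else that may sit in the output directories. The dfs shell also offers a -getmerge option that concatenates the files directly under a single directory, which may be enough when the directories are merged one at a time.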

Any help is appreciated

Thank You

Abhishek Agrawal

SUNY- Buffalo
(716-435-7122)

On Sun 11/22/09  5:48 PM , aa225@buffalo.edu sent:
> Hello,
> If I write the output of the 10 tasks in 10 different files then how do
> I go about merging the output? Is there some built-in functionality or do I
> have to write some code for that?
> 
> Thank You
> 
> Abhishek Agrawal
> 
> SUNY- Buffalo
> (716-435-7122)
> 
> On Sun 11/22/09  5:40 PM , Gang Luo lgpublic@yahoo.com.cn sent:
> > Hi. If the output path already exists, it seems you could not execute any
> > task with the same output path. I think you can output the results of the
> > 10 tasks to 10 different paths, and then do something more (by the 11th task,
> > for example) to merge the 10 results into 1 file.
> > 
> > Gang Luo
> > ---------
> > Department of Computer Science
> > Duke University
> > (919)316-0993
> > gang.luo@duke.edu
> > 
> > 
> > ----- Original Message -----
> > From: "aa225@buffalo.edu" <aa225@buffalo.edu>
> > To: common-user@hadoop.apache.org
> > Sent: 2009/11/22 (Sun) 5:25:55 PM
> > Subject: Help in Hadoop
> >
> > Hello Everybody,
> > I have a doubt in a map reduce program and I would appreciate any help.
> > I run the program using the command bin/hadoop jar HomeWork.jar prg1
> > input output. Ideally, from within prg1, I want to sequentially launch 10
> > map-reduce tasks. I want to store the output of all these map reduce tasks
> > in some file. Currently I have kept the input format and output format of
> > the jobs as TextInputFormat and TextOutputFormat respectively. Now I have
> > the following questions.
> > 
> > 1. When I run more than 1 task from the same program, the output file of
> > all the tasks is the same. The framework does not allow the 2nd map reduce
> > task to have the same output file as task 1.
> > 
> > 2. Before the 2nd task launches I also get this error:
> >
> > Cannot initialize JVM Metrics with processName=JobTracker, sessionId= -
> > already initialized
> > 
> > 3. When the 2nd map reduce task writes its output to the file "output",
> > won't the previous content of this file get overwritten?
> >
> > Thank You
> > 
> > Abhishek Agrawal
> > 
> > SUNY- Buffalo
> > (716-435-7122)
> > 
> > 

