hadoop-mapreduce-user mailing list archives

From Geoffry Roberts <geoffry.robe...@gmail.com>
Subject Re: Three Questions
Date Tue, 03 May 2011 21:49:02 GMT
David,

Thanks for the response.

Last thing first:

I am using org.apache.hadoop.mapreduce.lib.output.MultipleOutputs

which differs from what your link points to,
org.apache.hadoop.mapred.lib.MultipleOutputs. Using the class you propose
requires me to use a number of other classes from the same package.  These
used to be deprecated, but apparently are not any more.

Question: Does my package even work?  Must I use the other?

So far as the logging goes, I didn't quite follow your response.  You say
"You can find an individual map or reduce task's logs here:"  but there is
no link.

I am familiar with the drill-down that starts by clicking "NameNode Logs"
and then following userlogs/job*/attempt*r*/stdout.  Are you recommending
something different?

btw,

In my Reduce class, I have a System.out statement in the setup() method that
works (i.e., I get output), but similar statements in the reduce() method
yield nada.
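
One thing that might explain this (an assumption, not something confirmed
anywhere in this thread): in the new org.apache.hadoop.mapreduce API the
framework calls reduce(key, Iterable<VALUEIN>, Context), whereas the old
org.apache.hadoop.mapred API passes an Iterator.  A reduce() declared with an
Iterator would only overload the method, so the inherited default would run
and my statements would never execute.  The sketch below uses plain Java
stand-ins to show the effect; BaseReducer, OverloadedReducer and
OverridingReducer are hypothetical names, not Hadoop classes.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in for a framework base class. NOT Hadoop's Reducer, only the same shape.
class BaseReducer {
    final StringBuilder log = new StringBuilder();

    // The "framework" only ever calls this signature.
    void reduce(String key, Iterable<String> values) {
        log.append("default:").append(key).append(';');
    }

    // Stand-in for the framework's driver loop.
    void run(String key, List<String> values) {
        reduce(key, values);
    }
}

// Takes Iterator instead of Iterable: this is an overload, so run() never reaches it.
class OverloadedReducer extends BaseReducer {
    void reduce(String key, Iterator<String> values) {
        log.append("custom:").append(key).append(';');
    }
}

// Matching signature plus @Override: the compiler verifies that this is an override.
class OverridingReducer extends BaseReducer {
    @Override
    void reduce(String key, Iterable<String> values) {
        log.append("custom:").append(key).append(';');
    }
}

class ReduceOverrideDemo {
    // Drives one reducer through the stand-in framework and returns what was logged.
    static String trace(BaseReducer r) {
        r.run("k", Arrays.asList("a", "b"));
        return r.log.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace(new OverloadedReducer())); // default:k;
        System.out.println(trace(new OverridingReducer())); // custom:k;
    }
}
```

If that is what is happening, adding @Override to my reduce() should turn
the mismatch into a compile error instead of a silent no-op.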

On 3 May 2011 13:39, David Rosenstrauch <darose@darose.net> wrote:

> On 05/03/2011 01:21 PM, Geoffry Roberts wrote:
>
>> All,
>>
>> I have three questions I would appreciate if anyone could weigh in on.  I
>> apologise in advance if I sound whiny.
>>
>> 1.  The namenode logs, when I view them from a browser, are displayed with
>> the lines wrapped upon each other as if there were no new line characters
>> ('\n') in the output.  I access these files using the dfshealth.jsp thing
>> that comes in the distribution.  Is this intentional, and can it be fixed?
>> If I use a browser to look at any other log4j log file, I don't get this.
>>
>> 2.  In my own MR jobs, I place log statements.  The log level in
>> $HADOOP_HOME/conf/log4j.properties is set to INFO.  My log statements are
>> set to INFO, but I get nothing in the user logs, which are a bugger to read
>> (see question 1).  Am I missing something?
>>
>
> You can find an individual map or reduce task's logs here:
>
> * click on (e.g.) the word "reduce" in the UI, which brings you to the "All
> Tasks" page
> * click on a given task ID (e.g., task_201105030249_0004_r_000000)
> * In the "Task logs" column, click on "All"
>
>
>> 3.  I am attempting to use
>> org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.
>>       multipleOutputsObject.write(new Text("some text"), value, "my_file_name_here");
>> I am not getting any output files with the names I specify.  Instead I get
>> the part* file names we all know and love so well.  I've looked at the
>> source code for MultipleOutputs and found nothing obvious, but since the
>> logging is not working (see question 2), need I go on?  Is anyone else
>> having either trouble or success with multiple outputs using the
>> aforementioned class?
>>
>
> We use MultipleOutputs pretty heavily here, and it works fine.  You need to
> initialize each named output before you use it, by doing this:
>
>
> http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html#addNamedOutput%28org.apache.hadoop.mapred.JobConf,%20java.lang.String,%20java.lang.Class,%20java.lang.Class,%20java.lang.Class%29
>
> HTH,
>
> DR
>



-- 
Geoffry Roberts
