hadoop-user mailing list archives

From Xun TANG <tangxun.al...@gmail.com>
Subject Re: Pipes Property Not Passed In
Date Mon, 29 Apr 2013 00:48:59 GMT
For future searchers, here's how I solved the problem.

File src/mapred/org/apache/hadoop/mapred/pipes/Submitter.java has lines
      if (results.hasOption("inputformat")) {
        setIsJavaRecordReader(job, true);
        job.setInputFormat(getClass(results, "inputformat", job,
                                     InputFormat.class));
      }

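This means that if 'inputformat' is provided (in my case via the
-inputformat option on the command line), 'hadoop.pipes.java.recordreader'
is overwritten to true, regardless of what was set in conf.xml or with -D.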
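
To make the effect concrete, here is a minimal sketch of that branch. It is
not the Hadoop code: a plain Map stands in for the JobConf, and the class name
RecordReaderOverrideSketch and the method applyOptions are made up for
illustration. Only the property name and the "force to true" behaviour come
from the Submitter.java lines above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Submitter branch quoted above.
// A Map plays the role of the JobConf; nothing here calls Hadoop.
public class RecordReaderOverrideSketch {
    static final String KEY = "hadoop.pipes.java.recordreader";

    // Returns a copy of the configuration after option handling.
    // If the inputformat option is present, the property is forced to
    // "true", whatever the user configured beforehand.
    static Map<String, String> applyOptions(Map<String, String> conf,
                                            boolean hasInputFormatOption) {
        Map<String, String> job = new HashMap<>(conf);
        if (hasInputFormatOption) {
            job.put(KEY, "true");
        }
        return job;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(KEY, "false"); // what conf.xml or -D asked for

        // Without the inputformat option the user's value survives.
        System.out.println(applyOptions(conf, false).get(KEY));
        // With the inputformat option the user's "false" is lost.
        System.out.println(applyOptions(conf, true).get(KEY));
    }
}
```

Running the sketch prints "false" and then "true", which matches what I saw:
with -inputformat on the command line, my setting of false never reached the
C++ side.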

I do not believe this is correct, judging by the example at
src/examples/pipes/impl/wordcount-nopipe.cc,
which provides its own record reader while using the WordCountInputFormat
located at src/test/org/apache/hadoop/mapred/pipes/WordCountInputFormat.java

I simply commented out the line 'setIsJavaRecordReader(job, true);' and
recompiled.

Maybe I am wrong, but if 'inputformat' overrides
'hadoop.pipes.java.recordreader', why do we even need the property
'hadoop.pipes.java.recordreader'?
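
One possible alternative to commenting the line out, sketched under the same
assumptions (a plain Map instead of the real JobConf, hypothetical class and
method names), would be to default the property to true only when the user has
not set it. I have not checked whether the real JobConf lets the Submitter
tell an unset property apart from a default, so treat this as an idea rather
than a patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant: the inputformat option only supplies a default
// and never overrides a value the user set explicitly.
public class RespectExplicitSettingSketch {
    static final String KEY = "hadoop.pipes.java.recordreader";

    static Map<String, String> applyOptions(Map<String, String> conf,
                                            boolean hasInputFormatOption) {
        Map<String, String> job = new HashMap<>(conf);
        // Only fill in "true" when the key is absent from the configuration.
        if (hasInputFormatOption && !job.containsKey(KEY)) {
            job.put(KEY, "true");
        }
        return job;
    }

    public static void main(String[] args) {
        Map<String, String> explicit = new HashMap<>();
        explicit.put(KEY, "false");
        // The explicit "false" is kept even though inputformat is given.
        System.out.println(applyOptions(explicit, true).get(KEY));

        Map<String, String> unset = new HashMap<>();
        // Nothing was configured, so the option supplies "true".
        System.out.println(applyOptions(unset, true).get(KEY));
    }
}
```

With this behaviour the property would still have a purpose when 'inputformat'
is given, which is the case wordcount-nopipe.cc seems to rely on.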


Best,
Alice


On Wed, Apr 24, 2013 at 10:18 AM, Xun TANG <tangxun.alice@gmail.com> wrote:

> I think I pinned down the problem, but still can't find a solution.
>
> It seems like 'hadoop.pipes.java.recordreader' is not the correct
> property name to use?
> Because I tried setting other properties (e.g. hadoop.pipes.executable)
> in conf.xml and they were effectively passed in to pipes when the program
> runs.
>
> Where can I find the Pipes source code that reads the conf.xml file? Where
> can I find a list of all Pipes properties and their default values?
> The best I could find is this
> http://hadoop.apache.org/docs/r1.1.2/commands_manual.html
> but it does not list all properties.
>
> Any suggestion?
>
> Thanks,
> Alice
>
>
> On Tue, Apr 23, 2013 at 10:14 PM, Xun TANG <tangxun.alice@gmail.com> wrote:
>
>> I implemented my own InputFormat/RecordReader, and I am trying to run it
>> with Hadoop Pipes. I understand I can pass properties to a Pipes program
>> either with:
>>
>> <property>
>>         <name>hadoop.pipes.java.recordreader</name>
>>         <value>false</value>
>> </property>
>>
>> or alternatively with "-D hadoop.pipes.java.recordreader=false".
>>
>> However, when I ran with the above configuration and my own record
>> reader, I got the error
>> Hadoop Pipes Exception: RecordReader defined when not needed. at
>> impl/HadoopPipes.cc:798 in virtual void
>> HadoopPipes::TaskContextImpl::runMap(std::string, int, bool)
>>
>> Pipes did not seem to pick up my setting of
>> hadoop.pipes.java.recordreader as false.
>>
>> I've tried using conf.xml, using -D, and the combination of both. None
>> worked. I've googled the whole day but could not find the reason. Did I
>> miss something here?
>>
>> I am using hadoop-1.0.4.
>>
>> Here is my conf.xml
>>
>> <?xml version="1.0"?>
>> <configuration>
>>   <property>
>>     <name>hadoop.pipes.executable</name>
>>     <value>bin/cpc</value>
>>   </property>
>>   <property>
>>     <name>hadoop.pipes.java.recordreader</name>
>>     <value>false</value>
>>   </property>
>>   <property>
>>     <name>hadoop.pipes.java.recordwriter</name>
>>     <value>true</value>
>>   </property>
>> </configuration>
>>
>> Here is the command
>>
>> $HADOOP pipes \
>> -conf $CONF \
>> -files 0 \
>> -libjars $HADOOP_HOME/build/hadoop-test-1.0.4-SNAPSHOT.jar \
>> -input $IN \
>> -output $OUT \
>> -program bin/$NAME \
>> -reduces 0 -reduce org.apache.hadoop.mapred.lib.IdentityReducer \
>> -inputformat org.apache.hadoop.mapred.pipes.WordCountInputFormat
>>
>> where $CONF is full path to conf.xml
>>
>> I could provide more info if that helps to determine the reason.
>>
>>
>
