mahout-user mailing list archives

From Jeff Eastman <jeast...@Narus.com>
Subject RE: where i can set -Dmapred.map.tasks=X
Date Tue, 04 Jan 2011 17:43:22 GMT
It's odd though, that kmeans works correctly with multiple -D arguments, even though it uses
the ToolRunner.run(Configuration,Tool,String[]). Are you sure about the semantics difference?
It's not obvious from the javadocs.

-----Original Message-----
From: Jeff Eastman [mailto:jeastman@narus.com] 
Sent: Tuesday, January 04, 2011 9:09 AM
To: user@mahout.apache.org
Subject: RE: where i can set -Dmapred.map.tasks=X

Ok, this seems to be a more widespread problem. Let's identify all the places that need to
be touched and I will commit them all at the same time.

-----Original Message-----
From: Shige Takeda [mailto:stakeda@yahoo-inc.com] 
Sent: Tuesday, January 04, 2011 9:03 AM
To: user@mahout.apache.org
Subject: Re: where i can set -Dmapred.map.tasks=X

Hello,

Coincidentally, I came across the same problem last week and found that 
the cause is that Seq2Sparse's main doesn't use ToolRunner.run(Tool,String[]), 
which automatically feeds -D parameters into a configuration object 
that is accessible via Configurable.getConf().

Also, I see that a lot of driver main functions, especially around 
clustering, don't use ToolRunner.run(Tool,String[]) but 
ToolRunner.run(Configuration,Tool,String[]). A problem with the latter 
one is that it doesn't consider the passed -D parameters.

See the difference in this javadoc.
http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/util/ToolRunner.html
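
To make the -D handling discussed above concrete, here is a minimal plain-Java sketch of the general idea: generic -Dkey=value options are copied into a configuration, and only the remaining arguments are handed to the tool. This is a stand-in for illustration only, not Hadoop's implementation. The class GenericOptionsSketch and the method parseGenericOptions are hypothetical names, and a Map is assumed in place of a Hadoop Configuration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for generic -D option handling; not Hadoop code. */
public class GenericOptionsSketch {

    /**
     * Copies every "-Dkey=value" or "-D key=value" pair into conf and
     * returns the arguments that are left over for the tool itself.
     * A pair without '=' is silently dropped in this sketch.
     */
    public static List<String> parseGenericOptions(String[] args, Map<String, String> conf) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[++i];            // "-D key=value" form
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);     // "-Dkey=value" form
            }
            if (pair == null) {
                remaining.add(arg);          // not a -D option: leave it for the tool
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq > 0) {
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        // The Map plays the role of the configuration a driver would read
        // through getConf(); the property names come from this thread.
        Map<String, String> conf = new HashMap<>();
        List<String> rest = parseGenericOptions(new String[] {
            "-Dmapred.map.tasks=4", "-Dmapred.job.queue.name=something", "-i", "input"
        }, conf);
        System.out.println("conf = " + conf);
        System.out.println("remaining args = " + rest);
    }
}
```

The point of the sketch is that a driver only sees the -D values if it reads them from the configuration that was populated during parsing, rather than from a configuration it creates on its own.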

FYI, the specific problem for me is that -Dmapred.job.queue.name=something is 
required when I run a job on the company's Hadoop cluster.

Btw, any corrections or suggestions on my comment are welcome, as I have 
only been learning the code since last month.

Thanks,
-- Shige Takeda

On 1/3/2011 8:27 PM, Jeff Eastman wrote:
> Seq2Sparse has this problem too? Not good. Users really need -D
> abilities there. How about you JIRA your patch and I will get it in?
>
>
> On 1/3/11 7:43 PM, Dmitriy Lyubimov wrote:
>> Jeff, i also have a similar patch for seq2sparse. Not sure if it makes a lot
>> of sense there since it is a composite job and i am not sure if
>> configuration is propagated to those. But i got it too if need be.
>>
>> On Mon, Jan 3, 2011 at 5:36 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>
>>> Resolved in mahout-574.
>>>
>>>
>>> On Mon, Jan 3, 2011 at 3:49 PM, Jeff Eastman <jdog@windwardsolutions.com> wrote:
>>>
>>>> Yes, it could indeed. See my previous email which shows the problem unique
>>>> to this class.
>>>>
>>>>
>>>> On 1/3/11 3:30 PM, Dmitriy Lyubimov wrote:
>>>>
>>>>> Could it be because SequenceFileFromDirectory is not an AbstractJob?
>>>>>
>>>>>


