hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Streaming jar creates only 1 reducer
Date Sat, 22 Oct 2011 06:42:45 GMT
Mapred,

Glad to know you are able to control your # of reducers now.

I believe you answered yourself there :)

Cross-posting (http://en.wikipedia.org/wiki/Crossposting) often leads to noise,
confusion and duplicated response efforts on the lists.

On Sat, Oct 22, 2011 at 10:52 AM, Mapred Learn <mapred.learn@gmail.com> wrote:
> Thanks Harsh !
> This is exactly what I thought !
>
> And I don't know what you mean by cross-post ? I just posted to the mapred and HDFS mailing lists. What's your point about cross-posting ??
>
> Sent from my iPhone
>
> On Oct 21, 2011, at 8:57 PM, Harsh J <harsh@cloudera.com> wrote:
>
>> Mapred,
>>
>> You need to pass -Dmapred.reduce.tasks=N along. The number of reducers is configured per job, unlike the number of mappers, which is determined from the inputs.
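
To illustrate the advice above, here is a minimal sketch of assembling a streaming command line with an explicit reducer count. The helper name `streaming_command`, the jar path, the HDFS paths and `mapper.py` are hypothetical placeholders, not taken from this thread; only `mapred.reduce.tasks` and `reducer.py` come from the messages. It assumes the usual streaming convention that generic options such as `-D` are placed before the streaming-specific options.

```python
import shlex


def streaming_command(streaming_jar, num_reducers, input_path, output_path,
                      mapper, reducer):
    """Build the argument list for a Hadoop Streaming job.

    All argument values are supplied by the caller; nothing here is
    read from the cluster configuration.
    """
    if num_reducers < 0:
        raise ValueError("num_reducers must be zero or a positive integer")
    return [
        "hadoop", "jar", streaming_jar,
        # Generic option: set the per-job reducer count explicitly.
        # It is placed before the streaming options below.
        "-D", "mapred.reduce.tasks=%d" % num_reducers,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
        # Ship the scripts with the job so every task can run them.
        "-file", mapper,
        "-file", reducer,
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    argv = streaming_command(
        "/path/to/hadoop-streaming.jar", 10,
        "/user/jj/input", "/user/jj/output",
        "mapper.py", "reducer.py",
    )
    # Print a shell-safe version of the command instead of running it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The sketch only prints the command; passing the list to `subprocess` would be one way to launch it on a node where the `hadoop` client is installed.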
>>
>> P.S. Please do not cross-post questions to multiple lists.
>>
>> On 22-Oct-2011, at 4:05 AM, Mapred Learn wrote:
>>
>>> Do you know which parameters from the conf files ?
>>>
>>> Thanks,
>>>
>>> Sent from my iPhone
>>>
>>> On Oct 21, 2011, at 3:32 PM, Nick Jones <darellik@gmail.com> wrote:
>>>
>>>> FWIW, I usually specify the number of reducers in both streaming and
>>>> against the Java API. The "default" is what's read from your config
>>>> files on the submitting node.
>>>>
>>>> Nick Jones
>>>>
>>>> On Oct 21, 2011, at 5:00 PM, Mapred Learn <mapred.learn@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> Does the streaming jar create 1 reducer by default ? We have reduce tasks per task tracker configured to be more than 1, but my job has about 150 mappers and only 1 reducer:
>>>>>
>>>>> reducer.py basically just reads the line and prints it.
>>>>>
>>>>> Why doesn't streaming.jar invoke multiple reducers for this case ?
>>>>>
>>>>> Thanks,
>>>>> -JJ
>>>>>
>>>>>
>>
>



-- 
Harsh J
