hadoop-mapreduce-user mailing list archives

From Joan <joan.monp...@gmail.com>
Subject Re: How to reduce number of splits in DataDrivenDBInputFormat?
Date Thu, 20 Jan 2011 07:33:49 GMT
Hi Sonal,

I set both properties:

        job.getConfiguration().set("mapreduce.job.maps","4");
        job.getConfiguration().set("mapreduce.map.tasks","4");

But neither setting has any effect. I also tried setting "mapred.map.task", but
that does not work either.
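
For reference, the splitting idea can be sketched as follows. This is a
simplified illustration in plain Java, assuming the input format takes the
minimum and maximum of the splitBy column and cuts that range into about as
many pieces as the requested number of map tasks. It is not the Hadoop code,
and the names SplitSketch and splitRange are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {

    /**
     * Cuts the range [min, max] into at most numSplits contiguous sub-ranges.
     * Each element is {low, high}; consecutive sub-ranges share a boundary.
     * Assumption: values are small enough that low + step does not overflow.
     */
    static List<long[]> splitRange(long min, long max, int numSplits) {
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be >= 1");
        }
        if (max < min) {
            throw new IllegalArgumentException("max must be >= min");
        }
        List<long[]> splits = new ArrayList<>();
        long span = max - min;
        // Step size rounded up, and never smaller than 1.
        long step = Math.max(1, (span + numSplits - 1) / numSplits);
        long low = min;
        while (low < max) {
            long high = Math.min(low + step, max);
            splits.add(new long[] {low, high});
            low = high;
        }
        // A single-value range still needs one split.
        if (splits.isEmpty()) {
            splits.add(new long[] {min, max});
        }
        return splits;
    }

    public static void main(String[] args) {
        // A date column could be represented as epoch milliseconds;
        // small numbers are used here to keep the output readable.
        for (long[] s : splitRange(0, 1000, 4)) {
            System.out.println(s[0] + " .. " + s[1]);
        }
    }
}
```

Under that assumption, asking for 4 splits over the range 0 to 1000 gives four
sub-ranges of 250 each, so the number of splits would follow the requested
number of map tasks and not the number of rows.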

Joan

2011/1/20 Sonal Goyal <sonalgoyal4@gmail.com>

> Joan,
>
> You should be able to set the mapred.map.tasks property to the maximum
> number of mappers you want. This can control parallelism.
>
> Thanks and Regards,
> Sonal
> Connect Hadoop with databases, Salesforce, FTP servers and others
> <https://github.com/sonalgoyal/hiho>
> Nube Technologies <http://www.nubetech.co>
> <http://in.linkedin.com/in/sonalgoyal>
>
> On Wed, Jan 19, 2011 at 9:32 PM, Joan <joan.monplet@gmail.com> wrote:
>
>> Hi,
>>
>> I want to reduce the number of splits, because I think my job generates
>> too many of them.
>> While my job is running I can see:
>>
>> *INFO mapreduce.Job:  map ∞% reduce 0%*
>>
>> I'm using DataDrivenDBInputFormat:
>>
>> setInput
>>
>> public static void setInput(Job job,
>>                             Class<? extends DBWritable> inputClass,
>>                             String tableName,
>>                             String conditions,
>>                             String splitBy,
>>                             String... fieldNames)
>>
>> Note that the "orderBy" column is called the "splitBy" in this version.
>> We reuse the same field, but it's not strictly ordering it -- just
>> partitioning the results.
>>
>> So I read all the data from myTable and try to split it by a date column.
>> I get millions of rows, and I suppose that DataDrivenDBInputFormat
>> generates many splits. I don't know how to reduce the number of splits, or
>> how to tell DataDrivenDBInputFormat to split on my date column (the
>> splitBy argument).
>>
>> The main goal is to improve performance, so I want my maps to run faster.
>>
>>
>> Can someone help me?
>>
>> Thanks
>>
>> Joan
>