hadoop-common-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Num map task?
Date Thu, 23 Apr 2009 06:46:46 GMT
How do you add the input paths?
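
To illustrate the point quoted below (a split never spans more than one file), here is a rough sketch in plain Java. It is a simplified model and not Hadoop's actual split code: the class SplitEstimate, its method names, the 64 MB split size, and the file sizes are all made up for illustration, and empty files are assumed to produce no split.

```java
/**
 * Simplified model of file-based input splitting (NOT Hadoop's real code).
 * Assumption: each non-empty file is cut into chunks of at most splitBytes,
 * and a split never covers more than one file.
 */
final class SplitEstimate {

    /** Number of splits for one file: ceil(fileBytes / splitBytes). */
    static long splitsForFile(long fileBytes, long splitBytes) {
        if (splitBytes <= 0) {
            throw new IllegalArgumentException("splitBytes must be positive");
        }
        if (fileBytes <= 0) {
            return 0; // assumption: empty files contribute no split
        }
        return (fileBytes + splitBytes - 1) / splitBytes;
    }

    /** Total splits (and so map tasks, in this model) over all input files. */
    static long totalSplits(long[] fileSizes, long splitBytes) {
        long total = 0;
        for (long size : fileSizes) {
            total += splitsForFile(size, splitBytes);
        }
        return total;
    }

    public static void main(String[] args) {
        long splitBytes = 64L * 1024 * 1024; // hypothetical 64 MB split size

        // Five one-line "map" files, a few hundred bytes each (made-up sizes).
        long[] fiveTinyFiles = {120, 95, 130, 110, 101};
        System.out.println(totalSplits(fiveTinyFiles, splitBytes)); // prints 5

        // One 200 MB file is cut into several splits instead.
        long[] oneLargeFile = {200L * 1024 * 1024};
        System.out.println(totalSplits(oneLargeFile, splitBytes)); // prints 4
    }
}
```

In this model, five tiny files always give five map tasks regardless of the split size, which matches what you are seeing: each task has almost no data to process, so the per-task startup cost dominates.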

On Wed, Apr 22, 2009 at 5:09 PM, nguyenhuynh.mr
<nguyenhuynh.mr@gmail.com> wrote:
> Edward J. Yoon wrote:
>
>> Hi,
>>
>> In that case, the atomic unit of a split is a file, so you need to
>> increase the number of files, or use TextInputFormat as below.
>>
>> jobConf.setInputFormat(TextInputFormat.class);
>>
>> On Wed, Apr 22, 2009 at 4:35 PM, nguyenhuynh.mr
>> <nguyenhuynh.mr@gmail.com> wrote:
>>
>>> Hi all!
>>>
>>>
>>> I have an MR job that imports content into HBase.
>>>
>>> Each content item is a text file in HDFS. I use "map" files to store the
>>> local paths of the content.
>>>
>>> Each content item has its own map file (a map file is a text file in HDFS
>>> that contains one line of info).
>>>
>>>
>>> I created a maps directory to hold the map files, and this maps
>>> directory is used as the input path for the job.
>>>
>>> When I run the job, the number of map tasks is the same as the number of
>>> map files. Ex: I have 5 map files -> 5 map tasks.
>>>
>>> Therefore, the map phase is slow :(
>>>
>>> Why is the map phase slow when the number of map tasks is large and
>>> equal to the number of files?
>>>
>>> * P.S.: The jobs run on 3 nodes: 1 server and 2 slaves
>>>
>>> Please help me!
>>> Thanks.
>>>
>>> Best,
>>> Nguyen.
>>>
>>>
>>>
>>>
>>
>>
>>
>>
> Currently, I use TextInputFormat as the InputFormat for the map phase.
>



-- 
Best Regards, Edward J. Yoon
edwardyoon@apache.org
http://blog.udanax.org
