hbase-user mailing list archives

From Lars George <lars.geo...@gmail.com>
Subject Re: mapper not running on the same region server
Date Fri, 09 Dec 2011 11:00:38 GMT
I do not have the exact code in my head, but I would assume that when you kill the job and it
gets rescheduled, the correct node happens to be up for grabs. I was referring to the names of
the nodes as the JobTracker sees them versus what HBase reports when the getSplits() of the
input format is called. It may be that these differ, and the framework therefore does not take
the locality hint into consideration.
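
Off the top of my head, something like this (untested, with a placeholder table name) would
dump the locality hints that TableInputFormat attaches to its splits, so you can compare them
against the node names on the JobTracker page:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;

public class SplitLocations {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    conf.set(TableInputFormat.INPUT_TABLE, "mytable"); // placeholder table name
    Job job = new Job(conf);
    TableInputFormat inputFormat = new TableInputFormat();
    inputFormat.setConf(job.getConfiguration());
    // Each split carries the region server hostname that HBase reports;
    // the MR framework matches these against its own node names.
    for (InputSplit split : inputFormat.getSplits(job)) {
      for (String location : split.getLocations()) {
        System.out.println(split + " -> " + location);
      }
    }
  }
}

If the names printed here do not match the TaskTracker names exactly, the scheduler falls back
to placing the task on whatever node has a free slot.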

Lars

On Dec 9, 2011, at 10:29 AM, Rohit Kelkar wrote:

> Hi Lars, by naming issue do you mean whether the ZooKeeper nodes and HBase
> nodes are correctly configured?
> I observed that this issue occurs intermittently. Sometimes the mapper
> gets scheduled on the correct node. Could that be because I am killing
> the job frequently and Hadoop is prioritizing the nodes based on how
> often (or how rarely) the scheduled job completes successfully?
> 
> - Rohit Kelkar
> 
> On Fri, Dec 9, 2011 at 2:31 PM, Lars George <lars.george@gmail.com> wrote:
>> Hi,
>> 
>> Do you maybe have an issue with naming? HBase takes the hostname (as shown in the
>> UI and the ZK dump there) and hints that to the MR framework. But if that resolves to
>> different names, then no match can be made and the node to run the task on is chosen
>> at random. Could you verify?
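>> 
>> Something as simple as this (an untested sketch) run on each node shows what Java
>> resolves the local hostname to, so you can compare it with what the UI shows:
>> 
>> import java.net.InetAddress;
>> 
>> public class NameCheck {
>>   public static void main(String[] args) throws Exception {
>>     // Print the locally resolved names; these should match the HBase UI byte for byte.
>>     InetAddress addr = InetAddress.getLocalHost();
>>     System.out.println("hostname:  " + addr.getHostName());
>>     System.out.println("canonical: " + addr.getCanonicalHostName());
>>     System.out.println("address:   " + addr.getHostAddress());
>>   }
>> }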
>> 
>> Lars
>> 
>> On Dec 7, 2011, at 6:21 AM, Rohit Kelkar wrote:
>> 
>>> My hadoop cluster has 3 nodes in it and hbase too runs on the same 3
>>> nodes. But the table that I am speaking of has only one region and
>>> http://master:50030/jobtracker.jsp shows only one mapper running.
>>> - Rohit Kelkar
>>> 
>>> On Tue, Dec 6, 2011 at 8:38 PM, Stack <stack@duboce.net> wrote:
>>>> On Tue, Dec 6, 2011 at 12:50 AM, Rohit Kelkar <rohitkelkar@gmail.com> wrote:
>>>>> I am running a mapreduce job on an HBase table. I have a 3-node
>>>>> cluster. Currently the table has only a few rows. When I visit the
>>>>> http://master:60010/master.jsp I can see that the table resides on
>>>>> only one region server. When I run my mapreduce job on this table I
>>>>> see the mapper running on a different node of my cluster. Shouldn't
>>>>> the mapper be running on the same node that hosts the table?
>>>>> I am using the TableMapReduceUtil.initTableMapperJob method to
>>>>> initialize the mapreduce job.
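>>>>> The setup is roughly this (simplified; the table and class names are placeholders):
>>>>> 
>>>>> Scan scan = new Scan();
>>>>> Job job = new Job(conf, "scan-mytable");
>>>>> TableMapReduceUtil.initTableMapperJob(
>>>>>     "mytable",                     // input table (placeholder)
>>>>>     scan,                          // Scan instance
>>>>>     MyMapper.class,                // my TableMapper subclass
>>>>>     ImmutableBytesWritable.class,  // mapper output key class
>>>>>     Result.class,                  // mapper output value class
>>>>>     job);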
>>>>> 
>>>> 
>>>> Yes.  Mappers should be running next to the data.
>>>> 
>>>> Do you have only one region in your table, or more than one region
>>>> with more than one mapper running?
>>>> 
>>>> St.Ack
>> 

