hbase-user mailing list archives

From Dru Jensen <drujen...@gmail.com>
Subject Re: Duplicate rows being processed when one MR task completes. Endless Loop in MR task.
Date Tue, 23 Sep 2008 16:29:52 GMT
St.Ack,

Yes, that is correct. The speculative task also got stuck in the
endless loop. I removed the filter and have not encountered the
endless loop since.

I am in the process of upgrading to Hadoop 0.18.1 and HBase 0.18.0.
I will let you know if I can reproduce this and capture debug info.

thanks,
Dru
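
The filter wiring discussed in the thread below amounts to roughly the
following sketch, written against the HBase 0.2.x API. "column_name" and
"column_value" are placeholders carried over from the original post, not
real schema names, and the usual table/column setup in configure() is
omitted for brevity:

```java
import org.apache.hadoop.hbase.filter.ColumnValueFilter;
import org.apache.hadoop.hbase.mapred.TableInputFormatBase;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobConfigurable;

// Custom input format that filters scanned rows by a column value,
// as described in the quoted messages below.
public class TableInputFormatColumnFilter extends TableInputFormatBase
    implements JobConfigurable {

  public void configure(JobConf job) {
    // (Table and input-column setup omitted here.)
    // Keep only rows whose "column_name" value is <= "column_value".
    ColumnValueFilter rowFilter = new ColumnValueFilter(
        Bytes.toBytes("column_name"),
        ColumnValueFilter.CompareOp.LESS_OR_EQUAL,
        Bytes.toBytes("column_value"));
    setRowFilter(rowFilter);
  }
}
```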

On Sep 22, 2008, at 9:24 PM, stack wrote:

> 6 tasks so you have 6 regions in your table Dru?
>
> You might enable DEBUG in the map to check whether the map or the
> filtering is stuck processing the same key over and over (then a
> speculative task starts up because the stuck task is taking too long
> to finish, and so on...)
>
> St.Ack
>
>
> Dru Jensen wrote:
>> More information:
>>
>> When I first launch the Job, 6 MR "tasks" are created on 3  
>> different servers in the cluster. Each "task" has 1 "task attempt"  
>> started.
>> Hadoop map task list for job_200809191015_0010 on machine1
>>
>> All Tasks
>>
>> Task                              Complete  Start Time            Finish Time                           Counters
>> tip_200809191015_0010_m_000001    100.00%   19-Sep-2008 10:46:51  19-Sep-2008 11:08:56 (22mins, 4sec)   8
>> tip_200809191015_0010_m_000002    100.00%   19-Sep-2008 10:46:52  19-Sep-2008 11:03:41 (16mins, 48sec)  8
>> tip_200809191015_0010_m_000003    100.00%   19-Sep-2008 10:46:53  19-Sep-2008 11:02:15 (15mins, 22sec)  8
>> tip_200809191015_0010_m_000004    100.00%   19-Sep-2008 10:46:53  19-Sep-2008 10:56:14 (9mins, 20sec)   8
>> tip_200809191015_0010_m_000000    0.00%     19-Sep-2008 10:46:51  -                                     0
>> tip_200809191015_0010_m_000005    0.00%     19-Sep-2008 10:46:55  -                                     0
>>
>>
>>
>> When 2 of the "tasks" complete, it looks like one of the still
>> running "tasks" gets a new "task attempt" started.
>> Unfortunately, the new "task attempt" is handed the same keys as the
>> first "task attempt", so they process the same keys twice.
>>
>> Job job_200809191015_0010
>>
>> All Task Attempts
>>
>> Task Attempt                         Machine   Status   Progress  Start Time            Counters
>> task_200809191015_0010_m_000000_0    machine2  RUNNING  0.00%     19-Sep-2008 10:46:51  0
>> task_200809191015_0010_m_000000_1    machine1  RUNNING  0.00%     19-Sep-2008 11:02:15  0
>>
>>
>> This is also the scenario that is causing the endless loop.  Both  
>> "task attempts" not only process the same keys, they start  
>> processing the same keys over and over in an endless loop.
>>
>>
>> On Sep 19, 2008, at 10:52 AM, Dru Jensen wrote:
>>
>>> Sorry.  Hadoop 0.17.2.1 - Hbase 0.2.1
>>>
>>> On Sep 19, 2008, at 10:40 AM, Jean-Daniel Cryans wrote:
>>>
>>>> Dru,
>>>>
>>>> Which versions?
>>>>
>>>> Thx
>>>>
>>>> J-D
>>>>
>>>> On Fri, Sep 19, 2008 at 1:38 PM, Dru Jensen <drujensen@gmail.com> 

>>>> wrote:
>>>>
>>>>> I have a MR process that gets stuck in an endless loop. It looks
>>>>> like the same set of keys is being sent to one of the tasks over
>>>>> and over. Unfortunately, it's not consistent; sometimes it works
>>>>> fine. Only 1 of the 6 MR processes gets into this state and never
>>>>> completes. After the disk space on HDFS is used up, the tables
>>>>> become corrupt and I can no longer recover them.
>>>>>
>>>>> The main difference from other MR processes I have is that I
>>>>> added a filter to the MR process table scanner by extending
>>>>> TableInputFormatBase:
>>>>>
>>>>> public class TableInputFormatColumnFilter extends TableInputFormatBase
>>>>>     implements JobConfigurable {...}
>>>>>
>>>>> and then adding a ColumnValueFilter in configure() as follows:
>>>>>
>>>>> ColumnValueFilter rowFilter = new ColumnValueFilter(
>>>>>     Bytes.toBytes("column_name"),
>>>>>     ColumnValueFilter.CompareOp.LESS_OR_EQUAL,
>>>>>     Bytes.toBytes("column_value"));
>>>>> setRowFilter(rowFilter);
>>>>>
>>>>> Any ideas what may be causing this?
>>>
>>
>>
>

