hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4149) JobQueueJobInProgressListener.jobUpdated() might not work as expected
Date Tue, 07 Oct 2008 10:29:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637439#action_12637439
] 

Hemanth Yamijala commented on HADOOP-4149:
------------------------------------------

I think the list of objects maintained by the schedulers needs to be kept in sorted order, where
the sorting is based on parameters like job priority or start time. This is true irrespective
of whether we use {{JobInProgress}} or a new object like the proposed {{JobSchedulingInfo}},
so a {{SortedSet}} such as a {{TreeSet}} seems to be required here. However, this also means
that when we want to delete (or update) an object, we would need to iterate over all the objects
in the structure to find the one we want to remove. This is because our sort ordering is based
on different parameters from those that decide whether two objects are equal (which, for all
practical purposes, should be the job id).

I also think updates are rare operations compared to accessing the list in sorted order,
which happens almost every time the scheduler needs to look at a new job. Hence, I think it
makes sense to go with an implementation that has an O(n) scan for removals.
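
A minimal sketch of such an O(n) removal, assuming a hypothetical {{Job}} class standing in for {{JobInProgress}} or {{JobSchedulingInfo}}. The names {{ScanRemovalSketch}}, {{removeById}} and {{updatePriority}} are illustrative only and are not Hadoop APIs, and the integer priority (larger means more urgent) is an assumption made for brevity.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

// Hypothetical sketch, not Hadoop code.
class ScanRemovalSketch {
    static final class Job {
        final String id;
        int priority;          // assumption: larger value = higher priority
        final long startTime;

        Job(String id, int priority, long startTime) {
            this.id = id;
            this.priority = priority;
            this.startTime = startTime;
        }
    }

    // Sort order: higher priority first, then earlier start time, then job id.
    static final Comparator<Job> ORDER = (a, b) -> {
        int c = Integer.compare(b.priority, a.priority);
        if (c != 0) return c;
        c = Long.compare(a.startTime, b.startTime);
        if (c != 0) return c;
        return a.id.compareTo(b.id);
    };

    // O(n) scan: identity is the job id, independent of the sort keys.
    static boolean removeById(TreeSet<Job> queue, String jobId) {
        for (Iterator<Job> it = queue.iterator(); it.hasNext(); ) {
            if (it.next().id.equals(jobId)) {
                it.remove();
                return true;
            }
        }
        return false;
    }

    // Update = scan and remove, change the sort key, then re-insert.
    static boolean updatePriority(TreeSet<Job> queue, String jobId, int newPriority) {
        for (Iterator<Job> it = queue.iterator(); it.hasNext(); ) {
            Job job = it.next();
            if (job.id.equals(jobId)) {
                it.remove();
                job.priority = newPriority;
                queue.add(job);
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        TreeSet<Job> queue = new TreeSet<>(ORDER);
        queue.add(new Job("j1", 1, 1L));
        queue.add(new Job("j2", 0, 2L));
        queue.add(new Job("j3", 1, 3L));
        updatePriority(queue, "j3", 0);
        for (Job job : queue) {
            System.out.println(job.id + " priority=" + job.priority);
        }
    }
}
```

The scan matches on the job id only, so it finds the entry even if its priority or start time has already been changed. Iteration stays cheap for the common case (reading jobs in sorted order), and the linear cost is paid only on the rarer update or delete.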

> JobQueueJobInProgressListener.jobUpdated() might not work as expected
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-4149
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4149
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>
> {{JobQueueJobInProgressListener}} uses a {{TreeSet}} to store the sorted collection of
> {{JobInProgress}} objects. The comparator used to sort the JIPs applies the following order:
> - priority (>=)
> - start time (<=)
> - job id [jt-identifier, job-index] (<=)
> If any JIP object is changed w.r.t. priority or start time, then the TreeSet will be inconsistent.
> Hence a delete might not work. Consider the following:
> 1) jobs are submitted in the following order 
> ||number||jobid||priority||
> |1|j1|NORMAL|
> |2|j2|LOW|
> |3|j3|NORMAL|
> 2) The sorted collection will be in the order : {{j1,j3,j2}}
> 3) If j3's priority is changed to LOW, the collection won't change, but delete will
> bail out on j1 itself, as the comparator will return a -ve number. TreeSet uses the comparator
> both for sorting and deleting. If i indicates the index in the collection and obj represents
> the object under consideration, then it looks like TreeSet.remove(obj) follows something like:
> - continue to search if the compare(i, obj) is -ve
> - bail out if the compare(i, obj) is +ve
> - delete the obj if compare(i, obj) == 0
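
The stale-key problem described in the quoted issue can be reproduced with a small sketch. It uses a hypothetical {{Job}} class and an integer priority (larger means more urgent), both assumptions made for illustration, and a two-job variant rather than the exact three-job table above, because whether a particular {{remove}} misses depends on where the changed element sits inside the tree.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical sketch, not Hadoop code.
class StaleKeySketch {
    static final class Job {
        final String id;
        int priority;          // assumption: larger value = higher priority
        final long startTime;

        Job(String id, int priority, long startTime) {
            this.id = id;
            this.priority = priority;
            this.startTime = startTime;
        }
    }

    // Same ordering as in the issue: priority, then start time, then job id.
    static final Comparator<Job> ORDER = (a, b) -> {
        int c = Integer.compare(b.priority, a.priority);
        if (c != 0) return c;
        c = Long.compare(a.startTime, b.startTime);
        if (c != 0) return c;
        return a.id.compareTo(b.id);
    };

    // Returns what TreeSet.remove reports after a sort key was changed in place.
    static boolean removeAfterInPlaceChange() {
        TreeSet<Job> queue = new TreeSet<>(ORDER);
        Job j1 = new Job("j1", 1, 1L);
        Job j2 = new Job("j2", 1, 2L);
        queue.add(j1);         // j1 becomes the root
        queue.add(j2);         // j2 sorts after j1, so it is stored to the right

        j2.priority = 2;       // changed while still in the set; tree is not re-ordered

        // The lookup now compares j2 as "before j1", searches the left side,
        // finds nothing, and gives up even though j2 is still in the set.
        return queue.remove(j2);
    }

    public static void main(String[] args) {
        System.out.println("removed: " + removeAfterInPlaceChange()); // prints "removed: false"
    }
}
```

The element is still present when iterating, but the comparator-guided lookup no longer reaches it, which is why the sort keys must not change while the object is stored in the set.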

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

