hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3524) JobTracker's processHeartbeat() should not call System.currentTimeMillis() everytime
Date Tue, 10 Jun 2008 12:20:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603845#action_12603845 ]

Amar Kamat commented on HADOOP-3524:
------------------------------------

I ran Brice's test; the results are as follows:
||num-iters||stale-time (ms)||accurate-time (ms)||
|1000|0|2|
|10000|1|14|
|100000|3|126|
|1000000|5|1183|
|10000000|30|11675|
|100000000|284|117503|
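
For readers who want to reproduce a comparison of this kind, the following is a minimal sketch only. It is not Brice's test or the attached CurrentTimeCost.java (neither is shown in this message); the class {{TimeReadCost}} and its method names are hypothetical. It assumes "stale time" means reading a previously cached timestamp and "accurate time" means calling {{System.currentTimeMillis()}} on every iteration. Hand-rolled loops like this are sensitive to JIT and OS effects, so the numbers should be treated as rough.

```java
/**
 * Hypothetical micro-benchmark sketch. This is NOT the attached
 * CurrentTimeCost.java; it only illustrates one way to compare the two reads.
 */
final class TimeReadCost {
    // Stand-in for a timestamp that something else (e.g. a timer thread) refreshes.
    private static volatile long cachedTime = System.currentTimeMillis();

    // Written once per run so the JIT cannot discard the loops as dead code.
    private static volatile long blackhole;

    /** Milliseconds spent reading the cached ("stale") timestamp numIters times. */
    static long staleReadMillis(long numIters) {
        long sink = 0;
        long start = System.nanoTime();
        for (long i = 0; i < numIters; i++) {
            sink += cachedTime;
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        blackhole = sink;
        return elapsed;
    }

    /** Milliseconds spent calling System.currentTimeMillis() numIters times. */
    static long accurateReadMillis(long numIters) {
        long sink = 0;
        long start = System.nanoTime();
        for (long i = 0; i < numIters; i++) {
            sink += System.currentTimeMillis();
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        blackhole = sink;
        return elapsed;
    }

    public static void main(String[] args) {
        // Prints one row per iteration count in the same table layout as above.
        for (long n = 1_000; n <= 1_000_000; n *= 10) {
            System.out.printf("|%d|%d|%d|%n", n, staleReadMillis(n), accurateReadMillis(n));
        }
    }
}
```
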
This timer thread should be a simple, _mostly-waiting_ thread. MR/JT scalability is partly
governed by the heartbeat processing rate of the JobTracker, so it would be interesting to know
what fraction of heartbeat processing time is taken up by the call to {{System.currentTimeMillis()}}.
BTW, the timer solution was just a starting point. I believe we can (if required) come up with
a simpler and more elegant solution.
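
To make the timer idea concrete, here is a minimal sketch of a cached clock refreshed by a mostly-waiting daemon thread. It is an illustration under stated assumptions, not the patch for this issue: the class {{CoarseClock}}, its methods, and the resolution parameter are all hypothetical names introduced here.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch (not the HADOOP-3524 patch): hot paths read a cached
 * timestamp instead of calling System.currentTimeMillis() on every request.
 */
final class CoarseClock {
    // Last timestamp published by the updater thread.
    private final AtomicLong now = new AtomicLong(System.currentTimeMillis());
    private final Thread updater;

    /** Starts a daemon thread that refreshes the timestamp every resolutionMillis. */
    CoarseClock(final long resolutionMillis) {
        updater = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    // The thread spends almost all of its time waiting here.
                    Thread.sleep(resolutionMillis);
                } catch (InterruptedException e) {
                    return; // stop() was called
                }
                now.set(System.currentTimeMillis());
            }
        }, "coarse-clock-updater");
        updater.setDaemon(true);
        updater.start();
    }

    /** Returns a timestamp that may lag real time by about one resolution interval. */
    long currentTimeMillis() {
        return now.get();
    }

    /** Stops the updater thread; the last cached value stays readable. */
    void stop() {
        updater.interrupt();
    }
}
```

With something like this in place, the heartbeat path could call {{clock.currentTimeMillis()}} where it currently calls {{System.currentTimeMillis()}}, trading timestamp accuracy (bounded by the chosen resolution) for a cheaper read. Whether that trade is worthwhile depends on the measurement asked for above.
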


> JobTracker's processHeartbeat() should not call System.currentTimeMillis() everytime
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3524
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3524
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Amar Kamat
>         Attachments: CurrentTimeCost.java
>
>
> Consider the following
> {code:title=JobTracker.java|borderStyle=solid}
> private synchronized boolean processHeartbeat(
>                                                 TaskTrackerStatus trackerStatus, boolean initialContact) {
>     String trackerName = trackerStatus.getTrackerName();
>     trackerStatus.setLastSeen(System.currentTimeMillis());
> {code}
> Here, the call to {{System.currentTimeMillis()}} on every call to {{JobTracker.processHeartbeat()}}
> might prove costly. While testing/benchmarking HADOOP-2119, we recorded that the JobTracker
> was able to serve ~130 tasks/sec, which means we might make ~130 calls to {{System.currentTimeMillis()}}
> per second. I think that in these cases (_last-seen-status_ etc.) such a high level of timestamp
> accuracy is unnecessary, so the call can be avoided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

