hadoop-mapreduce-issues mailing list archives

From "Bhallamudi Venkata Siva Kamesh (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4059) The history server should have a separate pluggable storage/query interface
Date Fri, 30 Mar 2012 17:54:29 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242600#comment-13242600 ]

Bhallamudi Venkata Siva Kamesh commented on MAPREDUCE-4059:
-----------------------------------------------------------

Hi Robert,
I have just gone through a portion of the patch and have a few comments/questions on it.


1. {code:title=CachedHistoryStorage.java|borderStyle=solid}
    if(offset == null || offset < 0) offset = 0l;
    if(count == null) count = Long.MAX_VALUE;
    
    long at = 0;
    long end = offset + count - 1;
    LOG.error("Looking for entries starting at "+offset+" with a length of "+count+" so ending at "+end);
    for (Job job : jobs) {
      LOG.error("Looking at job END: "+at+" <= "+end);
      if(at > end) {
        break;
      }
{code}

Suppose that, in the above code, offset is 100 and count is null, so count is set to
Long.MAX_VALUE.
Then offset + count - 1 overflows the range of long and *end* becomes negative, so the
check at > end is already true on the first iteration and the for loop exits immediately.
I think we should instead return the entries starting from the 100th entry.

Am I missing anything here?
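
To make the concern in point 1 concrete, here is a minimal standalone sketch. PageBounds, naiveEnd and saturatingEnd are hypothetical names used only for illustration and are not part of the patch. naiveEnd repeats the arithmetic from the snippet above; saturatingEnd is one possible way to avoid the wraparound, assuming that clamping *end* to Long.MAX_VALUE is acceptable.

```java
// Hypothetical helper, not part of the patch: shows the overflow and one possible guard.
final class PageBounds {
    private PageBounds() {}

    // Same arithmetic as the snippet above: offset + count - 1 in plain long math.
    static long naiveEnd(Long offset, Long count) {
        long off = (offset == null || offset < 0) ? 0L : offset;
        long cnt = (count == null) ? Long.MAX_VALUE : count;
        return off + cnt - 1; // wraps around silently when the sum exceeds Long.MAX_VALUE
    }

    // Saturating variant: clamp to Long.MAX_VALUE instead of wrapping.
    static long saturatingEnd(Long offset, Long count) {
        long off = (offset == null || offset < 0) ? 0L : offset;
        long cnt = (count == null) ? Long.MAX_VALUE : count;
        if (cnt <= 0) {
            return off - 1; // empty range: end lies just before the start
        }
        // off + (cnt - 1) overflows exactly when cnt - 1 is larger than the room left.
        long room = Long.MAX_VALUE - off;
        return (cnt - 1 > room) ? Long.MAX_VALUE : off + (cnt - 1);
    }
}
```

With offset = 100 and count = null, naiveEnd returns a negative value, while saturatingEnd returns Long.MAX_VALUE, so a check such as at > end would not fire on the first iteration.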

2. I think it would be better to add a sanity check like this in CachedHistoryStorage#getPartialJobs():
{noformat}
if (offset > jobs.size()) {
    return allJobs;
}
{noformat}
                
> The history server should have a separate pluggable storage/query interface
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4059
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4059
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 0.24.0, 0.23.3
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>         Attachments: MR-4059.txt, MR-4059.txt
>
>
> The history server currently caches all parsed jobs in RAM.  These jobs can be very large
> because of counters.  It would be nice to have a pluggable interface for the caching and
> querying of the cached data so that we can play around with different implementations.  Also
> just for cleanness of the code it would be nice to split the very large JobHistoryServer.java
> into a few smaller ones that are more understandable and readable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
