hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4059) The history server should have a separate pluggable storage/query interface
Date Wed, 04 Apr 2012 14:01:22 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246263#comment-13246263 ]

Robert Joseph Evans commented on MAPREDUCE-4059:

Yes, it is better to use == instead of .equals when comparing enums.  I was copying the behavior
of the previous code that was part of the web service.
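
To illustrate the == versus .equals point, here is a minimal sketch.  The JobState enum and
the method names are hypothetical and are not taken from the patch; the only claims are
standard Java behavior: == on enums is null-safe, while calling .equals on a null reference
throws a NullPointerException.

```java
public class EnumCompareSketch {
    // Hypothetical enum for illustration only; not from the Hadoop source.
    public enum JobState { RUNNING, SUCCEEDED, FAILED }

    // == is null-safe: a null state simply compares unequal.
    public static boolean isRunningWithIdentity(JobState state) {
        return state == JobState.RUNNING;
    }

    // .equals is invoked on the argument, so a null state throws NullPointerException.
    public static boolean isRunningWithEquals(JobState state) {
        return state.equals(JobState.RUNNING);
    }

    public static void main(String[] args) {
        System.out.println(isRunningWithIdentity(JobState.RUNNING));
        System.out.println(isRunningWithIdentity(null));
        try {
            isRunningWithEquals(null);
        } catch (NullPointerException e) {
            System.out.println("equals on a null enum reference throws NullPointerException");
        }
    }
}
```

Each enum constant is a singleton, so the two forms agree for non-null values.  The compiler
also rejects == between two unrelated enum types, whereas .equals accepts any Object and
just returns false.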

The HistoryFileManager manages history files.  Ideally it is the only component that interacts
with them, for thread-safety reasons.  These concurrency issues are addressed in a follow on

The data from the history files could be stored somewhere else in the future, but it will
always start out being written to HDFS.  It just may be transferred to another location after
it is written to HDFS.  Initially OtherStorages are likely to act like a cache and provide
faster access to data already stored in HDFS.  Once issues with that are worked out, and
if the community feels comfortable with doing so, it may change to storing data that is not
backed by files in HDFS.
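
As a rough sketch of the "storage acting as a cache" idea described above: the interface and
class names below (JobStorage, CachedJobStorage, CountingStorage) are hypothetical and are
not the API in the attached patch, and jobs are represented as plain strings to keep the
example self-contained.  It shows a bounded LRU cache that falls back to an authoritative
store, which here stands in for the files in HDFS.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CachingStorageSketch {
    /** Hypothetical pluggable storage interface; not the actual Hadoop API. */
    public interface JobStorage {
        String getJob(String jobId);
    }

    /** Bounded LRU cache in front of a slower, authoritative store. */
    public static class CachedJobStorage implements JobStorage {
        private final JobStorage backing;
        private final Map<String, String> cache;

        public CachedJobStorage(JobStorage backing, final int maxEntries) {
            this.backing = backing;
            // accessOrder=true makes iteration order least-recently-used first.
            this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > maxEntries;
                }
            };
        }

        // synchronized because LinkedHashMap is not thread safe.
        @Override
        public synchronized String getJob(String jobId) {
            String job = cache.get(jobId);
            if (job == null) {
                job = backing.getJob(jobId); // cache miss: read the authoritative copy
                if (job != null) {
                    cache.put(jobId, job);
                }
            }
            return job;
        }
    }

    /** Stand-in for the authoritative store; counts how often it is read. */
    public static class CountingStorage implements JobStorage {
        public int loads = 0;

        @Override
        public String getJob(String jobId) {
            loads++;
            return "parsed:" + jobId;
        }
    }

    public static void main(String[] args) {
        CountingStorage hdfsLike = new CountingStorage();
        JobStorage storage = new CachedJobStorage(hdfsLike, 2);
        storage.getJob("job_1");
        storage.getJob("job_1"); // served from the cache
        System.out.println("backing loads: " + hdfsLike.loads);
    }
}
```

Because callers only see the interface, a different implementation could later replace the
cache without changing them.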
> The history server should have a separate pluggable storage/query interface
> ---------------------------------------------------------------------------
>                 Key: MAPREDUCE-4059
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4059
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 0.24.0, 0.23.3
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>         Attachments: MR-4059.txt, MR-4059.txt, MR-4059.txt, MR-4059.txt
> The history server currently caches all parsed jobs in RAM.  These jobs can be very large
> because of counters.  It would be nice to have a pluggable interface for the caching and
> querying of the cached data so that we can play around with different implementations.  Also,
> just for cleanliness of the code, it would be nice to split the very large JobHistoryServer.java
> into a few smaller ones that are more understandable and readable.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

