hadoop-mapreduce-issues mailing list archives

From "Krishna Ramachandran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-323) Improve the way job history files are managed
Date Wed, 25 Aug 2010 01:52:21 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902271#action_12902271 ]

Krishna Ramachandran commented on MAPREDUCE-323:
------------------------------------------------

A few comments to start with:

*JobHistory.java*

{quote}
  private static final SortedMap<Long, String>jobToDirectoryMap
    = new TreeMap<Long, String>();
{quote}

How is this map used?

{quote}
  public String getConfFilePath(JobID jobId) {
    MovedFileInfo info = jobHistoryFileMap.get(jobId);
    if (info == null) {
      return null;
    }
    final Path historyFileDir
      = (new Path(getHistoryFilePath(jobId))).getParent();
    return getConfFile(historyFileDir, jobId).toString();
  }
{quote}

Doesn't "info" already have this data, in _info.historyFile_? If so, it could be used instead of calling getHistoryFilePath(jobId) again.
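
A minimal sketch of that suggestion, under stated assumptions: that MovedFileInfo exposes the history file location as a field named historyFile (the field's real name and type are not confirmed here), with java.nio.file.Path standing in for Hadoop's Path and String standing in for JobID. The confFile helper and its "_conf.xml" suffix are hypothetical placeholders for getConfFile.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

class ConfPathSketch {
  // Hypothetical stand-in for MovedFileInfo; the real class may differ.
  static final class MovedFileInfo {
    final Path historyFile;
    MovedFileInfo(Path historyFile) { this.historyFile = historyFile; }
  }

  // Stand-in for jobHistoryFileMap, keyed by a String job id for brevity.
  static final Map<String, MovedFileInfo> jobHistoryFileMap = new HashMap<>();

  // Hypothetical naming rule standing in for getConfFile(dir, jobId).
  static Path confFile(Path dir, String jobId) {
    return dir.resolve(jobId + "_conf.xml");
  }

  static String getConfFilePath(String jobId) {
    MovedFileInfo info = jobHistoryFileMap.get(jobId);
    if (info == null) {
      return null;
    }
    // Use the entry already fetched instead of a second lookup by job id.
    Path historyFileDir = info.historyFile.getParent();
    if (historyFileDir == null) {
      return null; // history file has no parent directory
    }
    return confFile(historyFileDir, jobId).toString();
  }
}
```

The point is only that the null check and the path derivation work from the same map entry, so the two cannot disagree.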

I suggest a simple modification to setupEventWriter:

{quote}
  public void setupEventWriter(JobID jobId, JobConf jobConf)
  throws IOException {
    if (logDir == null) {
      LOG.info("Log Directory is null, returning");
      throw new IOException("Missing Log Directory for History");
    }
    MetaInfo oldFi = fileMap.get(jobId);

    long submitTime = (oldFi == null ? System.currentTimeMillis() : oldFi.submitTime);

    String user = getUserName(jobConf);
    String jobName = getJobName(jobConf);
....
{quote}

On ThreadPoolExecutor: why was the pool size increased?

{quote}
canonicalHistoryLogDir(JobId,...)
{quote}

jobId is not used in the following:

{quote}
canonicalHistoryLogDir(
{quote}

In this block

{quote}
synchronized (ueState) {
......
+            iShouldMonitor = true;
+
+            ueState.unindexedElements = new LinkedList<JobHistoryIndexElement>();
+            ueState.currentDoneSubdirectory = resultDir;
+
+            ueState.monitoredDirectory = resultDir;
.....
+          ueState.unindexedElements.
+            add(new JobHistoryIndexElement(millisecondTime, id, metaInfo));

{quote}

This code is not entirely clear.
Should we increment the count here (unindexedElementCount++) when adding to _unindexedElements_?

Related item:
get/addUnindexedElements(): who calls these?

In UnindexedElementsState.closeCurrentDirectory():

{quote}
      OutputStream newIndexOStream = null;
      PrintStream newIndexPStream = null;
{quote}

are unused.

{quote}
+        // time, because iShouldMonitor is only set true when
+        // ueState.monitoredDirectory changes, which will force the
+        // current incumbent to abend at the earliest opportunity.
+        while (iShouldMonitor) {
+          int roundCounter = 0;
+
+          int interruptionsToAbort = 2;
+
+          try {
+            Thread.sleep(1000);
+          } catch (InterruptedException e) {
+            if (--interruptionsToAbort == 0) {
+              return;
+            }
+          }
+
+          synchronized (ueState) {
+            if (ueState.monitoredDirectory != resultDir) {
+              // someone else closed out the directory I was monitoring
+              iShouldMonitor = false;
+            } else if (++roundCounter % 30 == 0) {
+              interruptionsToAbort = 2;
+

{quote}

This is a busy-wait loop with an arbitrary 1-second sleep. Can this check go on for up to a maximum of 1 hour?
Also, the 5-minute checkpoint does not appear to set anything?

{quote}
              } else if (++roundCounter % 300 == 0) {
                // called for side effect -- a 5 minute checkpoint
                try {
                  ueState.getACurrentIndex(ueState.currentDoneSubdirectory);  // why?
                } catch (IOException e) {
                  LOG.warn("Couldn't build an interim Job History index for "
                         + ueState.currentDoneSubdirectory);
                }                  

{quote}
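
On the polling loops quoted above, one possible alternative to sleeping in 1-second steps is to block on the state object's monitor with a timeout, and have whoever changes the monitored directory call notifyAll(). The following is only a sketch under assumptions: State is a hypothetical stand-in for the ueState object, directories are plain Strings, and the checkpoint is reduced to a counter where the interim index would be built. It also stops on the first interruption, unlike the patch, which tolerates one.

```java
import java.util.concurrent.TimeUnit;

class DirectoryMonitorSketch {
  // Hypothetical stand-in for the shared state guarded by synchronized (ueState).
  static final class State {
    String monitoredDirectory;
  }

  /**
   * Blocks until state.monitoredDirectory differs from myDir, taking a
   * checkpoint each time checkpointMillis elapses. Returns the number of
   * checkpoints taken. Stops early if the thread is interrupted.
   */
  static int monitor(State state, String myDir, long checkpointMillis) {
    int checkpoints = 0;
    long interval = TimeUnit.MILLISECONDS.toNanos(checkpointMillis);
    synchronized (state) {
      long deadline = System.nanoTime() + interval;
      while (myDir.equals(state.monitoredDirectory)) {
        long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
        if (remainingMs <= 0) {
          checkpoints++; // placeholder for building an interim index
          deadline = System.nanoTime() + interval;
          continue;
        }
        try {
          // Releases the monitor while waiting; woken by notifyAll() or timeout.
          state.wait(remainingMs);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt status
          return checkpoints;
        }
      }
    }
    return checkpoints;
  }

  /** Called by whoever closes out the directory; wakes any waiting monitor. */
  static void switchDirectory(State state, String newDir) {
    synchronized (state) {
      state.monitoredDirectory = newDir;
      state.notifyAll();
    }
  }
}
```

With this shape the monitor reacts to a directory change immediately rather than at the next 1-second tick, and the checkpoint interval is an explicit parameter rather than a round count.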

> Improve the way job history files are managed
> ---------------------------------------------
>
>                 Key: MAPREDUCE-323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-323
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Amar Kamat
>            Assignee: Dick King
>            Priority: Critical
>         Attachments: MR323--2010-08-20--1533.patch
>
>
> Today all the jobhistory files are dumped in one _job-history_ folder. This can cause
> problems when there is a need to search the history folder (job-recovery etc). It would be
> nice if we group all the jobs under a _user_ folder. So all the jobs for user _amar_ will
> go in _history-folder/amar/_. Jobs can be categorized using various features like _jobid,
> date, jobname_ etc but using _username_ will make the search much more efficient and also
> will not result into namespace explosion.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

