Message-ID: <12140486.1183072444751.JavaMail.jira@brutus>
Date: Thu, 28 Jun 2007 16:14:04 -0700 (PDT)
From: "Michael Bieniosek (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Updated: (HADOOP-1524) Task Logs userlogs don't show up for a while
In-Reply-To: <7849369.1182629605853.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/HADOOP-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Bieniosek updated HADOOP-1524:
--------------------------------------

    Attachment: eliminate-split-idx.patch

This patch eliminates the use of split.idx. Instead, get the information directly from the file system.

> Task Logs userlogs don't show up for a while
> ---------------------------------------------
>
>                 Key: HADOOP-1524
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1524
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.13.0
>            Reporter: Michael Bieniosek
>         Attachments: eliminate-split-idx.patch
>
>
> When I start a task and go to the task logs, nothing shows up for a while. An examination of TaskLog.Writer and TaskLog.Reader reveals:
> 1. The TaskLog.Reader relies on the presence of a split.idx to identify the parts of the logs to display.
> 2. The TaskLog.Writer only updates the split.idx file when it moves on to the next log.
> As a result, updates to the log only get pushed when an entire file is done.
> Why is there a split.idx file? It seems that since files are called part-00000, part-00001, etc., the TaskLog.Reader can just look at all files and arrange them by alphabetical order. The split.idx file also contains file length, but this data is already stored by the filesystem.
> If nobody has objections, I'd like to write a patch to eliminate the split.idx file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
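
For illustration, the idea in the issue description (list the part-NNNNN files, order them by name, and take each length from the file system) could be sketched roughly as below. This is not the content of eliminate-split-idx.patch. The names `PartFileLister`, `Part`, and `listParts` are hypothetical, and the sketch uses plain `java.io.File` rather than the real TaskLog.Reader code. It also assumes that part names are zero-padded to a fixed width, so that alphabetical order matches write order.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: discover task log parts without a split.idx file. */
public class PartFileLister {

    /** One log part and its length as reported by the file system. */
    public static final class Part {
        public final String name;
        public final long length;

        Part(String name, long length) {
            this.name = name;
            this.length = length;
        }
    }

    /** Returns the part-NNNNN files found in logDir, sorted by name. */
    public static List<Part> listParts(File logDir) {
        // Only names of the form part-<digits> count as log parts,
        // so a leftover split.idx (or anything else) is ignored.
        File[] files = logDir.listFiles((dir, name) -> name.matches("part-\\d+"));
        List<Part> parts = new ArrayList<>();
        if (files == null) {
            return parts; // directory is missing or unreadable
        }
        // Assumption: names are zero-padded to the same width, so plain
        // lexicographic order is also the order the parts were written in.
        Arrays.sort(files, (a, b) -> a.getName().compareTo(b.getName()));
        for (File f : files) {
            // The length comes from the file system at read time, so a part
            // that is still being written is visible with its current size.
            parts.add(new Part(f.getName(), f.length()));
        }
        return parts;
    }
}
```

Because nothing is cached in an index file, a reader built this way would see the newest part as soon as the writer creates it, which is the behaviour the issue asks for.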