Date: Tue, 12 Jul 2011 05:13:00 +0000 (UTC)
From: "Amar Kamat (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Message-ID: <815410613.5025.1310447580167.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Assigned] (MAPREDUCE-778) [Rumen] Need a standalone JobHistory log anonymizer

     [ https://issues.apache.org/jira/browse/MAPREDUCE-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat reassigned MAPREDUCE-778:
------------------------------------

    Assignee: Amar Kamat

> [Rumen] Need a standalone JobHistory log anonymizer
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-778
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-778
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: tools/rumen
>            Reporter: Hong Tang
>            Assignee: Amar Kamat
>              Labels: anonymization, rumen
>         Attachments: anonymizer.patch, anonymizer.py, same.py
>
>
> Job history logs contain a rich set of information that can help understand and characterize cluster workload and individual job execution. Examples of work that parses or utilizes job history include HADOOP-3585, MAPREDUCE-534, HDFS-459, MAPREDUCE-728, and MAPREDUCE-776. Some of the parsing tools developed in previous work already contain a component to anonymize the logs. It would be nice to combine these efforts and have a common standalone tool that can anonymize job history logs while preserving much of the structure of the files, so that existing tools built on top of job history logs continue to work with no modification.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira