Date: Fri, 27 Dec 2013 19:53:50 +0000 (UTC)
From: "Cheolsoo Park (JIRA)"
To: pig-dev@hadoop.apache.org
Reply-To: dev@pig.apache.org
Subject: [jira] [Created] (PIG-3645) Replace Random with UUID in FileLocalizer.getTemporaryPath()

Cheolsoo Park created PIG-3645:
----------------------------------

             Summary: Replace Random with UUID in FileLocalizer.getTemporaryPath()
                 Key: PIG-3645
                 URL: https://issues.apache.org/jira/browse/PIG-3645
             Project: Pig
          Issue Type: Improvement
          Components: impl
            Reporter: Cheolsoo Park
            Assignee: Cheolsoo Park
            Priority: Minor
             Fix For: 0.13.0

Currently, temporary paths are generated by FileLocalizer using Random.nextInt().
To provide strong randomness, MapReduceLauncher resets the Random object every time it compiles a physical plan to an MR plan:

{code}
MRCompiler comp = new MRCompiler(php, pc);
comp.randomizeFileLocalizer(); // This in turn calls FileLocalizer.setR(new Random()).
{code}

In addition, a couple of other places (e.g. MRCompiler) call FileLocalizer.setR() with some random seed.

I think:
# Randomizing the Random seed is unnecessary if we switch to UUID.
# Setting Random objects in code like this is error-prone because it can easily be broken by an extra or a missing FileLocalizer.setR() call somewhere else. See an example [here|http://search-hadoop.com/m/2nxTzQXfHw1].

So I propose that we remove all this "randomizing Random seed" code and use UUID instead in temporary paths.

For unit tests that compare results against gold files, we should still allow setting the Random seed through FileLocalizer.setR(). But this method will be annotated as "VisibleForTesting" to ensure it is not used anywhere other than in unit tests.

Regarding the existing gold files, they can be easily regenerated by TestMRCompiler as follows:

{code}
FileOutputStream fos = new FileOutputStream(expectedFile + "_new");
PrintWriter pw = new PrintWriter(fos);
pw.write(compiledPlan);
pw.close(); // flush and close so the regenerated gold file is fully written
{code}

I assume there won't be any regressions due to this change. But please let me know if I am wrong.
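
For illustration only, the difference between the two naming schemes could be sketched roughly as below. This is not the actual patch or the real FileLocalizer code: the class name TempPathSketch, both method names, the "tmp-" prefix, and the "/tmp/pig" base directory are all hypothetical. The sketch only shows that a seeded Random yields reproducible names (which is what the gold-file tests depend on), while a UUID-based name needs no shared generator to seed or reset.

```java
import java.util.Random;
import java.util.UUID;

// Hypothetical sketch; NOT the actual Pig FileLocalizer implementation.
final class TempPathSketch {
    // Prefix chosen for illustration only (an assumption).
    private static final String PREFIX = "tmp-";

    // Random-based naming: the result depends entirely on the state of
    // the Random passed in, so two generators with the same seed collide.
    static String randomBasedPath(String baseDir, Random r) {
        return baseDir + "/" + PREFIX + r.nextInt();
    }

    // UUID-based naming: no shared generator state to seed or reset.
    static String uuidBasedPath(String baseDir) {
        return baseDir + "/" + PREFIX + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Same seed on both generators, hence the same path.
        String a = randomBasedPath("/tmp/pig", new Random(42L));
        String b = randomBasedPath("/tmp/pig", new Random(42L));
        System.out.println(a.equals(b)); // prints true

        // Each call draws a fresh random (version 4) UUID.
        System.out.println(uuidBasedPath("/tmp/pig"));
        System.out.println(uuidBasedPath("/tmp/pig"));
    }
}
```

The first method also hints at why gold-file tests would still need a seedable path: with a fixed seed, the generated names are stable across runs and can be compared against the expected plan text.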