From: arina-ielchiieva
To: dev@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [GitHub] drill pull request #574: DRILL-4726: Dynamic UDFs support
Content-Type: text/plain
Message-Id: <20160908153344.578FBE0230@git1-us-west.apache.org>
Date: Thu, 8 Sep 2016 15:33:44 +0000 (UTC)
archived-at: Thu, 08 Sep 2016 15:33:46 -0000

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/574#discussion_r78030815

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFunctionRegistry.java ---
    @@ -64,62 +76,134 @@
           .put("CONVERT_FROM", Pair.of(2, 2))
           .put("FLATTEN", Pair.of(1, 1)).build();
    +  /** Registers all functions present in Drill classpath on start-up. All functions will be marked as built-in.*/
       public DrillFunctionRegistry(ScanResult classpathScan) {
    +    validate(BUILT_IN, classpathScan);
    +    register(BUILT_IN, classpathScan, this.getClass().getClassLoader());
    +    if (logger.isTraceEnabled()) {
    +      StringBuilder allFunctions = new StringBuilder();
    +      for (DrillFuncHolder method: registryHolder.getAllFunctionsWithHolders().values()) {
    +        allFunctions.append(method.toString()).append("\n");
    +      }
    +      logger.trace("Registered functions: [\n{}]", allFunctions);
    +    }
    +  }
    +
    +  /**
    +   * Validates all functions, present in jars.
    +   * Will throw {@link FunctionValidationException} if:
    +   * 1. Jar with the same name has been already registered.
    +   * 2. Conflicting function with the similar signature is found.
    +   * 3. Aggregating function is not deterministic.
    +   *
    +   * @return list of validated functions
    +   */
    +  public List validate(String jarName, ScanResult classpathScan) {
    +    List functions = Lists.newArrayList();
         FunctionConverter converter = new FunctionConverter();
         List providerClasses = classpathScan.getAnnotatedClasses();
    -    // Hash map to prevent registering functions with exactly matching signatures
    -    // key: Function Name + Input's Major Type
    -    // value: Class name where function is implemented
    -    //
    -    final Map functionSignatureMap = new HashMap<>();
    +    if (registryHolder.containsJar(jarName)) {
    +      throw new FunctionValidationException(String.format("Jar %s is already registered", jarName));
    +    }
    --- End diff --

As you noted, creating built-in functions is safe here, since they are registered at start-up.
The race condition you are describing is handled by remote registry versioning (and thus by ZooKeeper itself).

As you know, we have two validation steps: local and remote. This method is responsible for local validation.
Let's say we have:

Thread1, which registers Jar1 containing F1(VARCHAR-REQUIRED)
Thread2, which registers Jar2 containing F1(VARCHAR-REQUIRED)

Since F1(VARCHAR-REQUIRED) is absent from the LOCAL function registry, both threads pass local validation. Then they start remote validation. Each thread retrieves the remote function registry at version 1. Since F1(VARCHAR-REQUIRED) is also absent from the REMOTE function registry, both threads pass remote validation. Each thread then updates its copy of the remote function registry and tries to send it to ZooKeeper.

This part is controlled by ZooKeeper: one thread will write its updated remote registry first, and the remote registry version will change to 2. The other thread will then get a VersionMismatchException. In that case the thread loads the remote registry at version 2 and runs remote validation again, during which it detects the duplicate and sends an appropriate response to the user.
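To make the Thread1/Thread2 sequence concrete, here is a minimal in-memory sketch of the version-checked update. It is an illustration only, not the code in this pull request: RemoteRegistrySketch, VersionedStore, Snapshot, validateAndWrite and register are hypothetical names, the store is a plain synchronized object standing in for ZooKeeper, and both exception classes are local stand-ins that merely reuse the names mentioned above.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of optimistic, version-checked registry updates. */
class RemoteRegistrySketch {

  /** Local stand-in for the exception named in the comment above. */
  static class VersionMismatchException extends RuntimeException {
    VersionMismatchException(String message) { super(message); }
  }

  /** Local stand-in for the validation failure reported to the user. */
  static class FunctionValidationException extends RuntimeException {
    FunctionValidationException(String message) { super(message); }
  }

  /** Copy of the remote registry as one thread saw it at a given version. */
  static final class Snapshot {
    final int version;
    final Set<String> signatures;

    Snapshot(int version, Set<String> signatures) {
      this.version = version;
      this.signatures = signatures;
    }
  }

  /** In-memory stand-in for the versioned store (ZooKeeper in the real setup). */
  static final class VersionedStore {
    private int version = 1;
    private Set<String> signatures = new HashSet<>();

    synchronized Snapshot read() {
      return new Snapshot(version, new HashSet<>(signatures));
    }

    /** Conditional write: accepted only if nobody has written since expectedVersion. */
    synchronized void write(Set<String> updated, int expectedVersion) {
      if (expectedVersion != version) {
        throw new VersionMismatchException(
            "expected version " + expectedVersion + ", found " + version);
      }
      signatures = new HashSet<>(updated);
      version++;
    }
  }

  /** Remote validation against an already fetched snapshot, then a conditional write. */
  static void validateAndWrite(VersionedStore store, Snapshot seen, String signature) {
    if (seen.signatures.contains(signature)) {
      throw new FunctionValidationException("Duplicate function: " + signature);
    }
    Set<String> updated = new HashSet<>(seen.signatures);
    updated.add(signature);
    store.write(updated, seen.version);
  }

  /** Reloads and revalidates on a version mismatch; returns false on a duplicate. */
  static boolean register(VersionedStore store, String signature) {
    while (true) {
      Snapshot seen = store.read();
      try {
        validateAndWrite(store, seen, signature);
        return true;
      } catch (VersionMismatchException e) {
        // Someone else wrote first: loop to reload the registry and validate again.
      } catch (FunctionValidationException e) {
        return false;
      }
    }
  }
}
```

If two callers both take a snapshot at version 1 and both call validateAndWrite with the same signature, the first write succeeds and bumps the version to 2, while the second fails with the stand-in VersionMismatchException. Revalidating against a fresh snapshot then reports the duplicate, which is the outcome described for the second thread.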