drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4963) Issues when overloading Drill native functions with dynamic UDFs
Date Mon, 23 Jan 2017 13:54:26 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834526#comment-15834526
] 

ASF GitHub Bot commented on DRILL-4963:
---------------------------------------

Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/701
  
    @paul-rogers as we discussed, I have tried to find a way to preserve the lazy-init approach.
    I have renamed the PR to reflect the latest changes. Please find the new solution and a description
of the changes below:
     
    Lazy-init was performed only when a function was not found during Calcite parsing, but DRILL-4963
shows cases where Calcite parsing passes (usually with function overloading)
but the function is still not found. To handle such cases, we need to enhance the lazy-init process:
    1. The lazy-init process should be more lightweight. Currently, when a function is not found,
we load all jars from the remote function registry and compare them with the jars from the local function registry.
This is not optimal, especially when both registries are in sync. To improve performance, I have
introduced a remote function registry version, which can be used to check whether we need to sync
the remote and local registries before checking jars and functions.
    2. During the parsing stage we were only catching the Calcite parsing exception, but a missing
function can also be indicated by a Drill function error and so on. To avoid enumerating
all possible exceptions (more can be added to the code later), we check whether the remote and local
function registries are in sync on any error. This check only affects queries that would
fail anyway, so if a failure takes a little longer, it won't make a significant difference.
    3. During the execution stage Drill attempts to find a matching function in the list of functions
with the same name. `DefaultFunctionResolver.getBestMatch()` does not do an exact match; it
may return a function with different input parameter types. The best match is found according to
the rules described in `TypeCastRules.class`. Currently we attempt to sync the remote and local function
registries only if the best matching function was not found, but this is not correct: even if Drill
finds the best matching function among the current functions, that does not mean that the remote function
registry does not hold an even better match. To fix this issue, we first try
to find the function using `ExactFunctionResolver.getBestMatch()`, and if an exactly matching function
is not found, we check whether the remote and local function registries are in sync and then use
`DefaultFunctionResolver.getBestMatch()` to find the best matching function. But if an exactly
matching function is found, we return it right away without any registry sync checks.
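
    The flow in items 1 and 3 can be sketched roughly as follows. This is a simplified model, not Drill's actual code: the class `LazyInitSketch`, its `Registry` type, and the string-based signatures are hypothetical, and the `bestMatch` stand-in ignores the real casting rules from `TypeCastRules`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the proposed flow; not Drill's real classes.
final class LazyInitSketch {

    /** A registry reduced to a version number and a list of signatures such as "log(VARCHAR)". */
    static final class Registry {
        long version;
        final List<String> signatures = new ArrayList<>();
    }

    final Registry local = new Registry();
    final Registry remote = new Registry();
    int syncCount = 0; // counts how often registry content was actually copied

    /** Cheap check: compare versions first, copy content only when they differ. */
    boolean syncIfStale() {
        if (local.version == remote.version) {
            return false; // registries are in sync, nothing to load
        }
        local.signatures.clear();
        local.signatures.addAll(remote.signatures);
        local.version = remote.version;
        syncCount++;
        return true;
    }

    /** Exact match on name and argument type, or null if there is none. */
    String exactMatch(String name, String argType) {
        String wanted = name + "(" + argType + ")";
        return local.signatures.contains(wanted) ? wanted : null;
    }

    /** Stand-in for best match: prefer exact, else any overload with the same name. */
    String bestMatch(String name, String argType) {
        String exact = exactMatch(name, argType);
        if (exact != null) {
            return exact;
        }
        for (String signature : local.signatures) {
            if (signature.startsWith(name + "(")) {
                return signature;
            }
        }
        return null;
    }

    /** Exact match returns right away; otherwise check sync, then fall back to best match. */
    String resolve(String name, String argType) {
        String exact = exactMatch(name, argType);
        if (exact != null) {
            return exact; // no registry sync check needed
        }
        syncIfStale();
        return bestMatch(name, argType);
    }
}
```

    In this model, a query that hits an exact local match never touches the remote registry, and a query that misses pays only for a version comparison when the registries are already in sync.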
    
    Changes:
    1. Added `consists` method to the PersistentStore interface, which returns true if the key exists
in the store and false otherwise. This method is needed to return only the remote function registry version
without its content (unlike the `get` method). We'll pull the remote function registry content only
if the versions are different.
    2. Added a check that the remote and local function registries are in sync on any failure during
the planning stage and when an exactly matching function is not found during the execution stage.
    3. Added more debug messages to `CreateFunctionHandler` and `DropFunctionHandler`.
    4. Updated unit tests to reflect new changes.
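
    The planning-stage check in change 2 could look something like the sketch below. It assumes that planning is retried once when the sync check finds the registries were out of date; the retry itself, the class `PlanRetrySketch`, and the `planWithRetry` helper are hypothetical and are not taken from the PR.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: on any planning failure, check registry sync and retry once.
final class PlanRetrySketch {

    /**
     * Runs the planner. On any runtime failure, asks syncIfStale whether the local
     * registry had to be refreshed; if so, plans again, otherwise rethrows the error.
     */
    static <T> T planWithRetry(Supplier<T> planner, BooleanSupplier syncIfStale) {
        try {
            return planner.get();
        } catch (RuntimeException e) {
            // No attempt to enumerate exception types: any failure triggers the check.
            if (syncIfStale.getAsBoolean()) {
                return planner.get(); // registries were out of sync, so try again
            }
            throw e; // already in sync: the query fails as it would have anyway
        }
    }
}
```

    Only failing queries pay for the extra check, which matches the reasoning that a slightly slower failure does not matter much.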



> Issues when overloading Drill native functions with dynamic UDFs
> ----------------------------------------------------------------
>
>                 Key: DRILL-4963
>                 URL: https://issues.apache.org/jira/browse/DRILL-4963
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.9.0
>            Reporter: Roman
>            Assignee: Arina Ielchiieva
>             Fix For: Future
>
>         Attachments: subquery_udf-1.0.jar, subquery_udf-1.0-sources.jar, test_overloading-1.0.jar,
test_overloading-1.0-sources.jar
>
>
> I created a jar file which overloads 3 Drill native functions (LOG(VARCHAR-REQUIRED), CURRENT_DATE(VARCHAR-REQUIRED)
and ABS(VARCHAR-REQUIRED,VARCHAR-REQUIRED)) and registered it as a dynamic UDF.
> If I try to use my functions, I get errors:
> {code:xml}
> SELECT CURRENT_DATE('test') FROM (VALUES(1));
> {code}
> Error: FUNCTION ERROR: CURRENT_DATE does not support operand types (CHAR)
> SQL Query null
> {code:xml}
> SELECT ABS('test','test') FROM (VALUES(1));
> {code}
> Error: FUNCTION ERROR: ABS does not support operand types (CHAR,CHAR)
> SQL Query null
> {code:xml}
> SELECT LOG('test') FROM (VALUES(1));
> {code}
> Error: SYSTEM ERROR: DrillRuntimeException: Failure while materializing expression in
constant expression evaluator LOG('test').  Errors: 
> Error in expression at index -1.  Error: Missing function implementation: castTINYINT(VARCHAR-REQUIRED).
 Full expression: UNKNOWN EXPRESSION.
> But if I rerun all these queries after the "DrillRuntimeException", they run correctly.
It seems that Drill has not updated the function signature before that error. Also, if I add
the jar as a regular UDF (copy the jar to /drill_home/jars/3rdparty and restart the drillbits), all queries
run correctly without errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
