hive-user mailing list archives

From Jason Dere <jd...@hortonworks.com>
Subject Re: Hive Custom UDF evaluate behavior when @UDFType is set
Date Tue, 10 Apr 2018 18:32:29 GMT
This might have to do with constant propagation, since the function was listed as deterministic.
You could try logging the stack trace inside evaluate during execution and pasting both stack
traces here; that may give more clues as to what is going on.
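
To illustrate the idea, here is a toy model in plain Java. It is only a sketch and is not Hive's
optimizer code: ToyUdf, Plan, plan and execute are hypothetical names made up for this example.
It assumes a planner that pre-computes a function marked deterministic when all of its arguments
are constants. It does not try to reproduce the exact double call you saw; it only shows how the
deterministic flag can cause evaluate to run during planning, outside normal row processing.

```java
public class ConstantFoldingSketch {

    /** Hypothetical stand-in for a UDF with a side effect (a call counter). */
    static final class ToyUdf {
        final boolean deterministic;
        int calls = 0;

        ToyUdf(boolean deterministic) {
            this.deterministic = deterministic;
        }

        /** Increments the counter and returns it, like the UDF in the question. */
        int evaluate() {
            calls++;
            return calls;
        }
    }

    /** Hypothetical plan node: either a folded constant or a deferred call. */
    static final class Plan {
        final ToyUdf udf;
        final Integer foldedValue; // non-null only if the call was folded

        Plan(ToyUdf udf, Integer foldedValue) {
            this.udf = udf;
            this.foldedValue = foldedValue;
        }
    }

    /** Assumed rule: fold only deterministic functions whose arguments are constants. */
    static Plan plan(ToyUdf udf, boolean argsAreConstant) {
        if (udf.deterministic && argsAreConstant) {
            return new Plan(udf, udf.evaluate()); // evaluate runs at plan time
        }
        return new Plan(udf, null);
    }

    /** Uses the folded value if present, otherwise calls evaluate at run time. */
    static int execute(Plan plan) {
        if (plan.foldedValue != null) {
            return plan.foldedValue;
        }
        return plan.udf.evaluate();
    }

    public static void main(String[] args) {
        ToyUdf det = new ToyUdf(true);
        plan(det, true);
        System.out.println("deterministic, calls after planning: " + det.calls);

        ToyUdf nonDet = new ToyUdf(false);
        Plan deferred = plan(nonDet, true);
        System.out.println("non-deterministic, calls after planning: " + nonDet.calls);
        execute(deferred);
        System.out.println("non-deterministic, calls after execution: " + nonDet.calls);
    }
}
```

In this model a call on column values (argsAreConstant = false) is never folded, which would be
consistent with the count being correct when the UDF is invoked on a table.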


________________________________
From: PradeepKumar Yadav <Pradeep.Yadav@protegrity.com>
Sent: Monday, April 9, 2018 11:35 PM
To: user@hive.apache.org
Subject: Hive Custom UDF evaluate behavior when @UDFType is set

Hi,
While creating a custom generic Hive UDF recently, I came across unexpected behavior in the
evaluate method. The UDF increments a counter and writes it to a file. When I execute it
directly, without involving any table, it always returns one extra count, i.e. 2.

When I added logging (sysout) inside the evaluate method, I observed that the logs were printed
twice. On further research I came across the @UDFType annotation and found that if it is not
provided in a custom UDF, the default is deterministic = true.

When I add the annotation to my custom UDF and set @UDFType( deterministic = false ), the logs
are printed only once and the UDF returns the accurate count, i.e. 1. This implies that evaluate
is called only once when @UDFType( deterministic = false ) is set.

I would like to understand the connection between @UDFType and the evaluate method when the UDF
is invoked directly, without a table.

Note: When I invoke my UDF on a table I get the correct count even with
@UDFType( deterministic = true ).

Thanks in advance. :)
Regards,
PradeepKumar Yadav
