hive-dev mailing list archives

From "Eric Hanson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
Date Sat, 24 Aug 2013 00:13:51 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749189#comment-13749189 ]

Eric Hanson commented on HIVE-4961:
-----------------------------------

Completed a working version of the bridge that allows custom UDFs that are
subclasses of UDF to work in vectorized mode. It supports UDFs whose evaluate()
methods take and return boxed types (e.g. Long), Writable types (e.g. LongWritable),
and primitive types (e.g. long). Generic UDFs are not supported; that will be the
subject of a future patch.
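
To illustrate the general idea of such a bridge, here is a minimal sketch, not
the code in the attached patches. It assumes a hypothetical row-mode UDF class
AddLongs with an evaluate(Long, Long) method, and a hypothetical helper
RowWiseBridgeSketch.applyToBatch that calls evaluate() once per row of a batch
held in plain long[] columns. A real Hive UDF would extend
org.apache.hadoop.hive.ql.exec.UDF and the batch would use Hive's own vector
types; both are left out so the example compiles on its own.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical stand-in for a user-written row-mode UDF (assumption: a real
// one would extend org.apache.hadoop.hive.ql.exec.UDF).
class AddLongs {
    public Long evaluate(Long a, Long b) {
        if (a == null || b == null) {
            return null;
        }
        return a + b;
    }
}

// Illustrative bridge: runs a row-at-a-time evaluate() over a whole batch.
class RowWiseBridgeSketch {

    // Calls udf.evaluate(Long, Long) for the first `size` rows of the two
    // input columns. A null result sets outIsNull[row] instead of out[row].
    static void applyToBatch(Object udf, long[] colA, long[] colB, int size,
                             long[] out, boolean[] outIsNull) throws Exception {
        // Look up the evaluate() overload once per batch, not once per row.
        Method evaluate = udf.getClass().getMethod("evaluate", Long.class, Long.class);
        evaluate.setAccessible(true);
        for (int row = 0; row < size; row++) {
            // The primitive longs are boxed to Long when passed to invoke().
            Object result = evaluate.invoke(udf, colA[row], colB[row]);
            if (result == null) {
                outIsNull[row] = true;
            } else {
                outIsNull[row] = false;
                out[row] = (Long) result;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        long[] a = {1L, 2L, 3L};
        long[] b = {10L, 20L, 30L};
        long[] out = new long[3];
        boolean[] outIsNull = new boolean[3];
        applyToBatch(new AddLongs(), a, b, 3, out, outIsNull);
        System.out.println(Arrays.toString(out)); // prints [11, 22, 33]
    }
}
```

The per-row reflective call keeps existing UDFs usable without changes, at the
cost of boxing and call overhead on every row.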

I did manual testing for a large set of UDFs taking and returning the types supported
by vectorization: tinyint, smallint, int, bigint, float, double, boolean, string, timestamp.

UDFs with one argument and with multiple arguments were tested, using both
constant and variable arguments.

Including the tests with the patch, or doing another patch with end-to-end tests, is yet to
be done.
                
> Create bridge for custom UDFs to operate in vectorized mode
> -----------------------------------------------------------
>
>                 Key: HIVE-4961
>                 URL: https://issues.apache.org/jira/browse/HIVE-4961
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>            Assignee: Eric Hanson
>         Attachments: vectorUDF.4.patch, vectorUDF.5.patch
>
>
> Suppose you have a custom UDF myUDF() that you've created to extend Hive. The goal of
> this JIRA is to create a facility where if you run a query that uses myUDF() in an expression,
> the query will run in vectorized mode.
> This would be a general-purpose bridge for custom UDFs that users add to Hive. It would
> work with existing UDFs.
> I'm considering a separate JIRA for a new kind of custom UDF implementation that is vectorized
> from the beginning, to optimize performance. That is not covered by this JIRA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
