hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-867) Add add UDFs found in mysq
Date Tue, 20 Oct 2009 03:55:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767678#action_12767678 ]

Zheng Shao commented on HIVE-867:
---------------------------------

If you almost always need a String parameter, we can just use "String" as the type of the
parameter in the UDF definition.
If you almost always need to return a String, we can also just return "String".

So for UDFLeft and UDFRight, we can do:
{code}
  public String evaluate(String s, IntWritable r);
{code}
instead of
{code}
  public Text evaluate(Text s, IntWritable r);
{code}

This will save a lot of conversions if users write "left(right(col, 10), 3)".
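
As a rough illustration of the String-based signatures (a sketch, not the code in the attached patches), the two functions could look like the following. The class name StringUdfSketch is hypothetical, and a plain int stands in for IntWritable so that the example compiles with only the JDK. Note that it counts UTF-16 code units, not Unicode code points.

```java
// Sketch only: plain-JDK stand-ins for the String-based evaluate() methods.
// Assumptions: the class name is made up, and "int n" replaces IntWritable.
final class StringUdfSketch {
    private StringUdfSketch() {}

    /** Returns the leftmost n chars of s (all of s if it is shorter); null in, null out. */
    static String left(String s, int n) {
        if (s == null) {
            return null;
        }
        if (n <= 0) {
            return "";
        }
        return s.substring(0, Math.min(n, s.length()));
    }

    /** Returns the rightmost n chars of s (all of s if it is shorter); null in, null out. */
    static String right(String s, int n) {
        if (s == null) {
            return null;
        }
        if (n <= 0) {
            return "";
        }
        return s.substring(Math.max(0, s.length() - n));
    }
}
```

With these signatures, the nested call StringUdfSketch.left(StringUdfSketch.right(col, 10), 3) passes a String straight from one function to the other, with no intermediate "Text" object.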

The same applies to the SerDe. For example, RegexSerDe returns "String" instead of "Text",
so "left(col, 3)", where col comes from a RegexSerDe table, does not need a "String" -> "Text"
conversion to pass the value to the Left function, followed by a "Text" -> "String" conversion
inside the Left function.

Of course, the most efficient way is to count the characters without UTF-8 encoding/decoding
(in which case we would still prefer Text, because we don't need to create new objects), but
I think we can do that later unless you want to do it now.
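
For reference, counting characters directly on the UTF-8 bytes could be sketched as below. This is only an illustration of the idea, assuming well-formed UTF-8 input; the class and method names are hypothetical. It relies on the fact that every UTF-8 continuation byte has the bit pattern 10xxxxxx, so any other byte starts a new character.

```java
// Sketch only: find how many bytes the first n characters occupy in a UTF-8 buffer,
// without decoding to a String. Assumes the buffer holds well-formed UTF-8.
final class Utf8PrefixSketch {
    private Utf8PrefixSketch() {}

    /**
     * Returns the byte length of the first n characters (code points) stored in
     * bytes[0..length). If the buffer holds fewer than n characters, returns length.
     */
    static int prefixByteLength(byte[] bytes, int length, int n) {
        if (n <= 0) {
            return 0;
        }
        int chars = 0;
        for (int i = 0; i < length; i++) {
            // A byte that is NOT of the form 10xxxxxx begins a new character.
            if ((bytes[i] & 0xC0) != 0x80) {
                if (chars == n) {
                    return i; // character n+1 starts here, so the prefix ends here
                }
                chars++;
            }
        }
        return length;
    }
}
```

A Text-based left() could then keep only that many leading bytes of its input, which is why Text would remain the preferred type for this approach.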

> Add add UDFs found in mysq
> --------------------------
>
>                 Key: HIVE-867
>                 URL: https://issues.apache.org/jira/browse/HIVE-867
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>         Attachments: hive-867-1.diff, hive-867-2.diff, hive-867-3.diff, hive-867-7.diff
>
>
> Some UDFs that MySQL has and Hive does not:
> atan
> aes_decrypt
> aes_encrypt
> bit_and
> bit_count
> bit_length
> bit_or
> bit_xor
> char_length
> char
> character_length
> collation
> compress
> crc32
> encode
> encrypt
> format
> greatest
> in
> inet_aton
> inet_ntoa
> match
> md5
> oct
> ord
> pi
> radians
> sha1 _sha
> sign
> sleep
> truncate

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

