Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by hliu:
http://wiki.apache.org/hadoop/Hive/HiveQL/UDF

[[TableOfContents]]
+ == Built in operators ==
+ Relational Operators  The following operators compare the passed operands and generate
a TRUE or FALSE value depending on whether the comparison between the operands holds or not.
+ <10%>'''Operator'''<10%>'''Operand types''''''Description'''
+  A = B  all primitive types  TRUE if expression A is equal to expression B otherwise
FALSE 
+ A == Bnone!Fails; SQL uses = and not ==!
+ A <> Ball primitive typesTRUE if expression A is NOT equal to expression B otherwise
FALSE
+ A < Ball primitive typesTRUE if expression A is less than expression B otherwise
FALSE
+ A <= Ball primitive typesTRUE if expression A is less than or equal to expression
B otherwise FALSE
+ A > Ball primitive typesTRUE if expression A is greater than expression B otherwise
FALSE
+ A >= Ball primitive typesTRUE if expression A is greater than or equal to expression
B otherwise FALSE
+ A IS NULLall typesTRUE if expression A evaluates to NULL otherwise FALSE
+ A IS NOT NULLall typesTRUE if expression A evaluates to NULL otherwise FALSE
+ A LIKE BstringsTRUE if string A matches the SQL simple regular expression B, otherwise
FALSE. The comparison is done character by character. The _ character in B matches any character
in A(similar to . in posix regular expressions) while the % character in B matches an arbitrary
number of characters in A(similar to .* in posix regular expressions) e.g. 'foobar' like 'foo'
evaluates to FALSE where as 'foobar' like 'foo___' evaluates to TRUE and so does 'foobar'
like 'foo%'
+ A RLIKE BstringsTRUE if string A matches the Java regular expression B(See Java regular
expressions syntax), otherwise FALSE e.g. 'foobar' rlike 'foo' evaluates to FALSE where as
'foobar' rlike '^f.*r$' evaluates to TRUE
+ A REGEXP BstringsSame as RLIKE
+
+ Arithmetic Operators  The following operators support various common arithmetic operations
on the operands. All of them return number types.
+ <10%>'''Operator'''<10%>'''Operand types''''''Description'''
+ A + B all number types Gives the result of adding A and B. The type of the result
is the same as the common parent(in the type hierarchy) of the types of the operands. e.g.
since every integer is a float, therefore float is a containing type of integer so the + operator
on a float and an int will result in a float.
+ A  B all number types Gives the result of subtracting B from A. The type of the result
is the same as the common parent(in the type hierarchy) of the types of the operands.
+ A * B all number types Gives the result of multiplying A and B. The type of the result
is the same as the common parent(in the type hierarchy) of the types of the operands. Note
that if the multiplication causing overflow, you will have to cast one of the operators to
a type higher in the type hierarchy.
+ A / B all number types Gives the result of dividing B from A. The type of the result
is the same as the common parent(in the type hierarchy) of the types of the operands. If the
operands are integer types, then the result is the quotient of the division.
+ A % B all number types Gives the reminder resulting from dividing A by B. The type
of the result is the same as the common parent(in the type hierarchy) of the types of the
operands.
+ A & B all number types Gives the result of bitwise AND of A and B. The type of
the result is the same as the common parent(in the type hierarchy) of the types of the operands.
+ A  B all number types Gives the result of bitwise OR of A and B. The type of the
result is the same as the common parent(in the type hierarchy) of the types of the operands.
+ A ^ B all number types Gives the result of bitwise XOR of A and B. The type of the
result is the same as the common parent(in the type hierarchy) of the types of the operands.
+ ~A all number types Gives the result of bitwise NOT of A. The type of the result is
the same as the type of A. 
+
+ Logical Operators  The following operators provide support for creating logical expressions.
All of them return boolean TRUE or FALSE depending upon the boolean values of the operands.
+ <10%>'''Operator'''<10%>'''Operand types''''''Description
+ A AND B boolean TRUE if both A and B are TRUE, otherwise FALSE
+ A && B boolean Same as A AND B
+ A OR B boolean TRUE if either A or B or both are TRUE, otherwise FALSE
+ A   B boolean Same as A OR B
+ NOT A boolean TRUE if A is FALSE, otherwise FALSE
+ ! A boolean Same as NOT A 
+
+ Operators on Complex Types  The following operators provide mechanisms to access elements
in Complex Types
+ <10%>'''Operator'''<10%>'''Operand types''''''Description'''
+ A[n] A is an Array and n is an int returns the nth element in the array A. The first
element has index 0 e.g. if A is an array comprising of ['foo', 'bar'] then A[0] returns 'foo'
and A[1] returns 'bar'
+ M[key] M is a Map<K, V> and key has type K returns the value corresponding to
the key in the map e.g. if M is a map comprising of {'f' > 'foo', 'b' > 'bar', 'all'
> 'foobar'} then M['all'] returns 'foobar'
+ S.x S is a struct returns the x field of S e.g for struct foobar {int foo, int bar}
foobar.foo returns the integer stored in the foo field of the struct. 
+
+ The following built in functions are supported in hive:
+ <10%>'''Return Type'''<10%>'''Name(Signature)''''''Description
+ BIGINT round(double a) returns the rounded BIGINT value of the double
+ BIGINT floor(double a) returns the maximum BIGINT value that is equal or less than
the double
+ BIGINT ceil(double a) returns the minimum BIGINT value that is equal or greater than
the double
+ double rand(), rand(int seed) returns a random number (that changes from row to row).
Specifiying the seed will make sure the generated random number sequence is deterministic.
+ string concat(string A, string B)returns the string resulting from concatenating B
after A e.g. concat('foo', 'bar') results in 'foobar'
+ string substr(string A, int start) returns the substring of A starting from start
position till the end of string A e.g. concat('foobar', 3) results in 'bar'
+ string upper(string A)returns the string resulting from converting all characters
of A to upper case e.g. upper('fOoBaR') results in 'FOOBAR'
+ string ucase(string A) Same as upper
+ string lower(string A) returns the string resulting from converting all characters
of B to lower case e.g. lower('fOoBaR') results in 'foobar'
+ string lcase(string A) Same as lower
+ string trim(string A) returns the string resulting from trimming spaces from both
ends of A e.g. trim(' foobar ') results in 'foobar'
+ string ltrim(string A) returns the string resulting from trimming spaces from the
beginning(left hand side) of A e.g. ltrim(' foobar ') results in 'foobar '
+ string rtrim(string A) returns the string resulting from trimming spaces from the
end(right hand side) of A e.g. rtrim(' foobar ') results in ' foobar'
+ string regexp_replace(string A, string B, string C) returns the string resulting from
replacing all substrings in B that match the Java regular expression syntax(See Java regular
expressions syntax) with C e.g. regexp_replace('foobar', 'ooar', '') returns 'fb'
+ int size(Map<K.V>) returns the number of elements in the map type
+ int size(Array<T>) returns the number of elements in the array type
+ <type> cast(expr as <type>) converts the results of the expression expr
to <type> e.g. cast('1' as BIGINT) will convert the string '1' to it integral representation.
A null is returned if the conversion does not succeed.
+ string from_unixtime(int unixtime) convert the number of seconds from unix epoch (19700101
00:00:00 UTC) to a string representing the timestamp of that moment in the current system
time zone in the format of "19700101 00:00:00"
+ string to_date(string timestamp) Return the date part of a timestamp string: to_date("19700101
00:00:00") = "19700101"
+ int year(string date) Return the year part of a date or a timestamp string: year("19700101
00:00:00") = 1970, year("19700101") = 1970
+ int month(string date) Return the month part of a date or a timestamp string: month("19701101
00:00:00") = 11, month("19701101") = 11
+ int day(string date) Return the day part of a date or a timestamp string: day("19701101
00:00:00") = 1, day("19701101") = 1 
+
+ The following built in aggregate functions are supported in hive:
+ <10%>'''Return Type'''<10%>'''Name(Signature)''''''Description
+ BIGINT count(1), count(DISTINCT col [, col]...)count(1) returns the number of members
in the group, whereas the count(DISTINCT col) gets the count of distinct values of the columns
in the group
+ DOUBLE sum(col), sum(DISTINCT col) returns the sum of the elements in the group or
the sum of the distinct values of the column in the group
+ DOUBLE avg(col), avg(DISTINCT col) returns the average of the elements in the group
or the average of the distinct values of the column in the group
+ DOUBLE min(col) returns the minimum of the column in the group
+ DOUBLE max(col) returns the maximum value of the column n the group
+
+
+
+
+
+
== UDF supported by Hive ==
 UDF are built in functions in Hive. Hive supports two type of UDFs: generic built in functions,
and built in aggregate functions.
+ UDF are built in functions in Hive. Hive supports two type of UDFs: generic built in functions,
and built in aggregate (GROUP BY) functions.
 === Built in Aggregate Functions ===
+ === Generic built in Functions ===
+
+ ==== count ====
+ Returns the number of members in the group, whereas the count(DISTINCT col) gets the count
of distinct values of the columns in the group. The result is a BIGINT value. The DISTINCT
option can be used to return the average of the distinct values of col.
+ {{{
+ count(1), count(DISTINCT col [, col]...)
+ }}}
+
+
+ === Built in Aggregate (GROUP BY) Functions ===
+
+ ==== count ====
+ Returns the number of members in the group, whereas the count(DISTINCT col) gets the count
of distinct values of the columns in the group. The result is a BIGINT value. The DISTINCT
option can be used to return the average of the distinct values of col.
+ {{{
+ count(1), count(DISTINCT col [, col]...)
+ }}}
+
+ ==== sum ====
+ Returns the sum of the elements in the group or the sum of the distinct values of the column
in the group. The result is a DOUBLE value.
+ {{{
+ sum(col), sum(DISTINCT col)
+ }}}
+
+ ==== avg ====
+ Returns the average of the elements in the group or the average of the distinct values of
the column in the group. The result is a DOUBLE value.
+ {{{
+ count(1), count(DISTINCT col [, col]...)
+ }}}
+
+ ==== min ====
+ Returns the minimum of the column in the group. The result is a DOUBLE value.
+ {{{
+ min(col)
+ }}}
+
+ ==== max ====
+ Returns the maximum of the column in the group. The result is a DOUBLE value.
+ {{{
+ max(col)
+ }}}
+
