hadoop-common-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/Tutorial" by ZhengShao
Date Tue, 08 Dec 2009 06:14:37 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/Tutorial" page has been changed by ZhengShao.
http://wiki.apache.org/hadoop/Hive/Tutorial?action=diff&rev1=16&rev2=17

--------------------------------------------------

  
      * String to Double 
  
- Explicit type conversion can be done using the cast operator as shown in the Table of '''
Built in Functions ''' section below.
+ Explicit type conversion can be done using the cast operator as shown in the Table of '''
Built-in Operators and Functions ''' section below.
  
  === Complex Types ===
  Complex Types can be built up from primitive types and other composite types using:
@@ -79, +79 @@

  
  '''The tables with columns that are an instance of a complex type can only be created programmatically
and NOT through hive command line at this time'''. We will be adding ability to add such tables
through the hive command line in the future. 
  
- == Built in operators and functions ==
+ == Built-in operators and functions ==
- === Built in operators ===
-     *Relational Operators - The following operators compare the passed operands and generate
a TRUE or FALSE value depending on whether the comparison between the operands holds or not.

-  '''Relational Operators'''
- || Operator || Operand types ||  Description ||
- || A = B || all primitive types || TRUE if expression A is equal to expression B otherwise
FALSE||
- || A == B || none! || Fails; SQL uses = and not ==!||
- || A <> B || all primitive types || TRUE if expression A is NOT equal to expression
B otherwise FALSE||
- || A < B || all primitive types || TRUE if expression A is  less than expression B otherwise
FALSE||
- || A <= B || all primitive types || TRUE if expression A is less than or equal to expression
B otherwise FALSE||
- || A > B || all primitive types || TRUE if expression A is greater than expression B
otherwise FALSE||
- || A >= B || all primitive types || TRUE if expression A is greater than or equal to
expression B otherwise FALSE||
- || A IS NULL || all types || TRUE if expression A evaluates to NULL otherwise FALSE||
- || A IS NOT NULL || all types || TRUE if expression A evaluates to NULL otherwise FALSE||
- || A LIKE B || strings || TRUE if string A matches the SQL simple regular expression B,
otherwise FALSE. The comparison is done character by character. The _ character in B matches
any character in A(similar to . in posix regular expressions) while the % character in B matches
an arbitrary number of characters in A(similar to .* in posix regular expressions) e.g. 'foobar'
like 'foo' evaluates to FALSE where as 'foobar' like {{{'foo___'}}} evaluates to TRUE and
so does 'foobar' like 'foo%'||
- || NOT A LIKE B || strings || TRUE if string A not matches the SQL simple regular expression
B, otherwise FALSE||
- || A RLIKE B || strings ||  TRUE if string A matches the Java regular expression B(See [[http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html|Java
regular expressions syntax]]), otherwise FALSE e.g. 'foobar' rlike 'foo' evaluates to FALSE
where as 'foobar' rlike '^f.*r$' evaluates to TRUE||
- || A REGEXP B || strings || Same as RLIKE ||
-  
-     *Arithmetic Operators - The following operators support various common arithmetic operations
on the operands. All of them return number types. 
  
+ Hive follows MySQL in most of the operators (e.g. +, -, *, /, LIKE, RLIKE) and functions
(e.g. substr, year, hour).  For more information on built-in operators and functions, please
take a look at [[Hive/LanguageManual/UDF Hive Language Manual - UDF]].
- ''' Arithmetic Operators '''
- || Operator || Operand types || Description||
- || A + B || all number types || Gives the result of adding A and B. The type of the result
is the same as the common parent(in the type hierarchy) of the types of the operands. e.g.
since every integer is a float, therefore float is a containing type of integer so the + operator
on a float and an int will result in a float.||
- || A - B || all number types || Gives the result of subtracting B from A. The type of the
result is the same as the common parent(in the type hierarchy) of the types of the operands.||
- || A * B || all number types || Gives the result of multiplying A and B. The type of the
result is the same as the common parent(in the type hierarchy) of the types of the operands.
Note that if the multiplication causing overflow, you will have to cast one of the operators
to a type higher in the type hierarchy.||
- || A / B || all number types || Gives the result of dividing B from A. The type of the result
is the same as the common parent(in the type hierarchy) of the types of the operands. If the
operands are integer types, then the result is the quotient of the division.||
- || A % B || all number types || Gives the reminder resulting from dividing A by B. The type
of the result is the same as the common parent(in the type hierarchy) of the types of the
operands.||
- || A & B || all number types || Gives the result of bitwise AND of A and B. The type
of the result is the same as the common parent(in the type hierarchy) of the types of the
operands.||
- || A | B || all number types || Gives the result of bitwise OR of A and B. The type of the
result is the same as the common parent(in the type hierarchy) of the types of the operands.||
- || A ^ B || all number types || Gives the result of bitwise XOR of A and B. The type of
the result is the same as the common parent(in the type hierarchy) of the types of the operands.||
- || ~A || all number types || Gives the result of bitwise NOT of A. The type of the result
is the same as the type of A.||
- 
-     * Logical Operators - The following operators provide support for creating logical expressions.
All of them return boolean TRUE or FALSE depending upon the boolean values of the operands.
-  ''' Logical Operators '''
- || A AND B || boolean || TRUE if both A and B are TRUE, otherwise FALSE||
- || A && B || boolean || Same as A AND B||
- || A OR B || boolean || TRUE if either A or B or both are TRUE, otherwise FALSE||
- || {{{ A || B }}} || boolean || Same as A OR B||
- || NOT A || boolean || TRUE if A is FALSE, otherwise FALSE||
- || !A || boolean || Same as NOT A||
- 
- 
- * Operators on Complex Types - The following operators provide mechanisms to access elements
in Complex Types
- 
- ''' Operators on Complex Types '''
- || Operator || Operand types || Description||
- || A[n] || A is an Array and n is an int || returns the nth element in the array A. The
first element has index 0 e.g. if A is an array comprising of ['foo', 'bar'] then A[0] returns
'foo' and A[1] returns 'bar'||
- || M[key] || M is a Map<K, V> and key has type K || returns the value corresponding
to the key in the map e.g. if M is a map comprising of {'f' -> 'foo', 'b' -> 'bar',
'all' -> 'foobar'} then M['all'] returns 'foobar'||
- || S.x || S is a struct || returns the x field of S e.g for struct foobar {int foo, int
bar} foobar.foo returns the integer stored in the foo field of the struct.||
- 
- === Built in functions ===
- *The following built in functions are supported in hive:
- [[http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java?view=markup|List
of functions in source code: FunctionRegistry.java]]
- 
- ''' Built in Functions '''
- || Return Type || Name(Signature) || Description||
- || BIGINT || round(double a) || returns the rounded BIGINT value of the double||
- || BIGINT || floor(double a) || returns the maximum BIGINT value that is equal or less than
the double||
- || BIGINT || ceil(double a) || returns the minimum BIGINT value that is equal or greater
than the double||
- || double || rand(), rand(int seed) || returns a random number (that changes from row to
row). Specifiying the seed will make sure the generated random number sequence is deterministic.||
- || string || concat(string A, string B)|| returns the string resulting from concatenating
B after A e.g. concat('foo', 'bar') results in 'foobar'||
- || string || substr(string A, int start) || returns the substring of A starting from start
position till the end of string A e.g. substr('foobar', 4) results in 'bar'||
- || string || substr(string A, int start, int length) || returns the substring of A starting
from start position with the given length e.g. substr('foobar', 4, 2) results in 'ba'||
- || string || upper(string A)|| returns the string resulting from converting all characters
of A to upper case e.g. upper('fOoBaR') results in 'FOOBAR'||
- || string || ucase(string A) || Same as upper||
- || string || lower(string A) || returns the string resulting from converting all characters
of B to lower case e.g. lower('fOoBaR') results in 'foobar'||
- || string || lcase(string A) || Same as lower||
- || string || trim(string A) || returns the string resulting from trimming spaces from both
ends of A e.g. trim('   foobar   ') results in 'foobar'||
- || string || ltrim(string A) || returns the string resulting from trimming spaces from the
beginning(left hand side) of A e.g. ltrim('   foobar   ') results in 'foobar   '||
- || string || rtrim(string A) || returns the string resulting from trimming spaces from the
end(right hand side) of A e.g. rtrim('   foobar   ') results in '   foobar'||
- || string || regexp_replace(string A, string B, string C) || returns the string resulting
from replacing all substrings in B that match the Java regular expression syntax(See [[http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html|Java
regular expressions syntax]]) with C e.g. regexp_replace('foobar', 'oo<nowiki>|</nowiki>ar',
'') returns 'fb'||
- || int || size(Map<K.V>) || returns the number of elements in the map type||
- || int || size(Array<T>) || returns the number of elements in the array type||
- || <type> || cast(expr as <type>) || converts the results of the expression
expr to <type> e.g. cast('1' as BIGINT) will convert the string '1' to it integral representation.
A null is returned if the conversion does not succeed.||
- || string || from_unixtime(int unixtime) || convert the number of seconds from unix epoch
(1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current
system time zone in the format of "1970-01-01 00:00:00"||
- || string || to_date(string timestamp) || Return the date part of a timestamp string: to_date("1970-01-01
00:00:00") = "1970-01-01"||
- || int || year(string date) || Return the year part of a date or a timestamp string: year("1970-01-01
00:00:00") = 1970, year("1970-01-01") = 1970||
- || int || month(string date) || Return the month part of a date or a timestamp string: month("1970-11-01
00:00:00") = 11, month("1970-11-01") = 11||
- || int || day(string date) || Return the day part of a date or a timestamp string: day("1970-11-01
00:00:00") = 1, day("1970-11-01") = 1||
- ||string ||get_json_object(string json_string, string path) ||Extract json object from a
json string based on json path specified, and return json string of the extracted json object.
It will return null if the input json string is invalid||
- 
- *The following built in aggregate functions are supported in hive:
- 
-  ''' Built in Aggregate Functions '''
- || Return Type || Name(Signature) || Description||
- || BIGINT || count(1), count(DISTINCT col [, col]...)|| count(1) returns the number of members
in the group, whereas the count(DISTINCT col) gets the count of distinct values of the columns
in the group||
- || DOUBLE || sum(col), sum(DISTINCT col) || returns the sum of the elements in the group
or the sum of the distinct values of the column in the group||
- || DOUBLE || avg(col), avg(DISTINCT col) || returns the average of the elements in the group
or the average of the distinct values of the column in the group||
- || DOUBLE || min(col) || returns the minimum value of the column in the group||
- || DOUBLE || max(col) || returns the maximum value of the column in the group||
  
  == Language capabilities ==
  Hive query language provides the basic SQL like operations. These operations work on tables
or partitions. These operations are:

Mime
View raw message