hive-dev mailing list archives

From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2279) Implement sort(array) UDF
Date Mon, 09 Jan 2012 21:45:40 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182826#comment-13182826 ]

Phabricator commented on HIVE-2279:
-----------------------------------

zhenxiao has commented on the revision "HIVE-2279 [jira] Implement sort(array) UDF".

  sort() is a better name than sort_array(), but the parser/semantic analyzer currently seems
to have a problem accepting a reserved keyword as a UDF function name.

  I tried the following changes in Hive.g:

  [~/Code/hive]git diff ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g b/ql/src/java/org/apache/hadoop/h
  index 888bf47..ec256de 100644
  --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  @@ -1816,7 +1816,7 @@ functionName
   @init { msgs.push("function name"); }
   @after { msgs.pop(); }
       : // Keyword IF is also a function name
  -    Identifier | KW_IF | KW_ARRAY | KW_MAP | KW_STRUCT | KW_UNIONTYPE
  +    Identifier | KW_IF | KW_ARRAY | KW_MAP | KW_STRUCT | KW_UNIONTYPE | KW_SORT
       ;

   castExpression
  @@ -2091,6 +2091,7 @@ sysFuncNames
       | KW_MAP
       | KW_STRUCT
       | KW_UNIONTYPE
  +    | KW_SORT
       | EQUAL
       | NOTEQUAL
       | LESSTHANOREQUALTO

  However, the test case always fails during semantic analysis with an argument-length mismatch:

  -- Evaluate function against STRING valued keys
  EXPLAIN
  SELECT sort(array("b", "d", "c", "a")) FROM src LIMIT 1
  2012-01-09 11:31:55,134 INFO  parse.ParseDriver (ParseDriver.java:parse(426)) - Parsing
command:

  -- Evaluate function against STRING valued keys
  EXPLAIN
  SELECT sort(array("b", "d", "c", "a")) FROM src LIMIT 1
  2012-01-09 11:31:55,146 INFO  parse.ParseDriver (ParseDriver.java:parse(443)) - Parse Completed
  2012-01-09 11:31:55,147 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(7445))
- Starting Semantic Analysis
  2012-01-09 11:31:55,148 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(7475))
- Completed phase 1 of Semantic Analysis
  2012-01-09 11:31:55,148 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(942))
- Get metadata for source tables
  2012-01-09 11:31:55,149 INFO  metastore.HiveMetaStore (HiveMetaStore.java:logInfo(528))
- 0: get_table : db=default tbl=src
  2012-01-09 11:31:55,200 INFO  hive.log (MetaStoreUtils.java:getDDLFromFieldSchema(457))
- DDL: struct src { string key, string value}
  2012-01-09 11:31:55,200 DEBUG lazy.LazySimpleSerDe (LazySimpleSerDe.java:initialize(195))
- org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[key, value]
columnTypes=[string, string] separator=[[B@3bb20e65] nullstring=\N lastColumnTakesRest=false
  2012-01-09 11:31:55,200 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1021))
- Get metadata for subqueries
  2012-01-09 11:31:55,201 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1035))
- Get metadata for destination tables
  2012-01-09 11:31:55,201 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(7478))
- Completed getting MetaData in Semantic Analysis
  2012-01-09 11:31:55,203 INFO  hive.log (MetaStoreUtils.java:getDDLFromFieldSchema(457))
- DDL: struct src { string key, string value}
  2012-01-09 11:31:55,203 DEBUG lazy.LazySimpleSerDe (LazySimpleSerDe.java:initialize(195))
- org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[key, value]
columnTypes=[string, string] separator=[[B@12e84396] nullstring=\N lastColumnTakesRest=false
  2012-01-09 11:31:55,222 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genTablePlan(6598))
- Created Table Plan for src org.apache.hadoop.hive.ql.exec.TableScanOperator@5e9ea579
  2012-01-09 11:31:55,223 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genSelectPlan(2117))
- tree: (TOK_SELECT (TOK_SELEXPR (TOK_FUNCTION sort (TOK_FUNCTION array "b" "d" "c" "a"))))
  2012-01-09 11:31:55,225 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genSelectPlan(2222))
- genSelectPlan: input = src{(key,key: string)(value,value: string)(block__offset__inside__file,BLOCK__OFFSET__INSIDE__FILE:
bigint)(input__file__name,INPUT__FILE__NAME: string)}
  2012-01-09 11:31:55,234 ERROR ql.Driver (SessionState.java:printError(380)) - FAILED: Error
in semantic analysis: Line 5:7 Arguments length mismatch 'sort': The function SORT(array(obj1,
obj2,...)) needs one argument.
  org.apache.hadoop.hive.ql.parse.SemanticException: Line 5:7 Arguments length mismatch 'sort':
The function SORT(array(obj1, obj2,...)) needs one argument.
      at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:810)
      at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
      at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
      at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
      at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
      at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:161)
      at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7708)
      at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2301

  The same thing happened when I tried this with format().
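
  For reference, here is a minimal, self-contained sketch of the behaviour the test case expects from a one-argument sort over an array, including a one-argument check of the same kind as the one reported in the error above. It is only an illustration: the class SortArraySketch and its evaluate() method are hypothetical names, it does not use Hive's UDF API, and it makes no claim about why the semantic analyzer reports the mismatch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; this is not Hive's UDF API.
class SortArraySketch {

    // Takes the list of call arguments, requires exactly one (the array),
    // and returns a sorted copy of that array in natural ascending order.
    static <T extends Comparable<T>> List<T> evaluate(List<List<T>> arguments) {
        if (arguments.size() != 1) {
            throw new IllegalArgumentException(
                "sort(array) needs one argument, got " + arguments.size());
        }
        List<T> sorted = new ArrayList<>(arguments.get(0)); // copy; input is left untouched
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Same values as the test query: sort(array("b", "d", "c", "a"))
        List<String> input = Arrays.asList("b", "d", "c", "a");
        System.out.println(evaluate(Collections.singletonList(input))); // prints [a, b, c, d]
    }
}
```

  In this sketch the call passes a single argument (the array itself), which is also how the logged tree nests the array call inside the sort call.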

REVISION DETAIL
  https://reviews.facebook.net/D1125

                
> Implement sort(array) UDF
> -------------------------
>
>                 Key: HIVE-2279
>                 URL: https://issues.apache.org/jira/browse/HIVE-2279
>             Project: Hive
>          Issue Type: New Feature
>          Components: UDF
>            Reporter: Carl Steinbach
>            Assignee: Zhenxiao Luo
>         Attachments: HIVE-2279.D1059.1.patch, HIVE-2279.D1101.1.patch, HIVE-2279.D1107.1.patch,
HIVE-2279.D1125.1.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
