hive-dev mailing list archives

From "chengxiang li"
Subject Review Request 25495: HIVE-7776, enable sample10.q
Date Wed, 10 Sep 2014 08:48:11 GMT

Review request for hive, Brock Noland and Xuefu Zhang.

Bugs: HIVE-7776

Repository: hive-git


Hive gets the task ID in one of two ways in Utilities::getTaskId:
1. Read the parameter value from the configuration.
2. Generate a random value if #1 returns null.
Currently, Hive on Spark cannot get the parameter value from the configuration.
FileSinkOperator uses the task ID to distinguish bucket file names. Since a FileSinkOperator
instance is referenced by only one task, it should hold the task ID in a field and
initialize it only once. Instead, FileSinkOperator calls Utilities::getTaskId each time it
needs the task ID and, because of the random fallback, gets a new value on every call. This
creates more bucket files than there are buckets, which leads to unexpected results for
tablesample queries.
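
The effect described above can be illustrated with a minimal, self-contained sketch. This is
not the Hive code: TaskIdSketch, UncachedSink, CachedSink, bucketFileName and the "_0" file
suffix are hypothetical names chosen for illustration, and getTaskId here is only a stand-in
that mimics the two-step lookup (configured value first, random value otherwise).

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

class TaskIdSketch {

    // Hypothetical stand-in for Utilities::getTaskId: use the configured
    // value when present, otherwise fall back to a fresh random value.
    static String getTaskId(String configuredValue) {
        return configuredValue != null ? configuredValue : UUID.randomUUID().toString();
    }

    // Problematic pattern: the task ID is looked up again on every call, so
    // with no configured value each call yields a different file name.
    static class UncachedSink {
        private final String configuredValue;

        UncachedSink(String configuredValue) {
            this.configuredValue = configuredValue;
        }

        String bucketFileName() {
            return getTaskId(configuredValue) + "_0"; // "_0" suffix is illustrative only
        }
    }

    // Proposed pattern: resolve the task ID once and keep it in a field, so
    // the same operator instance always produces the same file name.
    static class CachedSink {
        private final String taskId;

        CachedSink(String configuredValue) {
            this.taskId = getTaskId(configuredValue);
        }

        String bucketFileName() {
            return taskId + "_0";
        }
    }

    public static void main(String[] args) {
        UncachedSink uncached = new UncachedSink(null); // null models the Hive on Spark case
        CachedSink cached = new CachedSink(null);
        Set<String> uncachedNames = new HashSet<>();
        Set<String> cachedNames = new HashSet<>();
        for (int i = 0; i < 4; i++) {
            uncachedNames.add(uncached.bucketFileName());
            cachedNames.add(cached.bucketFileName());
        }
        System.out.println("uncached distinct file names: " + uncachedNames.size());
        System.out.println("cached distinct file names:   " + cachedNames.size());
    }
}
```

When the configured value is null, the uncached variant produces a different file name for
the same bucket on each call, while the cached variant keeps a single name per instance.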


  itests/src/test/resources/ 155abad 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ 3ff0782 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ 02f9d99 
  ql/src/test/results/clientpositive/spark/sample10.q.out PRE-CREATION 




chengxiang li
