hive-commits mailing list archives

From mmccl...@apache.org
Subject hive git commit: HIVE-12435 SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled. (Matt McCline, reviewed by Prasanth J)
Date Wed, 16 Dec 2015 09:19:38 GMT
Repository: hive
Updated Branches:
  refs/heads/branch-1 e2c8bfa12 -> 26728a8a3


HIVE-12435 SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled. (Matt McCline, reviewed by Prasanth J)
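For context, the SQL semantics at issue: COUNT(expr) counts only the rows
where expr evaluates to non-NULL, so a CASE WHEN with no ELSE that matches
no row must contribute nothing to the count. A minimal plain-Java sketch of
that rule (illustrative only; this is not Hive code, and not the contents of
the vector_when_case_null.q test added by this commit):

    import java.util.Objects;
    import java.util.stream.Stream;

    public class CountCaseWhenSemantics {
        public static void main(String[] args) {
            // Model one GROUP BY group where CASE WHEN ... THEN 1 END
            // falls through to NULL on every row (no ELSE branch).
            Integer[] caseResults = {null, null, null};
            long count = Stream.of(caseResults).filter(Objects::nonNull).count();
            // COUNT over all-NULL inputs is 0; the bug reported in
            // HIVE-12435 was the vectorized ORC path returning 1 instead.
            System.out.println(count); // prints 0
        }
    }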


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/26728a8a
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/26728a8a
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/26728a8a

Branch: refs/heads/branch-1
Commit: 26728a8a32f8259753e953d9c1a801d949aff5e3
Parents: e2c8bfa
Author: Matt McCline <mmccline@hortonworks.com>
Authored: Mon Dec 14 14:12:48 2015 -0800
Committer: Matt McCline <mmccline@hortonworks.com>
Committed: Wed Dec 16 01:19:27 2015 -0800

----------------------------------------------------------------------
 .../test/resources/testconfiguration.properties |    1 +
 .../resources/testconfiguration.properties.orig | 1190 ++++++++++++++++++
 .../ql/exec/vector/VectorizedBatchUtil.java     |   13 +-
 .../exec/vector/VectorizedBatchUtil.java.orig   |  707 +++++++++++
 .../ql/exec/vector/udf/VectorUDFArgDesc.java    |   12 +
 .../clientpositive/vector_when_case_null.q      |   14 +
 .../tez/vector_select_null2.q.out               |   95 ++
 .../tez/vector_when_case_null.q.out             |   96 ++
 .../clientpositive/vector_when_case_null.q.out  |   89 ++
 9 files changed, 2215 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/itests/src/test/resources/testconfiguration.properties
----------------------------------------------------------------------
diff --git a/itests/src/test/resources/testconfiguration.properties b/itests/src/test/resources/testconfiguration.properties
index 03b07ce..1c8a80d 100644
--- a/itests/src/test/resources/testconfiguration.properties
+++ b/itests/src/test/resources/testconfiguration.properties
@@ -267,6 +267,7 @@ minitez.query.files.shared=acid_globallimit.q,\
   vector_varchar_4.q,\
   vector_varchar_mapjoin1.q,\
   vector_varchar_simple.q,\
+  vector_when_case_null.q,\
   vectorization_0.q,\
   vectorization_1.q,\
   vectorization_10.q,\

http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/itests/src/test/resources/testconfiguration.properties.orig
----------------------------------------------------------------------
diff --git a/itests/src/test/resources/testconfiguration.properties.orig b/itests/src/test/resources/testconfiguration.properties.orig
new file mode 100644
index 0000000..03b07ce
--- /dev/null
+++ b/itests/src/test/resources/testconfiguration.properties.orig
@@ -0,0 +1,1190 @@
+# NOTE: files should be listed in alphabetical order
+minimr.query.files=auto_sortmerge_join_16.q,\
+  bucket4.q,\
+  bucket5.q,\
+  bucket6.q,\
+  bucket_many.q,\
+  bucket_num_reducers.q,\
+  bucket_num_reducers2.q,\
+  bucketizedhiveinputformat.q,\
+  bucketmapjoin6.q,\
+  bucketmapjoin7.q,\
+  constprog_partitioner.q,\
+  disable_merge_for_bucketing.q,\
+  empty_dir_in_table.q,\
+  exchgpartition2lel.q,\
+  external_table_with_space_in_location_path.q,\
+  file_with_header_footer.q,\
+  groupby2.q,\
+  import_exported_table.q,\
+  index_bitmap3.q,\
+  index_bitmap_auto.q,\
+  infer_bucket_sort_bucketed_table.q,\
+  infer_bucket_sort_dyn_part.q,\
+  infer_bucket_sort_map_operators.q,\
+  infer_bucket_sort_merge.q,\
+  infer_bucket_sort_num_buckets.q,\
+  infer_bucket_sort_reducers_power_two.q,\
+  input16_cc.q,\
+  insert_dir_distcp.q,\
+  join1.q,\
+  join_acid_non_acid.q,\
+  leftsemijoin_mr.q,\
+  list_bucket_dml_10.q,\
+  load_fs2.q,\
+  load_hdfs_file_with_space_in_the_name.q,\
+  non_native_window_udf.q, \
+  orc_merge_diff_fs.q,\
+  optrstat_groupby.q,\
+  parallel_orderby.q,\
+  ql_rewrite_gbtoidx.q,\
+  ql_rewrite_gbtoidx_cbo_1.q,\
+  ql_rewrite_gbtoidx_cbo_2.q,\
+  quotedid_smb.q,\
+  reduce_deduplicate.q,\
+  remote_script.q,\
+  root_dir_external_table.q,\
+  schemeAuthority.q,\
+  schemeAuthority2.q,\
+  scriptfile1.q,\
+  scriptfile1_win.q,\
+  skewjoin_onesideskew.q,\
+  smb_mapjoin_8.q,\
+  stats_counter.q,\
+  stats_counter_partitioned.q,\
+  table_nonprintable.q,\
+  temp_table_external.q,\
+  truncate_column_buckets.q,\
+  uber_reduce.q,\
+  udf_using.q
+
+minitez.query.files.shared=acid_globallimit.q,\
+  alter_merge_2_orc.q,\
+  alter_merge_orc.q,\
+  alter_merge_stats_orc.q,\
+  auto_join0.q,\
+  auto_join1.q,\
+  bucket2.q,\
+  bucket3.q,\
+  bucket4.q,\
+  cbo_gby.q,\
+  cbo_gby_empty.q,\
+  cbo_join.q,\
+  cbo_limit.q,\
+  cbo_semijoin.q,\
+  cbo_simple_select.q,\
+  cbo_stats.q,\
+  cbo_subq_exists.q,\
+  cbo_subq_in.q,\
+  cbo_subq_not_in.q,\
+  cbo_udf_udaf.q,\
+  cbo_union.q,\
+  cbo_views.q,\
+  cbo_windowing.q,\
+  correlationoptimizer1.q,\
+  count.q,\
+  create_merge_compressed.q,\
+  cross_join.q,\
+  cross_product_check_1.q,\
+  cross_product_check_2.q,\
+  ctas.q,\
+  custom_input_output_format.q,\
+  delete_all_non_partitioned.q,\
+  delete_all_partitioned.q,\
+  delete_orig_table.q,\
+  delete_tmp_table.q,\
+  delete_where_no_match.q,\
+  delete_where_non_partitioned.q,\
+  delete_where_partitioned.q,\
+  delete_whole_partition.q,\
+  disable_merge_for_bucketing.q,\
+  dynpart_sort_opt_vectorization.q,\
+  dynpart_sort_optimization.q,\
+  dynpart_sort_optimization2.q,\
+  enforce_order.q,\
+  filter_join_breaktask.q,\
+  filter_join_breaktask2.q,\
+  groupby1.q,\
+  groupby2.q,\
+  groupby3.q,\
+  having.q,\
+  identity_project_remove_skip.q\
+  insert1.q,\
+  insert_into1.q,\
+  insert_into2.q,\
+  insert_orig_table.q,\
+  insert_values_dynamic_partitioned.q,\
+  insert_values_non_partitioned.q,\
+  insert_values_orig_table.q\
+  insert_values_partitioned.q,\
+  insert_values_tmp_table.q,\
+  insert_update_delete.q,\
+  join0.q,\
+  join1.q,\
+  join_nullsafe.q,\
+  leftsemijoin.q,\
+  limit_pushdown.q,\
+  load_dyn_part1.q,\
+  load_dyn_part2.q,\
+  load_dyn_part3.q,\
+  mapjoin_mapjoin.q,\
+  mapreduce1.q,\
+  mapreduce2.q,\
+  merge1.q,\
+  merge2.q,\
+  mergejoin.q,\
+  metadataonly1.q,\
+  metadata_only_queries.q,\
+  optimize_nullscan.q,\
+  orc_analyze.q,\
+  orc_merge1.q,\
+  orc_merge2.q,\
+  orc_merge3.q,\
+  orc_merge4.q,\
+  orc_merge5.q,\
+  orc_merge6.q,\
+  orc_merge7.q,\
+  orc_merge8.q,\
+  orc_merge9.q,\
+  orc_merge10.q,\
+  orc_merge11.q,\
+  orc_merge_incompat1.q,\
+  orc_merge_incompat2.q,\
+  orc_vectorization_ppd.q,\
+  parallel.q,\
+  ptf.q,\
+  ptf_matchpath.q,\
+  ptf_streaming.q,\
+  sample1.q,\
+  selectDistinctStar.q,\
+  script_env_var1.q,\
+  script_env_var2.q,\
+  script_pipe.q,\
+  scriptfile1.q,\
+  select_dummy_source.q,\
+  skewjoin.q,\
+  stats_counter.q,\
+  stats_counter_partitioned.q,\
+  stats_noscan_1.q,\
+  stats_only_null.q,\
+  subquery_exists.q,\
+  subquery_in.q,\
+  temp_table.q,\
+  transform1.q,\
+  transform2.q,\
+  transform_ppr1.q,\
+  transform_ppr2.q,\
+  union2.q,\
+  union3.q,\
+  union4.q,\
+  union5.q,\
+  union6.q,\
+  union7.q,\
+  union8.q,\
+  union9.q,\
+  unionDistinct_1.q,\
+  unionDistinct_2.q,\
+  union_fast_stats.q,\
+  update_after_multiple_inserts.q,\
+  update_all_non_partitioned.q,\
+  update_all_partitioned.q,\
+  update_all_types.q,\
+  update_orig_table.q,\
+  update_tmp_table.q,\
+  update_where_no_match.q,\
+  update_where_non_partitioned.q,\
+  update_where_partitioned.q,\
+  update_two_cols.q,\
+  vector_acid3.q,\
+  vector_aggregate_9.q,\
+  vector_auto_smb_mapjoin_14.q,\
+  vector_between_in.q,\
+  vector_between_columns.q,\
+  vector_binary_join_groupby.q,\
+  vector_bucket.q,\
+  vector_char_cast.q,\
+  vector_cast_constant.q,\
+  vector_char_2.q,\
+  vector_char_4.q,\
+  vector_char_mapjoin1.q,\
+  vector_char_simple.q,\
+  vector_coalesce.q,\
+  vector_coalesce_2.q,\
+  vector_count_distinct.q,\
+  vector_data_types.q,\
+  vector_date_1.q,\
+  vector_decimal_1.q,\
+  vector_decimal_10_0.q,\
+  vector_decimal_2.q,\
+  vector_decimal_3.q,\
+  vector_decimal_4.q,\
+  vector_decimal_5.q,\
+  vector_decimal_6.q,\
+  vector_decimal_aggregate.q,\
+  vector_decimal_cast.q,\
+  vector_decimal_expressions.q,\
+  vector_decimal_mapjoin.q,\
+  vector_decimal_math_funcs.q,\
+  vector_decimal_precision.q,\
+  vector_decimal_round.q,\
+  vector_decimal_round_2.q,\
+  vector_decimal_trailing.q,\
+  vector_decimal_udf.q,\
+  vector_decimal_udf2.q,\
+  vector_distinct_2.q,\
+  vector_elt.q,\
+  vector_groupby_3.q,\
+  vector_groupby_reduce.q,\
+  vector_grouping_sets.q,\
+  vector_if_expr.q,\
+  vector_inner_join.q,\
+  vector_interval_1.q,\
+  vector_interval_2.q,\
+  vector_join30.q,\
+  vector_join_filters.q,\
+  vector_join_nulls.q,\
+  vector_left_outer_join.q,\
+  vector_left_outer_join2.q,\
+  vector_leftsemi_mapjoin.q,\
+  vector_mapjoin_reduce.q,\
+  vector_mr_diff_schema_alias.q,\
+  vector_multi_insert.q,\
+  vector_non_string_partition.q,\
+  vector_nullsafe_join.q,\
+  vector_null_projection.q,\
+  vector_orderby_5.q,\
+  vector_outer_join0.q,\
+  vector_outer_join1.q,\
+  vector_outer_join2.q,\
+  vector_outer_join3.q,\
+  vector_outer_join4.q,\
+  vector_outer_join5.q,\
+  vector_outer_join6.q,\
+  vector_partition_diff_num_cols.q,\
+  vector_partitioned_date_time.q,\
+  vector_reduce_groupby_decimal.q,\
+  vector_string_concat.q,\
+  vector_varchar_4.q,\
+  vector_varchar_mapjoin1.q,\
+  vector_varchar_simple.q,\
+  vectorization_0.q,\
+  vectorization_1.q,\
+  vectorization_10.q,\
+  vectorization_11.q,\
+  vectorization_12.q,\
+  vectorization_13.q,\
+  vectorization_14.q,\
+  vectorization_15.q,\
+  vectorization_16.q,\
+  vectorization_17.q,\
+  vectorization_2.q,\
+  vectorization_3.q,\
+  vectorization_4.q,\
+  vectorization_5.q,\
+  vectorization_6.q,\
+  vectorization_7.q,\
+  vectorization_8.q,\
+  vectorization_9.q,\
+  vectorization_decimal_date.q,\
+  vectorization_div0.q,\
+  vectorization_limit.q,\
+  vectorization_nested_udf.q,\
+  vectorization_not.q,\
+  vectorization_part.q,\
+  vectorization_part_project.q,\
+  vectorization_pushdown.q,\
+  vectorization_short_regress.q,\
+  vectorized_bucketmapjoin1.q,\
+  vectorized_case.q,\
+  vectorized_casts.q,\
+  vectorized_context.q,\
+  vectorized_date_funcs.q,\
+  vectorized_distinct_gby.q,\
+  vectorized_mapjoin.q,\
+  vectorized_math_funcs.q,\
+  vectorized_nested_mapjoin.q,\
+  vectorized_parquet.q,\
+  vectorized_ptf.q,\
+  vectorized_rcfile_columnar.q,\
+  vectorized_shufflejoin.q,\
+  vectorized_string_funcs.q,\
+  vectorized_timestamp_funcs.q,\
+  auto_sortmerge_join_1.q,\
+  auto_sortmerge_join_10.q,\
+  auto_sortmerge_join_11.q,\
+  auto_sortmerge_join_12.q,\
+  auto_sortmerge_join_13.q,\
+  auto_sortmerge_join_14.q,\
+  auto_sortmerge_join_15.q,\
+  auto_sortmerge_join_16.q,\
+  auto_sortmerge_join_2.q,\
+  auto_sortmerge_join_3.q,\
+  auto_sortmerge_join_4.q,\
+  auto_sortmerge_join_5.q,\
+  auto_sortmerge_join_7.q,\
+  auto_sortmerge_join_8.q,\
+  auto_sortmerge_join_9.q,\
+  auto_join30.q,\
+  auto_join21.q,\
+  auto_join29.q,\
+  auto_join_filters.q
+
+
+minitez.query.files=bucket_map_join_tez1.q,\
+  bucket_map_join_tez2.q,\
+  dynamic_partition_pruning.q,\
+  dynamic_partition_pruning_2.q,\
+  explainuser_1.q,\
+  explainuser_2.q,\
+  hybridgrace_hashjoin_1.q,\
+  hybridgrace_hashjoin_2.q,\
+  mapjoin_decimal.q,\
+  lvj_mapjoin.q, \
+  mergejoin_3way.q,\
+  mrr.q,\
+  orc_ppd_basic.q,\
+  orc_merge_diff_fs.q,\
+  tez_bmj_schema_evolution.q,\
+  tez_dml.q,\
+  tez_fsstat.q,\
+  tez_insert_overwrite_local_directory_1.q,\
+  tez_dynpart_hashjoin_1.q,\
+  tez_dynpart_hashjoin_2.q,\
+  tez_vector_dynpart_hashjoin_1.q,\
+  tez_vector_dynpart_hashjoin_2.q,\
+  tez_join_hash.q,\
+  tez_join_result_complex.q,\
+  tez_join_tests.q,\
+  tez_joins_explain.q,\
+  tez_schema_evolution.q,\
+  tez_self_join.q,\
+  tez_union.q,\
+  tez_union2.q,\
+  tez_union_dynamic_partition.q,\
+  tez_union_view.q,\
+  tez_union_decimal.q,\
+  tez_union_group_by.q,\
+  tez_union_with_udf.q,\
+  tez_smb_main.q,\
+  tez_smb_1.q,\
+  tez_smb_empty.q,\
+  vector_join_part_col_char.q,\
+  vectorized_dynamic_partition_pruning.q,\
+  tez_multi_union.q,\
+  tez_join.q,\
+  tez_union_multiinsert.q
+
+encrypted.query.files=encryption_join_unencrypted_tbl.q,\
+  encryption_insert_partition_static.q,\
+  encryption_insert_partition_dynamic.q,\
+  encryption_join_with_different_encryption_keys.q,\
+  encryption_select_read_only_encrypted_tbl.q,\
+  encryption_select_read_only_unencrypted_tbl.q,\
+  encryption_load_data_to_encrypted_tables.q, \
+  encryption_unencrypted_nonhdfs_external_tables.q \
+  encryption_move_tbl.q \
+  encryption_drop_table.q \
+  encryption_insert_values.q \
+  encryption_drop_view.q \
+  encryption_drop_partition.q \
+  encryption_with_trash.q
+
+beeline.positive.exclude=add_part_exist.q,\
+  alter1.q,\
+  alter2.q,\
+  alter4.q,\
+  alter5.q,\
+  alter_rename_partition.q,\
+  alter_rename_partition_authorization.q,\
+  archive.q,\
+  archive_corrupt.q,\
+  archive_mr_1806.q,\
+  archive_multi.q,\
+  archive_multi_mr_1806.q,\
+  authorization_1.q,\
+  authorization_2.q,\
+  authorization_4.q,\
+  authorization_5.q,\
+  authorization_6.q,\
+  authorization_7.q,\
+  ba_table1.q,\
+  ba_table2.q,\
+  ba_table3.q,\
+  ba_table_udfs.q,\
+  binary_table_bincolserde.q,\
+  binary_table_colserde.q,\
+  cluster.q,\
+  columnarserde_create_shortcut.q,\
+  combine2.q,\
+  constant_prop.q,\
+  create_nested_type.q,\
+  create_or_replace_view.q,\
+  create_struct_table.q,\
+  create_union_table.q,\
+  database.q,\
+  database_location.q,\
+  database_properties.q,\
+  ddltime.q,\
+  describe_database_json.q,\
+  drop_database_removes_partition_dirs.q,\
+  escape1.q,\
+  escape2.q,\
+  exim_00_nonpart_empty.q,\
+  exim_01_nonpart.q,\
+  exim_02_00_part_empty.q,\
+  exim_02_part.q,\
+  exim_03_nonpart_over_compat.q,\
+  exim_04_all_part.q,\
+  exim_04_evolved_parts.q,\
+  exim_05_some_part.q,\
+  exim_06_one_part.q,\
+  exim_07_all_part_over_nonoverlap.q,\
+  exim_08_nonpart_rename.q,\
+  exim_09_part_spec_nonoverlap.q,\
+  exim_10_external_managed.q,\
+  exim_11_managed_external.q,\
+  exim_12_external_location.q,\
+  exim_13_managed_location.q,\
+  exim_14_managed_location_over_existing.q,\
+  exim_15_external_part.q,\
+  exim_16_part_external.q,\
+  exim_17_part_managed.q,\
+  exim_18_part_external.q,\
+  exim_19_00_part_external_location.q,\
+  exim_19_part_external_location.q,\
+  exim_20_part_managed_location.q,\
+  exim_21_export_authsuccess.q,\
+  exim_22_import_exist_authsuccess.q,\
+  exim_23_import_part_authsuccess.q,\
+  exim_24_import_nonexist_authsuccess.q,\
+  global_limit.q,\
+  groupby_complex_types.q,\
+  groupby_complex_types_multi_single_reducer.q,\
+  index_auth.q,\
+  index_auto.q,\
+  index_auto_empty.q,\
+  index_bitmap.q,\
+  index_bitmap1.q,\
+  index_bitmap2.q,\
+  index_bitmap3.q,\
+  index_bitmap_auto.q,\
+  index_bitmap_rc.q,\
+  index_compact.q,\
+  index_compact_1.q,\
+  index_compact_2.q,\
+  index_compact_3.q,\
+  index_stale_partitioned.q,\
+  init_file.q,\
+  input16.q,\
+  input16_cc.q,\
+  input46.q,\
+  input_columnarserde.q,\
+  input_dynamicserde.q,\
+  input_lazyserde.q,\
+  input_testxpath3.q,\
+  input_testxpath4.q,\
+  insert2_overwrite_partitions.q,\
+  insertexternal1.q,\
+  join_thrift.q,\
+  lateral_view.q,\
+  load_binary_data.q,\
+  load_exist_part_authsuccess.q,\
+  load_nonpart_authsuccess.q,\
+  load_part_authsuccess.q,\
+  loadpart_err.q,\
+  lock1.q,\
+  lock2.q,\
+  lock3.q,\
+  lock4.q,\
+  merge_dynamic_partition.q,\
+  multi_insert.q,\
+  multi_insert_move_tasks_share_dependencies.q,\
+  null_column.q,\
+  ppd_clusterby.q,\
+  query_with_semi.q,\
+  rename_column.q,\
+  sample6.q,\
+  sample_islocalmode_hook.q,\
+  set_processor_namespaces.q,\
+  show_tables.q,\
+  source.q,\
+  split_sample.q,\
+  str_to_map.q,\
+  transform1.q,\
+  udaf_collect_set.q,\
+  udaf_context_ngrams.q,\
+  udaf_histogram_numeric.q,\
+  udaf_ngrams.q,\
+  udaf_percentile_approx.q,\
+  udf_array.q,\
+  udf_bitmap_and.q,\
+  udf_bitmap_or.q,\
+  udf_explode.q,\
+  udf_format_number.q,\
+  udf_map.q,\
+  udf_map_keys.q,\
+  udf_map_values.q,\
+  udf_max.q,\
+  udf_min.q,\
+  udf_named_struct.q,\
+  udf_percentile.q,\
+  udf_printf.q,\
+  udf_sentences.q,\
+  udf_sort_array.q,\
+  udf_split.q,\
+  udf_struct.q,\
+  udf_substr.q,\
+  udf_translate.q,\
+  udf_union.q,\
+  udf_xpath.q,\
+  udtf_stack.q,\
+  view.q,\
+  virtual_column.q
+
+minimr.query.negative.files=cluster_tasklog_retrieval.q,\
+  file_with_header_footer_negative.q,\
+  local_mapred_error_cache.q,\
+  mapreduce_stack_trace.q,\
+  mapreduce_stack_trace_hadoop20.q,\
+  mapreduce_stack_trace_turnoff.q,\
+  mapreduce_stack_trace_turnoff_hadoop20.q,\
+  minimr_broken_pipe.q,\
+  table_nonprintable_negative.q,\
+  udf_local_resource.q
+
+# tests are sorted use: perl -pe 's@\\\s*\n@ @g' testconfiguration.properties \
+# | awk -F= '/spark.query.files/{print $2}' | perl -pe 's@.q *, *@\n@g' \
+# | egrep -v '^ *$' |  sort -V | uniq | perl -pe 's@\n@.q, \\\n@g' | perl -pe 's@^@  @g'
+spark.query.files=add_part_multiple.q, \
+  alter_merge_orc.q, \
+  alter_merge_stats_orc.q, \
+  annotate_stats_join.q, \
+  auto_join0.q, \
+  auto_join1.q, \
+  auto_join10.q, \
+  auto_join11.q, \
+  auto_join12.q, \
+  auto_join13.q, \
+  auto_join14.q, \
+  auto_join15.q, \
+  auto_join16.q, \
+  auto_join17.q, \
+  auto_join18.q, \
+  auto_join18_multi_distinct.q, \
+  auto_join19.q, \
+  auto_join2.q, \
+  auto_join20.q, \
+  auto_join21.q, \
+  auto_join22.q, \
+  auto_join23.q, \
+  auto_join24.q, \
+  auto_join26.q, \
+  auto_join27.q, \
+  auto_join28.q, \
+  auto_join29.q, \
+  auto_join3.q, \
+  auto_join30.q, \
+  auto_join31.q, \
+  auto_join32.q, \
+  auto_join4.q, \
+  auto_join5.q, \
+  auto_join6.q, \
+  auto_join7.q, \
+  auto_join8.q, \
+  auto_join9.q, \
+  auto_join_filters.q, \
+  auto_join_nulls.q, \
+  auto_join_reordering_values.q, \
+  auto_join_stats.q, \
+  auto_join_stats2.q, \
+  auto_join_without_localtask.q, \
+  auto_smb_mapjoin_14.q, \
+  auto_sortmerge_join_1.q, \
+  auto_sortmerge_join_10.q, \
+  auto_sortmerge_join_12.q, \
+  auto_sortmerge_join_13.q, \
+  auto_sortmerge_join_14.q, \
+  auto_sortmerge_join_15.q, \
+  auto_sortmerge_join_16.q, \
+  auto_sortmerge_join_2.q, \
+  auto_sortmerge_join_3.q, \
+  auto_sortmerge_join_4.q, \
+  auto_sortmerge_join_5.q, \
+  auto_sortmerge_join_6.q, \
+  auto_sortmerge_join_7.q, \
+  auto_sortmerge_join_8.q, \
+  auto_sortmerge_join_9.q, \
+  avro_compression_enabled_native.q, \
+  avro_decimal_native.q, \
+  avro_joins.q, \
+  avro_joins_native.q, \
+  bucket2.q, \
+  bucket3.q, \
+  bucket4.q, \
+  bucket_map_join_1.q, \
+  bucket_map_join_2.q, \
+  bucket_map_join_spark1.q, \
+  bucket_map_join_spark2.q, \
+  bucket_map_join_spark3.q, \
+  bucket_map_join_spark4.q, \
+  bucket_map_join_tez1.q, \
+  bucket_map_join_tez2.q, \
+  bucketmapjoin1.q, \
+  bucketmapjoin10.q, \
+  bucketmapjoin11.q, \
+  bucketmapjoin12.q, \
+  bucketmapjoin13.q, \
+  bucketmapjoin2.q, \
+  bucketmapjoin3.q, \
+  bucketmapjoin4.q, \
+  bucketmapjoin5.q, \
+  bucketmapjoin7.q, \
+  bucketmapjoin8.q, \
+  bucketmapjoin9.q, \
+  bucketmapjoin_negative.q, \
+  bucketmapjoin_negative2.q, \
+  bucketmapjoin_negative3.q, \
+  bucketsortoptimize_insert_2.q, \
+  bucketsortoptimize_insert_4.q, \
+  bucketsortoptimize_insert_6.q, \
+  bucketsortoptimize_insert_7.q, \
+  bucketsortoptimize_insert_8.q, \
+  cbo_gby.q, \
+  cbo_gby_empty.q, \
+  cbo_limit.q, \
+  cbo_semijoin.q, \
+  cbo_simple_select.q, \
+  cbo_stats.q, \
+  cbo_subq_in.q, \
+  cbo_subq_not_in.q, \
+  cbo_udf_udaf.q, \
+  cbo_union.q, \
+  column_access_stats.q, \
+  count.q, \
+  create_merge_compressed.q, \
+  cross_join.q, \
+  cross_product_check_1.q, \
+  cross_product_check_2.q, \
+  ctas.q, \
+  custom_input_output_format.q, \
+  date_join1.q, \
+  date_udf.q, \
+  decimal_1_1.q, \
+  decimal_join.q, \
+  disable_merge_for_bucketing.q, \
+  dynamic_rdd_cache.q, \
+  enforce_order.q, \
+  escape_clusterby1.q, \
+  escape_distributeby1.q, \
+  escape_orderby1.q, \
+  escape_sortby1.q, \
+  filter_join_breaktask.q, \
+  filter_join_breaktask2.q, \
+  groupby1.q, \
+  groupby10.q, \
+  groupby11.q, \
+  groupby2.q, \
+  groupby3.q, \
+  groupby3_map.q, \
+  groupby3_map_multi_distinct.q, \
+  groupby3_map_skew.q, \
+  groupby3_noskew.q, \
+  groupby3_noskew_multi_distinct.q, \
+  groupby4.q, \
+  groupby7.q, \
+  groupby7_map.q, \
+  groupby7_map_multi_single_reducer.q, \
+  groupby7_map_skew.q, \
+  groupby7_noskew.q, \
+  groupby7_noskew_multi_single_reducer.q, \
+  groupby8.q, \
+  groupby8_map.q, \
+  groupby8_map_skew.q, \
+  groupby8_noskew.q, \
+  groupby9.q, \
+  groupby_bigdata.q, \
+  groupby_complex_types.q, \
+  groupby_complex_types_multi_single_reducer.q, \
+  groupby_cube1.q, \
+  groupby_map_ppr.q, \
+  groupby_map_ppr_multi_distinct.q, \
+  groupby_multi_insert_common_distinct.q, \
+  groupby_multi_single_reducer.q, \
+  groupby_multi_single_reducer2.q, \
+  groupby_multi_single_reducer3.q, \
+  groupby_position.q, \
+  groupby_ppr.q, \
+  groupby_rollup1.q, \
+  groupby_sort_1_23.q, \
+  groupby_sort_skew_1_23.q, \
+  having.q, \
+  identity_project_remove_skip.q, \
+  index_auto_self_join.q, \
+  innerjoin.q, \
+  input12.q, \
+  input13.q, \
+  input14.q, \
+  input17.q, \
+  input18.q, \
+  input1_limit.q, \
+  input_part2.q, \
+  insert1.q, \
+  insert_into1.q, \
+  insert_into2.q, \
+  insert_into3.q, \
+  join0.q, \
+  join1.q, \
+  join10.q, \
+  join11.q, \
+  join12.q, \
+  join13.q, \
+  join14.q, \
+  join15.q, \
+  join16.q, \
+  join17.q, \
+  join18.q, \
+  join18_multi_distinct.q, \
+  join19.q, \
+  join2.q, \
+  join20.q, \
+  join21.q, \
+  join22.q, \
+  join23.q, \
+  join24.q, \
+  join25.q, \
+  join26.q, \
+  join27.q, \
+  join28.q, \
+  join29.q, \
+  join3.q, \
+  join30.q, \
+  join31.q, \
+  join32.q, \
+  join32_lessSize.q, \
+  join33.q, \
+  join34.q, \
+  join35.q, \
+  join36.q, \
+  join37.q, \
+  join38.q, \
+  join39.q, \
+  join4.q, \
+  join40.q, \
+  join41.q, \
+  join5.q, \
+  join6.q, \
+  join7.q, \
+  join8.q, \
+  join9.q, \
+  join_1to1.q, \
+  join_alt_syntax.q, \
+  join_array.q, \
+  join_casesensitive.q, \
+  join_cond_pushdown_1.q, \
+  join_cond_pushdown_2.q, \
+  join_cond_pushdown_3.q, \
+  join_cond_pushdown_4.q, \
+  join_cond_pushdown_unqual1.q, \
+  join_cond_pushdown_unqual2.q, \
+  join_cond_pushdown_unqual3.q, \
+  join_cond_pushdown_unqual4.q, \
+  join_empty.q, \
+  join_filters_overlap.q, \
+  join_hive_626.q, \
+  join_literals.q, \
+  join_map_ppr.q, \
+  join_merge_multi_expressions.q, \
+  join_merging.q, \
+  join_nullsafe.q, \
+  join_rc.q, \
+  join_reorder.q, \
+  join_reorder2.q, \
+  join_reorder3.q, \
+  join_reorder4.q, \
+  join_star.q, \
+  join_thrift.q, \
+  join_vc.q, \
+  join_view.q, \
+  lateral_view_explode2.q, \
+  leftsemijoin.q, \
+  leftsemijoin_mr.q, \
+  limit_partition_metadataonly.q, \
+  limit_pushdown.q, \
+  list_bucket_dml_2.q, \
+  load_dyn_part1.q, \
+  load_dyn_part10.q, \
+  load_dyn_part11.q, \
+  load_dyn_part12.q, \
+  load_dyn_part13.q, \
+  load_dyn_part14.q, \
+  load_dyn_part15.q, \
+  load_dyn_part2.q, \
+  load_dyn_part3.q, \
+  load_dyn_part4.q, \
+  load_dyn_part5.q, \
+  load_dyn_part6.q, \
+  load_dyn_part7.q, \
+  load_dyn_part8.q, \
+  load_dyn_part9.q, \
+  louter_join_ppr.q, \
+  mapjoin1.q, \
+  mapjoin_addjar.q, \
+  mapjoin_decimal.q, \
+  mapjoin_distinct.q, \
+  mapjoin_filter_on_outerjoin.q, \
+  mapjoin_mapjoin.q, \
+  mapjoin_memcheck.q, \
+  mapjoin_subquery.q, \
+  mapjoin_subquery2.q, \
+  mapjoin_test_outer.q, \
+  mapreduce1.q, \
+  mapreduce2.q, \
+  merge1.q, \
+  merge2.q, \
+  mergejoins.q, \
+  mergejoins_mixed.q, \
+  metadata_only_queries.q, \
+  metadata_only_queries_with_filters.q, \
+  multi_insert.q, \
+  multi_insert_gby.q, \
+  multi_insert_gby2.q, \
+  multi_insert_gby3.q, \
+  multi_insert_lateral_view.q, \
+  multi_insert_mixed.q, \
+  multi_insert_move_tasks_share_dependencies.q, \
+  multi_join_union.q, \
+  multi_join_union_src.q, \
+  multigroupby_singlemr.q, \
+  optimize_nullscan.q, \
+  order.q, \
+  order2.q, \
+  outer_join_ppr.q, \
+  parallel.q, \
+  parallel_join0.q, \
+  parallel_join1.q, \
+  parquet_join.q, \
+  pcr.q, \
+  ppd_gby_join.q, \
+  ppd_join.q, \
+  ppd_join2.q, \
+  ppd_join3.q, \
+  ppd_join4.q, \
+  ppd_join5.q, \
+  ppd_join_filter.q, \
+  ppd_multi_insert.q, \
+  ppd_outer_join1.q, \
+  ppd_outer_join2.q, \
+  ppd_outer_join3.q, \
+  ppd_outer_join4.q, \
+  ppd_outer_join5.q, \
+  ppd_transform.q, \
+  ptf.q, \
+  ptf_decimal.q, \
+  ptf_general_queries.q, \
+  ptf_matchpath.q, \
+  ptf_rcfile.q, \
+  ptf_register_tblfn.q, \
+  ptf_seqfile.q, \
+  ptf_streaming.q, \
+  rcfile_bigdata.q, \
+  reduce_deduplicate_exclude_join.q, \
+  router_join_ppr.q, \
+  runtime_skewjoin_mapjoin_spark.q, \
+  sample1.q, \
+  sample10.q, \
+  sample2.q, \
+  sample3.q, \
+  sample4.q, \
+  sample5.q, \
+  sample6.q, \
+  sample7.q, \
+  sample8.q, \
+  sample9.q, \
+  script_env_var1.q, \
+  script_env_var2.q, \
+  script_pipe.q, \
+  scriptfile1.q, \
+  semijoin.q, \
+  skewjoin.q, \
+  skewjoin_noskew.q, \
+  skewjoin_union_remove_1.q, \
+  skewjoin_union_remove_2.q, \
+  skewjoinopt1.q, \
+  skewjoinopt10.q, \
+  skewjoinopt11.q, \
+  skewjoinopt12.q, \
+  skewjoinopt13.q, \
+  skewjoinopt14.q, \
+  skewjoinopt15.q, \
+  skewjoinopt16.q, \
+  skewjoinopt17.q, \
+  skewjoinopt18.q, \
+  skewjoinopt19.q, \
+  skewjoinopt2.q, \
+  skewjoinopt20.q, \
+  skewjoinopt3.q, \
+  skewjoinopt4.q, \
+  skewjoinopt5.q, \
+  skewjoinopt6.q, \
+  skewjoinopt7.q, \
+  skewjoinopt8.q, \
+  skewjoinopt9.q, \
+  smb_mapjoin_1.q, \
+  smb_mapjoin_10.q, \
+  smb_mapjoin_11.q, \
+  smb_mapjoin_12.q, \
+  smb_mapjoin_13.q, \
+  smb_mapjoin_14.q, \
+  smb_mapjoin_15.q, \
+  smb_mapjoin_16.q, \
+  smb_mapjoin_17.q, \
+  smb_mapjoin_18.q, \
+  smb_mapjoin_19.q, \
+  smb_mapjoin_2.q, \
+  smb_mapjoin_20.q, \
+  smb_mapjoin_21.q, \
+  smb_mapjoin_22.q, \
+  smb_mapjoin_25.q, \
+  smb_mapjoin_3.q, \
+  smb_mapjoin_4.q, \
+  smb_mapjoin_5.q, \
+  smb_mapjoin_6.q, \
+  smb_mapjoin_7.q, \
+  smb_mapjoin_8.q, \
+  smb_mapjoin_9.q, \
+  sort.q, \
+  stats0.q, \
+  stats1.q, \
+  stats10.q, \
+  stats12.q, \
+  stats13.q, \
+  stats14.q, \
+  stats15.q, \
+  stats16.q, \
+  stats18.q, \
+  stats2.q, \
+  stats20.q, \
+  stats3.q, \
+  stats5.q, \
+  stats6.q, \
+  stats7.q, \
+  stats8.q, \
+  stats9.q, \
+  stats_counter.q, \
+  stats_counter_partitioned.q, \
+  stats_noscan_1.q, \
+  stats_noscan_2.q, \
+  stats_only_null.q, \
+  stats_partscan_1_23.q, \
+  statsfs.q, \
+  subquery_exists.q, \
+  subquery_in.q, \
+  subquery_multiinsert.q, \
+  table_access_keys_stats.q, \
+  temp_table.q, \
+  temp_table_join1.q, \
+  tez_join_tests.q, \
+  tez_joins_explain.q, \
+  timestamp_1.q, \
+  timestamp_2.q, \
+  timestamp_3.q, \
+  timestamp_comparison.q, \
+  timestamp_lazy.q, \
+  timestamp_null.q, \
+  timestamp_udf.q, \
+  transform1.q, \
+  transform2.q, \
+  transform_ppr1.q, \
+  transform_ppr2.q, \
+  udf_example_add.q, \
+  udf_in_file.q, \
+  union.q, \
+  union10.q, \
+  union11.q, \
+  union12.q, \
+  union13.q, \
+  union14.q, \
+  union15.q, \
+  union16.q, \
+  union17.q, \
+  union18.q, \
+  union19.q, \
+  union2.q, \
+  union20.q, \
+  union21.q, \
+  union22.q, \
+  union23.q, \
+  union24.q, \
+  union25.q, \
+  union26.q, \
+  union27.q, \
+  union28.q, \
+  union29.q, \
+  union3.q, \
+  union30.q, \
+  union31.q, \
+  union32.q, \
+  union33.q, \
+  union34.q, \
+  union4.q, \
+  union5.q, \
+  union6.q, \
+  union7.q, \
+  union8.q, \
+  union9.q, \
+  union_date.q, \
+  union_date_trim.q, \
+  union_lateralview.q, \
+  union_null.q, \
+  union_ppr.q, \
+  union_remove_1.q, \
+  union_remove_10.q, \
+  union_remove_11.q, \
+  union_remove_12.q, \
+  union_remove_13.q, \
+  union_remove_14.q, \
+  union_remove_15.q, \
+  union_remove_16.q, \
+  union_remove_17.q, \
+  union_remove_18.q, \
+  union_remove_19.q, \
+  union_remove_2.q, \
+  union_remove_20.q, \
+  union_remove_21.q, \
+  union_remove_22.q, \
+  union_remove_23.q, \
+  union_remove_24.q, \
+  union_remove_25.q, \
+  union_remove_3.q, \
+  union_remove_4.q, \
+  union_remove_5.q, \
+  union_remove_6.q, \
+  union_remove_6_subq.q, \
+  union_remove_7.q, \
+  union_remove_8.q, \
+  union_remove_9.q, \
+  union_script.q, \
+  union_top_level.q, \
+  uniquejoin.q, \
+  union_view.q, \
+  varchar_join1.q, \
+  vector_between_in.q, \
+  vector_cast_constant.q, \
+  vector_char_4.q, \
+  vector_count_distinct.q, \
+  vector_data_types.q, \
+  vector_decimal_aggregate.q, \
+  vector_decimal_mapjoin.q, \
+  vector_distinct_2.q, \
+  vector_elt.q, \
+  vector_groupby_3.q, \
+  vector_left_outer_join.q, \
+  vector_mapjoin_reduce.q, \
+  vector_orderby_5.q, \
+  vector_string_concat.q, \
+  vector_varchar_4.q, \
+  vectorization_0.q, \
+  vectorization_1.q, \
+  vectorization_10.q, \
+  vectorization_11.q, \
+  vectorization_12.q, \
+  vectorization_13.q, \
+  vectorization_14.q, \
+  vectorization_15.q, \
+  vectorization_16.q, \
+  vectorization_17.q, \
+  vectorization_2.q, \
+  vectorization_3.q, \
+  vectorization_4.q, \
+  vectorization_5.q, \
+  vectorization_6.q, \
+  vectorization_9.q, \
+  vectorization_decimal_date.q, \
+  vectorization_div0.q, \
+  vectorization_nested_udf.q, \
+  vectorization_not.q, \
+  vectorization_part.q, \
+  vectorization_part_project.q, \
+  vectorization_pushdown.q, \
+  vectorization_short_regress.q, \
+  vectorized_case.q, \
+  vectorized_mapjoin.q, \
+  vectorized_math_funcs.q, \
+  vectorized_nested_mapjoin.q, \
+  vectorized_ptf.q, \
+  vectorized_rcfile_columnar.q, \
+  vectorized_shufflejoin.q, \
+  vectorized_string_funcs.q, \
+  vectorized_timestamp_funcs.q, \
+  windowing.q
+
+# Unlike "spark.query.files" above, these tests only run
+# under Spark engine.
+spark.only.query.files=spark_dynamic_partition_pruning.q,\
+  spark_dynamic_partition_pruning_2.q,\
+  spark_vectorized_dynamic_partition_pruning.q
+
+miniSparkOnYarn.query.files=auto_sortmerge_join_16.q,\
+  bucket4.q,\
+  bucket5.q,\
+  bucket6.q,\
+  bucketizedhiveinputformat.q,\
+  bucketmapjoin6.q,\
+  bucketmapjoin7.q,\
+  constprog_partitioner.q,\
+  disable_merge_for_bucketing.q,\
+  empty_dir_in_table.q,\
+  external_table_with_space_in_location_path.q,\
+  file_with_header_footer.q,\
+  import_exported_table.q,\
+  index_bitmap3.q,\
+  index_bitmap_auto.q,\
+  infer_bucket_sort_bucketed_table.q,\
+  infer_bucket_sort_map_operators.q,\
+  infer_bucket_sort_merge.q,\
+  infer_bucket_sort_num_buckets.q,\
+  infer_bucket_sort_reducers_power_two.q,\
+  input16_cc.q,\
+  leftsemijoin_mr.q,\
+  list_bucket_dml_10.q,\
+  load_fs2.q,\
+  load_hdfs_file_with_space_in_the_name.q,\
+  optrstat_groupby.q,\
+  orc_merge1.q,\
+  orc_merge2.q,\
+  orc_merge3.q,\
+  orc_merge4.q,\
+  orc_merge5.q,\
+  orc_merge6.q,\
+  orc_merge7.q,\
+  orc_merge8.q,\
+  orc_merge9.q,\
+  orc_merge_diff_fs.q,\
+  orc_merge_incompat1.q,\
+  orc_merge_incompat2.q,\
+  parallel_orderby.q,\
+  ql_rewrite_gbtoidx.q,\
+  ql_rewrite_gbtoidx_cbo_1.q,\
+  quotedid_smb.q,\
+  reduce_deduplicate.q,\
+  remote_script.q,\
+  root_dir_external_table.q,\
+  schemeAuthority.q,\
+  schemeAuthority2.q,\
+  scriptfile1.q,\
+  scriptfile1_win.q,\
+  smb_mapjoin_8.q,\
+  stats_counter.q,\
+  stats_counter_partitioned.q,\
+  temp_table_external.q,\
+  truncate_column_buckets.q,\
+  uber_reduce.q,\
+  vector_inner_join.q,\
+  vector_outer_join0.q,\
+  vector_outer_join1.q,\
+  vector_outer_join2.q,\
+  vector_outer_join3.q,\
+  vector_outer_join4.q,\
+  vector_outer_join5.q
+
+spark.query.negative.files=groupby2_map_skew_multi_distinct.q

http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java
index af43a07..3d7e4f0 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java
@@ -662,12 +662,21 @@ public class VectorizedBatchUtil {
   public static void debugDisplayOneRow(VectorizedRowBatch batch, int index, String prefix) {
     StringBuilder sb = new StringBuilder();
     sb.append(prefix + " row " + index + " ");
-    for (int column = 0; column < batch.cols.length; column++) {
+    for (int p = 0; p < batch.projectionSize; p++) {
+      int column = batch.projectedColumns[p];
+      if (p == column) {
+        sb.append("(col " + p + ") ");
+      } else {
+        sb.append("(proj col " + p + " col " + column + ") ");
+      }
       ColumnVector colVector = batch.cols[column];
       if (colVector == null) {
-        sb.append("(null colVector " + column + ")");
+        sb.append("(null ColumnVector)");
       } else {
         boolean isRepeating = colVector.isRepeating;
+        if (isRepeating) {
+          sb.append("(repeating)");
+        }
         index = (isRepeating ? 0 : index);
         if (colVector.noNulls || !colVector.isNull[index]) {
           if (colVector instanceof LongColumnVector) {
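
The hunk above changes debugDisplayOneRow() to walk the batch's projection
(projectionSize / projectedColumns) instead of every physical slot in
batch.cols, and to tag repeating vectors. A hedged sketch of why that
matters, using simplified stand-in types rather than the real
VectorizedRowBatch class:

    // The stand-in types here are assumptions for illustration, not the
    // actual Hive classes. A batch can carry more physical column vectors
    // than the query projects; projectedColumns maps logical -> physical.
    public class ProjectionIterationDemo {
        static class Batch {
            Object[] cols = new Object[5];    // physical column vectors
            int[] projectedColumns = {0, 3};  // logical position -> slot
            int projectionSize = 2;
        }

        public static void main(String[] args) {
            Batch batch = new Batch();
            // The old loop (0 .. cols.length) touched unprojected slots,
            // which may be null or hold stale scratch data; the new loop
            // visits only the columns the query actually projects.
            for (int p = 0; p < batch.projectionSize; p++) {
                int column = batch.projectedColumns[p];
                System.out.println("(proj col " + p + " col " + column + ")");
            }
        }
    }

A repeating vector stores one value in row 0 that stands for every row,
which is why the method resets index to 0 (and, with this change, prints
"(repeating)") when colVector.isRepeating is set.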

http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java.orig
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java.orig b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java.orig
new file mode 100644
index 0000000..af43a07
--- /dev/null
+++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java.orig
@@ -0,0 +1,707 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.vector;
+
+import java.io.IOException;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.hive.common.ObjectPair;
+import org.apache.hadoop.hive.common.type.HiveChar;
+import org.apache.hadoop.hive.common.type.HiveIntervalDayTime;
+import org.apache.hadoop.hive.common.type.HiveIntervalYearMonth;
+import org.apache.hadoop.hive.common.type.HiveVarchar;
+import org.apache.hadoop.hive.ql.exec.Utilities;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.io.ByteWritable;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.DoubleWritable;
+import org.apache.hadoop.hive.serde2.io.HiveCharWritable;
+import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
+import org.apache.hadoop.hive.serde2.io.HiveIntervalDayTimeWritable;
+import org.apache.hadoop.hive.serde2.io.HiveIntervalYearMonthWritable;
+import org.apache.hadoop.hive.serde2.io.HiveVarcharWritable;
+import org.apache.hadoop.hive.serde2.io.ShortWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
+import org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
+import org.apache.hadoop.io.BooleanWritable;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.DataOutputBuffer;
+import org.apache.hadoop.io.FloatWritable;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hive.common.util.DateUtils;
+
+public class VectorizedBatchUtil {
+  private static final Log LOG = LogFactory.getLog(VectorizedBatchUtil.class);
+
+  /**
+   * Sets the IsNull value for ColumnVector at specified index
+   * @param cv
+   * @param rowIndex
+   */
+  public static void setNullColIsNullValue(ColumnVector cv, int rowIndex) {
+    cv.isNull[rowIndex] = true;
+    if (cv.noNulls) {
+      cv.noNulls = false;
+    }
+  }
+
+  /**
+   * Iterates thru all the column vectors and sets noNull to
+   * specified value.
+   *
+   * @param batch
+   *          Batch on which noNull is set
+   */
+  public static void setNoNullFields(VectorizedRowBatch batch) {
+    for (int i = 0; i < batch.numCols; i++) {
+      batch.cols[i].noNulls = true;
+    }
+  }
+
+  /**
+   * Iterates thru all the column vectors and sets repeating to
+   * specified column.
+   *
+   */
+  public static void setRepeatingColumn(VectorizedRowBatch batch, int column) {
+    ColumnVector cv = batch.cols[column];
+    cv.isRepeating = true;
+  }
+
+  /**
+   * Reduce the batch size for a vectorized row batch
+   */
+  public static void setBatchSize(VectorizedRowBatch batch, int size) {
+    assert (size <= batch.getMaxSize());
+    batch.size = size;
+  }
+
+  /**
+   * Walk through the object inspector and add column vectors
+   *
+   * @param oi
+   * @param cvList
+   *          ColumnVectors are populated in this list
+   */
+  private static void allocateColumnVector(StructObjectInspector oi,
+      List<ColumnVector> cvList) throws HiveException {
+    if (cvList == null) {
+      throw new HiveException("Null columnvector list");
+    }
+    if (oi == null) {
+      return;
+    }
+    final List<? extends StructField> fields = oi.getAllStructFieldRefs();
+    for(StructField field : fields) {
+      ObjectInspector fieldObjectInspector = field.getFieldObjectInspector();
+      switch(fieldObjectInspector.getCategory()) {
+      case PRIMITIVE:
+        PrimitiveObjectInspector poi = (PrimitiveObjectInspector) fieldObjectInspector;
+        switch(poi.getPrimitiveCategory()) {
+        case BOOLEAN:
+        case BYTE:
+        case SHORT:
+        case INT:
+        case LONG:
+        case TIMESTAMP:
+        case DATE:
+        case INTERVAL_YEAR_MONTH:
+        case INTERVAL_DAY_TIME:
+          cvList.add(new LongColumnVector(VectorizedRowBatch.DEFAULT_SIZE));
+          break;
+        case FLOAT:
+        case DOUBLE:
+          cvList.add(new DoubleColumnVector(VectorizedRowBatch.DEFAULT_SIZE));
+          break;
+        case BINARY:
+        case STRING:
+        case CHAR:
+        case VARCHAR:
+          cvList.add(new BytesColumnVector(VectorizedRowBatch.DEFAULT_SIZE));
+          break;
+        case DECIMAL:
+          DecimalTypeInfo tInfo = (DecimalTypeInfo) poi.getTypeInfo();
+          cvList.add(new DecimalColumnVector(VectorizedRowBatch.DEFAULT_SIZE,
+              tInfo.precision(), tInfo.scale()));
+          break;
+        default:
+          throw new HiveException("Vectorizaton is not supported for datatype:"
+              + poi.getPrimitiveCategory());
+        }
+        break;
+      case STRUCT:
+        throw new HiveException("Struct not supported");
+      default:
+        throw new HiveException("Flattening is not supported for datatype:"
+            + fieldObjectInspector.getCategory());
+      }
+    }
+  }
+
+
+  /**
+   * Create VectorizedRowBatch from ObjectInspector
+   *
+   * @param oi
+   * @return
+   * @throws HiveException
+   */
+  public static VectorizedRowBatch constructVectorizedRowBatch(
+      StructObjectInspector oi) throws HiveException {
+    final List<ColumnVector> cvList = new LinkedList<ColumnVector>();
+    allocateColumnVector(oi, cvList);
+    final VectorizedRowBatch result = new VectorizedRowBatch(cvList.size());
+    int i = 0;
+    for(ColumnVector cv : cvList) {
+      result.cols[i++] = cv;
+    }
+    return result;
+  }
+
+  /**
+   * Create VectorizedRowBatch from key and value object inspectors
+   * The row object inspector used by ReduceWork needs to be a **standard**
+   * struct object inspector, not just any struct object inspector.
+   * @param keyInspector
+   * @param valueInspector
+   * @param vectorScratchColumnTypeMap
+   * @return VectorizedRowBatch, OI
+   * @throws HiveException
+   */
+  public static ObjectPair<VectorizedRowBatch, StandardStructObjectInspector> constructVectorizedRowBatch(
+      StructObjectInspector keyInspector, StructObjectInspector valueInspector, Map<Integer, String> vectorScratchColumnTypeMap)
+          throws HiveException {
+
+    ArrayList<String> colNames = new ArrayList<String>();
+    ArrayList<ObjectInspector> ois = new ArrayList<ObjectInspector>();
+    List<? extends StructField> fields = keyInspector.getAllStructFieldRefs();
+    for (StructField field: fields) {
+      colNames.add(Utilities.ReduceField.KEY.toString() + "." + field.getFieldName());
+      ois.add(field.getFieldObjectInspector());
+    }
+    fields = valueInspector.getAllStructFieldRefs();
+    for (StructField field: fields) {
+      colNames.add(Utilities.ReduceField.VALUE.toString() + "." + field.getFieldName());
+      ois.add(field.getFieldObjectInspector());
+    }
+    StandardStructObjectInspector rowObjectInspector = ObjectInspectorFactory.getStandardStructObjectInspector(colNames, ois);
+
+    VectorizedRowBatchCtx batchContext = new VectorizedRowBatchCtx();
+    batchContext.init(vectorScratchColumnTypeMap, rowObjectInspector);
+    return new ObjectPair<>(batchContext.createVectorizedRowBatch(), rowObjectInspector);
+  }
+
+  /**
+   * Iterates through all columns in a given row and populates the batch
+   *
+   * @param row
+   * @param oi
+   * @param rowIndex
+   * @param batch
+   * @param buffer
+   * @throws HiveException
+   */
+  public static void addRowToBatch(Object row, StructObjectInspector oi,
+          int rowIndex,
+          VectorizedRowBatch batch,
+          DataOutputBuffer buffer
+          ) throws HiveException {
+    addRowToBatchFrom(row, oi, rowIndex, 0, batch, buffer);
+  }
+
+  /**
+   * Iterates thru all the columns in a given row and populates the batch
+   * from a given offset
+   *
+   * @param row Deserialized row object
+   * @param oi Object insepector for that row
+   * @param rowIndex index to which the row should be added to batch
+   * @param colOffset offset from where the column begins
+   * @param batch Vectorized batch to which the row is added at rowIndex
+   * @throws HiveException
+   */
+  public static void addRowToBatchFrom(Object row, StructObjectInspector oi,
+                                   int rowIndex,
+                                   int colOffset,
+                                   VectorizedRowBatch batch,
+                                   DataOutputBuffer buffer
+                                   ) throws HiveException {
+    List<? extends StructField> fieldRefs = oi.getAllStructFieldRefs();
+    final int off = colOffset;
+    // Iterate thru the cols and load the batch
+    for (int i = 0; i < fieldRefs.size(); i++) {
+      setVector(row, oi, fieldRefs.get(i), batch, buffer, rowIndex, i, off);
+    }
+  }
+
+  /**
+   * Add only the projected column of a regular row to the specified vectorized row batch
+   * @param row the regular row
+   * @param oi object inspector for the row
+   * @param rowIndex the offset to add in the batch
+   * @param batch vectorized row batch
+   * @param buffer data output buffer
+   * @throws HiveException
+   */
+  public static void addProjectedRowToBatchFrom(Object row, StructObjectInspector oi,
+      int rowIndex, VectorizedRowBatch batch, DataOutputBuffer buffer) throws HiveException {
+    List<? extends StructField> fieldRefs = oi.getAllStructFieldRefs();
+    for (int i = 0; i < fieldRefs.size(); i++) {
+      int projectedOutputCol = batch.projectedColumns[i];
+      if (batch.cols[projectedOutputCol] == null) {
+        continue;
+      }
+      setVector(row, oi, fieldRefs.get(i), batch, buffer, rowIndex, projectedOutputCol, 0);
+    }
+  }
+  /**
+   * Iterates thru all the columns in a given row and populates the batch
+   * from a given offset
+   *
+   * @param row Deserialized row object
+   * @param oi Object insepector for that row
+   * @param rowIndex index to which the row should be added to batch
+   * @param batch Vectorized batch to which the row is added at rowIndex
+   * @param context context object for this vectorized batch
+   * @param buffer
+   * @throws HiveException
+   */
+  public static void acidAddRowToBatch(Object row,
+                                       StructObjectInspector oi,
+                                       int rowIndex,
+                                       VectorizedRowBatch batch,
+                                       VectorizedRowBatchCtx context,
+                                       DataOutputBuffer buffer) throws HiveException {
+    List<? extends StructField> fieldRefs = oi.getAllStructFieldRefs();
+    // Iterate thru the cols and load the batch
+    for (int i = 0; i < fieldRefs.size(); i++) {
+      if (batch.cols[i] == null) {
+        // This means the column was not included in the projection from the underlying read
+        continue;
+      }
+      if (context.isPartitionCol(i)) {
+        // The value will have already been set before we're called, so don't overwrite it
+        continue;
+      }
+      setVector(row, oi, fieldRefs.get(i), batch, buffer, rowIndex, i, 0);
+    }
+  }
+
+  private static void setVector(Object row,
+                                StructObjectInspector oi,
+                                StructField field,
+                                VectorizedRowBatch batch,
+                                DataOutputBuffer buffer,
+                                int rowIndex,
+                                int colIndex,
+                                int offset) throws HiveException {
+
+    Object fieldData = oi.getStructFieldData(row, field);
+    ObjectInspector foi = field.getFieldObjectInspector();
+
+    // Vectorization only supports PRIMITIVE data types. Assert the same
+    assert (foi.getCategory() == Category.PRIMITIVE);
+
+    // Get writable object
+    PrimitiveObjectInspector poi = (PrimitiveObjectInspector) foi;
+    Object writableCol = poi.getPrimitiveWritableObject(fieldData);
+
+    // NOTE: The default value for null fields in vectorization is 1 for int types, NaN for
+    // float/double. String types have no default value for null.
+    switch (poi.getPrimitiveCategory()) {
+    case BOOLEAN: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        lcv.vector[rowIndex] = ((BooleanWritable) writableCol).get() ? 1 : 0;
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case BYTE: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        lcv.vector[rowIndex] = ((ByteWritable) writableCol).get();
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case SHORT: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        lcv.vector[rowIndex] = ((ShortWritable) writableCol).get();
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case INT: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        lcv.vector[rowIndex] = ((IntWritable) writableCol).get();
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case LONG: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        lcv.vector[rowIndex] = ((LongWritable) writableCol).get();
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case DATE: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        lcv.vector[rowIndex] = ((DateWritable) writableCol).getDays();
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case FLOAT: {
+      DoubleColumnVector dcv = (DoubleColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        dcv.vector[rowIndex] = ((FloatWritable) writableCol).get();
+        dcv.isNull[rowIndex] = false;
+      } else {
+        dcv.vector[rowIndex] = Double.NaN;
+        setNullColIsNullValue(dcv, rowIndex);
+      }
+    }
+      break;
+    case DOUBLE: {
+      DoubleColumnVector dcv = (DoubleColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        dcv.vector[rowIndex] = ((DoubleWritable) writableCol).get();
+        dcv.isNull[rowIndex] = false;
+      } else {
+        dcv.vector[rowIndex] = Double.NaN;
+        setNullColIsNullValue(dcv, rowIndex);
+      }
+    }
+      break;
+    case TIMESTAMP: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        Timestamp t = ((TimestampWritable) writableCol).getTimestamp();
+        lcv.vector[rowIndex] = TimestampUtils.getTimeNanoSec(t);
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case INTERVAL_YEAR_MONTH: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        HiveIntervalYearMonth i = ((HiveIntervalYearMonthWritable) writableCol).getHiveIntervalYearMonth();
+        lcv.vector[rowIndex] = i.getTotalMonths();
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case INTERVAL_DAY_TIME: {
+      LongColumnVector lcv = (LongColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        HiveIntervalDayTime i = ((HiveIntervalDayTimeWritable) writableCol).getHiveIntervalDayTime();
+        lcv.vector[rowIndex] = DateUtils.getIntervalDayTimeTotalNanos(i);
+        lcv.isNull[rowIndex] = false;
+      } else {
+        lcv.vector[rowIndex] = 1;
+        setNullColIsNullValue(lcv, rowIndex);
+      }
+    }
+      break;
+    case BINARY: {
+      BytesColumnVector bcv = (BytesColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+          bcv.isNull[rowIndex] = false;
+          BytesWritable bw = (BytesWritable) writableCol;
+          byte[] bytes = bw.getBytes();
+          int start = buffer.getLength();
+          int length = bw.getLength();
+          try {
+            buffer.write(bytes, 0, length);
+          } catch (IOException ioe) {
+            throw new IllegalStateException("bad write", ioe);
+          }
+          bcv.setRef(rowIndex, buffer.getData(), start, length);
+      } else {
+        setNullColIsNullValue(bcv, rowIndex);
+      }
+    }
+      break;
+    case STRING: {
+      BytesColumnVector bcv = (BytesColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        bcv.isNull[rowIndex] = false;
+        Text colText = (Text) writableCol;
+        int start = buffer.getLength();
+        int length = colText.getLength();
+        try {
+          buffer.write(colText.getBytes(), 0, length);
+        } catch (IOException ioe) {
+          throw new IllegalStateException("bad write", ioe);
+        }
+        bcv.setRef(rowIndex, buffer.getData(), start, length);
+      } else {
+        setNullColIsNullValue(bcv, rowIndex);
+      }
+    }
+      break;
+    case CHAR: {
+      BytesColumnVector bcv = (BytesColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        bcv.isNull[rowIndex] = false;
+        HiveChar colHiveChar = ((HiveCharWritable) writableCol).getHiveChar();
+        byte[] bytes = colHiveChar.getStrippedValue().getBytes();
+
+        // We assume the CHAR maximum length was enforced when the object was created.
+        int length = bytes.length;
+
+        int start = buffer.getLength();
+        try {
+          // In vector mode, we store CHAR as unpadded.
+          buffer.write(bytes, 0, length);
+        } catch (IOException ioe) {
+          throw new IllegalStateException("bad write", ioe);
+        }
+        bcv.setRef(rowIndex, buffer.getData(), start, length);
+      } else {
+        setNullColIsNullValue(bcv, rowIndex);
+      }
+    }
+      break;
+    case VARCHAR: {
+        BytesColumnVector bcv = (BytesColumnVector) batch.cols[offset + colIndex];
+        if (writableCol != null) {
+          bcv.isNull[rowIndex] = false;
+          HiveVarchar colHiveVarchar = ((HiveVarcharWritable) writableCol).getHiveVarchar();
+          byte[] bytes = colHiveVarchar.getValue().getBytes();
+
+          // We assume the VARCHAR maximum length was enforced when the object was created.
+          int length = bytes.length;
+
+          int start = buffer.getLength();
+          try {
+            buffer.write(bytes, 0, length);
+          } catch (IOException ioe) {
+            throw new IllegalStateException("bad write", ioe);
+          }
+          bcv.setRef(rowIndex, buffer.getData(), start, length);
+        } else {
+          setNullColIsNullValue(bcv, rowIndex);
+        }
+      }
+        break;
+    case DECIMAL:
+      DecimalColumnVector dcv = (DecimalColumnVector) batch.cols[offset + colIndex];
+      if (writableCol != null) {
+        dcv.isNull[rowIndex] = false;
+        HiveDecimalWritable wobj = (HiveDecimalWritable) writableCol;
+        dcv.set(rowIndex, wobj);
+      } else {
+        setNullColIsNullValue(dcv, rowIndex);
+      }
+      break;
+    default:
+      throw new HiveException("Vectorizaton is not supported for datatype:" +
+          poi.getPrimitiveCategory());
+    }
+  }
+
+  public static StandardStructObjectInspector convertToStandardStructObjectInspector(
+      StructObjectInspector structObjectInspector) throws HiveException {
+
+    List<? extends StructField> fields = structObjectInspector.getAllStructFieldRefs();
+    List<ObjectInspector> oids = new ArrayList<ObjectInspector>();
+    ArrayList<String> columnNames = new ArrayList<String>();
+
+    for(StructField field : fields) {
+      TypeInfo typeInfo = TypeInfoUtils.getTypeInfoFromTypeString(
+          field.getFieldObjectInspector().getTypeName());
+      ObjectInspector standardWritableObjectInspector =
+              TypeInfoUtils.getStandardWritableObjectInspectorFromTypeInfo(typeInfo);
+      oids.add(standardWritableObjectInspector);
+      columnNames.add(field.getFieldName());
+    }
+    return ObjectInspectorFactory.getStandardStructObjectInspector(columnNames, oids);
+  }
+
+  public static PrimitiveTypeInfo[] primitiveTypeInfosFromStructObjectInspector(
+      StructObjectInspector structObjectInspector) throws HiveException {
+
+    List<? extends StructField> fields = structObjectInspector.getAllStructFieldRefs();
+    PrimitiveTypeInfo[] result = new PrimitiveTypeInfo[fields.size()];
+
+    int i = 0;
+    for(StructField field : fields) {
+      TypeInfo typeInfo = TypeInfoUtils.getTypeInfoFromTypeString(
+          field.getFieldObjectInspector().getTypeName());
+      result[i++] = (PrimitiveTypeInfo) typeInfo;
+    }
+    return result;
+  }
+
+  public static PrimitiveTypeInfo[] primitiveTypeInfosFromTypeNames(
+      String[] typeNames) throws HiveException {
+
+    PrimitiveTypeInfo[] result = new PrimitiveTypeInfo[typeNames.length];
+
+    for(int i = 0; i < typeNames.length; i++) {
+      TypeInfo typeInfo = TypeInfoUtils.getTypeInfoFromTypeString(typeNames[i]);
+      result[i] = (PrimitiveTypeInfo) typeInfo;
+    }
+    return result;
+  }
+
+  /**
+   * Make a new (scratch) batch, which is exactly "like" the batch provided, except that it's empty.
+   * @param batch the batch to imitate
+   * @return the new batch
+   * @throws HiveException
+   */
+  public static VectorizedRowBatch makeLike(VectorizedRowBatch batch) throws HiveException {
+    VectorizedRowBatch newBatch = new VectorizedRowBatch(batch.numCols);
+    for (int i = 0; i < batch.numCols; i++) {
+      ColumnVector colVector = batch.cols[i];
+      if (colVector != null) {
+        ColumnVector newColVector;
+        if (colVector instanceof LongColumnVector) {
+          newColVector = new LongColumnVector();
+        } else if (colVector instanceof DoubleColumnVector) {
+          newColVector = new DoubleColumnVector();
+        } else if (colVector instanceof BytesColumnVector) {
+          newColVector = new BytesColumnVector();
+        } else if (colVector instanceof DecimalColumnVector) {
+          DecimalColumnVector decColVector = (DecimalColumnVector) colVector;
+          newColVector = new DecimalColumnVector(decColVector.precision, decColVector.scale);
+        } else {
+          throw new HiveException("Column vector class " + colVector.getClass().getName() +
+              " is not supported!");
+        }
+        newBatch.cols[i] = newColVector;
+        newBatch.cols[i].init();
+      }
+    }
+    newBatch.projectedColumns = Arrays.copyOf(batch.projectedColumns, batch.projectedColumns.length);
+    newBatch.projectionSize = batch.projectionSize;
+    newBatch.reset();
+    return newBatch;
+  }
+
+  public static String displayBytes(byte[] bytes, int start, int length) {
+    StringBuilder sb = new StringBuilder();
+    for (int i = start; i < start + length; i++) {
+      char ch = (char) bytes[i];
+      if (ch < ' ' || ch > '~') {
+        sb.append(String.format("\\%03d", bytes[i] & 0xff));
+      } else {
+        sb.append(ch);
+      }
+    }
+    return sb.toString();
+  }
+
+  public static void debugDisplayOneRow(VectorizedRowBatch batch, int index, String prefix) {
+    StringBuilder sb = new StringBuilder();
+    sb.append(prefix + " row " + index + " ");
+    for (int column = 0; column < batch.cols.length; column++) {
+      ColumnVector colVector = batch.cols[column];
+      if (colVector == null) {
+        sb.append("(null colVector " + column + ")");
+      } else {
+        // Use a local: a repeating column reads row 0, without clobbering 'index' for the remaining columns.
+        int valueIndex = (colVector.isRepeating ? 0 : index);
+        if (colVector.noNulls || !colVector.isNull[valueIndex]) {
+          if (colVector instanceof LongColumnVector) {
+            sb.append(((LongColumnVector) colVector).vector[valueIndex]);
+          } else if (colVector instanceof DoubleColumnVector) {
+            sb.append(((DoubleColumnVector) colVector).vector[valueIndex]);
+          } else if (colVector instanceof BytesColumnVector) {
+            BytesColumnVector bytesColumnVector = (BytesColumnVector) colVector;
+            byte[] bytes = bytesColumnVector.vector[valueIndex];
+            int start = bytesColumnVector.start[valueIndex];
+            int length = bytesColumnVector.length[valueIndex];
+            if (bytes == null) {
+              sb.append("(Unexpected null bytes with start " + start + " length " + length + ")");
+            } else {
+              sb.append("bytes: '" + displayBytes(bytes, start, length) + "'");
+            }
+          } else if (colVector instanceof DecimalColumnVector) {
+            sb.append(((DecimalColumnVector) colVector).vector[valueIndex].toString());
+          } else {
+            sb.append("Unknown");
+          }
+        } else {
+          sb.append("NULL");
+        }
+      }
+      sb.append(" ");
+    }
+    LOG.info(sb.toString());
+  }
+
+  public static void debugDisplayBatch(VectorizedRowBatch batch, String prefix) {
+    for (int i = 0; i < batch.size; i++) {
+      int index = (batch.selectedInUse ? batch.selected[i] : i);
+      debugDisplayOneRow(batch, index, prefix);
+    }
+  }
+}

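Side note on the STRING/CHAR/VARCHAR cases above: they all follow the same pattern. The raw bytes are appended to a single shared, growing byte buffer, and the BytesColumnVector records only a (buffer, start, length) reference via setRef() instead of making a per-row copy. Below is a minimal standalone sketch of that pattern; it is illustrative only, and assumes Hadoop's DataOutputBuffer (which matches the getLength()/write()/getData() calls above) and Hive's BytesColumnVector are on the classpath.

    import java.io.IOException;

    import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
    import org.apache.hadoop.io.DataOutputBuffer;

    public class SetRefSketch {
      public static void main(String[] args) throws IOException {
        DataOutputBuffer buffer = new DataOutputBuffer();
        BytesColumnVector bcv = new BytesColumnVector();
        bcv.init();

        byte[] value = "hello".getBytes();
        int start = buffer.getLength();        // offset before the append
        buffer.write(value, 0, value.length);  // append into the shared buffer
        // Row 0 now points into the buffer's backing array; no per-row copy.
        bcv.setRef(0, buffer.getData(), start, value.length);
      }
    }
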
http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/udf/VectorUDFArgDesc.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/udf/VectorUDFArgDesc.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/udf/VectorUDFArgDesc.java
index e113980..749ddea 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/udf/VectorUDFArgDesc.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/udf/VectorUDFArgDesc.java
@@ -27,6 +27,7 @@ import org.apache.hadoop.hive.ql.metadata.HiveException;
 import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
 import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector.PrimitiveCategory;
 import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
 import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
@@ -52,6 +53,17 @@ public class VectorUDFArgDesc implements Serializable {
    */
   public void setConstant(ExprNodeConstantDesc expr) {
     isConstant = true;
+    if (expr != null) {
+      if (expr.getTypeInfo().getCategory() == Category.PRIMITIVE) {
+        PrimitiveCategory primitiveCategory = ((PrimitiveTypeInfo) expr.getTypeInfo())
+            .getPrimitiveCategory();
+        if (primitiveCategory == PrimitiveCategory.VOID) {
+          // Otherwise we'd create a NullWritable, and aggregates like COUNT would see a non-null value.
+          expr = null;
+        }
+      }
+    }
+
     constExpr = expr;
   }
 

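For context on the change above: a SQL NULL literal (e.g. the ELSE NULL branch of a CASE) reaches setConstant() as a constant whose type falls in the primitive VOID category, and that is the shape the new guard matches. A small illustrative sketch follows; it is not part of the patch, and assumes Hive's ExprNodeConstantDesc and TypeInfoFactory APIs.

    import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
    import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector.PrimitiveCategory;
    import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
    import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;

    public class VoidConstantSketch {
      public static void main(String[] args) {
        // ELSE NULL in a CASE expression compiles to a VOID-typed null constant.
        ExprNodeConstantDesc nullConst =
            new ExprNodeConstantDesc(TypeInfoFactory.voidTypeInfo, null);

        boolean isVoid =
            nullConst.getTypeInfo().getCategory() == Category.PRIMITIVE
                && ((PrimitiveTypeInfo) nullConst.getTypeInfo()).getPrimitiveCategory()
                    == PrimitiveCategory.VOID;
        // true: this is the case the new guard nulls out, so the UDF later
        // receives a real null rather than a NullWritable object.
        System.out.println(isVoid);
      }
    }
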
http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/ql/src/test/queries/clientpositive/vector_when_case_null.q
----------------------------------------------------------------------
diff --git a/ql/src/test/queries/clientpositive/vector_when_case_null.q b/ql/src/test/queries/clientpositive/vector_when_case_null.q
new file mode 100644
index 0000000..a423b60
--- /dev/null
+++ b/ql/src/test/queries/clientpositive/vector_when_case_null.q
@@ -0,0 +1,14 @@
+set hive.explain.user=false;
+SET hive.vectorized.execution.enabled=true;
+SET hive.auto.convert.join=true;
+set hive.fetch.task.conversion=none;
+
+-- SORT_QUERY_RESULTS
+
+create table count_case_groupby (key string, bool boolean) STORED AS orc;
+insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL);
+
+explain
+SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
+
+SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
\ No newline at end of file

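A note on the expected semantics of this test: COUNT(expr) ignores NULLs, so rows whose CASE expression falls through to ELSE NULL (key3 and key5, where bool is NULL) must contribute nothing to the count. That is what the golden outputs below show, with cnt_bool0_ok of 0 for those keys; the unfixed vectorized ORC path had returned 1 for them.
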
http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/ql/src/test/results/clientpositive/tez/vector_select_null2.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/tez/vector_select_null2.q.out b/ql/src/test/results/clientpositive/tez/vector_select_null2.q.out
new file mode 100644
index 0000000..f9dad4e
--- /dev/null
+++ b/ql/src/test/results/clientpositive/tez/vector_select_null2.q.out
@@ -0,0 +1,95 @@
+PREHOOK: query: -- SORT_QUERY_RESULTS
+
+create table count_case_groupby (key string, bool boolean) STORED AS orc
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@count_case_groupby
+POSTHOOK: query: -- SORT_QUERY_RESULTS
+
+create table count_case_groupby (key string, bool boolean) STORED AS orc
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@count_case_groupby
+PREHOOK: query: insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL)
+PREHOOK: type: QUERY
+PREHOOK: Input: default@values__tmp__table__1
+PREHOOK: Output: default@count_case_groupby
+POSTHOOK: query: insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL)
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@values__tmp__table__1
+POSTHOOK: Output: default@count_case_groupby
+POSTHOOK: Lineage: count_case_groupby.bool EXPRESSION [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col2, type:string, comment:), ]
+POSTHOOK: Lineage: count_case_groupby.key SIMPLE [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col1, type:string, comment:), ]
+PREHOOK: query: explain
+SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+PREHOOK: type: QUERY
+POSTHOOK: query: explain
+SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+      Edges:
+        Reducer 2 <- Map 1 (SIMPLE_EDGE)
+#### A masked pattern was here ####
+      Vertices:
+        Map 1 
+            Map Operator Tree:
+                TableScan
+                  alias: count_case_groupby
+                  Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                  Select Operator
+                    expressions: key (type: string), CASE WHEN (bool) THEN (1) WHEN ((not bool)) THEN (0) ELSE (null) END (type: int)
+                    outputColumnNames: _col0, _col1
+                    Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                    Group By Operator
+                      aggregations: count(_col1)
+                      keys: _col0 (type: string)
+                      mode: hash
+                      outputColumnNames: _col0, _col1
+                      Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                      Reduce Output Operator
+                        key expressions: _col0 (type: string)
+                        sort order: +
+                        Map-reduce partition columns: _col0 (type: string)
+                        Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                        value expressions: _col1 (type: bigint)
+        Reducer 2 
+            Reduce Operator Tree:
+              Group By Operator
+                aggregations: count(VALUE._col0)
+                keys: KEY._col0 (type: string)
+                mode: mergepartial
+                outputColumnNames: _col0, _col1
+                Statistics: Num rows: 2 Data size: 180 Basic stats: COMPLETE Column stats: NONE
+                File Output Operator
+                  compressed: false
+                  Statistics: Num rows: 2 Data size: 180 Basic stats: COMPLETE Column stats: NONE
+                  table:
+                      input format: org.apache.hadoop.mapred.TextInputFormat
+                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+PREHOOK: type: QUERY
+PREHOOK: Input: default@count_case_groupby
+#### A masked pattern was here ####
+POSTHOOK: query: SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@count_case_groupby
+#### A masked pattern was here ####
+key1	1
+key2	1
+key3	0
+key4	1
+key5	0

http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/ql/src/test/results/clientpositive/tez/vector_when_case_null.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/tez/vector_when_case_null.q.out b/ql/src/test/results/clientpositive/tez/vector_when_case_null.q.out
new file mode 100644
index 0000000..07a9659
--- /dev/null
+++ b/ql/src/test/results/clientpositive/tez/vector_when_case_null.q.out
@@ -0,0 +1,96 @@
+PREHOOK: query: -- SORT_QUERY_RESULTS
+
+create table count_case_groupby (key string, bool boolean) STORED AS orc
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@count_case_groupby
+POSTHOOK: query: -- SORT_QUERY_RESULTS
+
+create table count_case_groupby (key string, bool boolean) STORED AS orc
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@count_case_groupby
+PREHOOK: query: insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL)
+PREHOOK: type: QUERY
+PREHOOK: Input: default@values__tmp__table__1
+PREHOOK: Output: default@count_case_groupby
+POSTHOOK: query: insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL)
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@values__tmp__table__1
+POSTHOOK: Output: default@count_case_groupby
+POSTHOOK: Lineage: count_case_groupby.bool EXPRESSION [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col2, type:string, comment:), ]
+POSTHOOK: Lineage: count_case_groupby.key SIMPLE [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col1, type:string, comment:), ]
+PREHOOK: query: explain
+SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+PREHOOK: type: QUERY
+POSTHOOK: query: explain
+SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+      Edges:
+        Reducer 2 <- Map 1 (SIMPLE_EDGE)
+#### A masked pattern was here ####
+      Vertices:
+        Map 1 
+            Map Operator Tree:
+                TableScan
+                  alias: count_case_groupby
+                  Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                  Select Operator
+                    expressions: key (type: string), CASE WHEN (bool) THEN (1) WHEN ((not bool)) THEN (0) ELSE (null) END (type: int)
+                    outputColumnNames: _col0, _col1
+                    Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                    Group By Operator
+                      aggregations: count(_col1)
+                      keys: _col0 (type: string)
+                      mode: hash
+                      outputColumnNames: _col0, _col1
+                      Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                      Reduce Output Operator
+                        key expressions: _col0 (type: string)
+                        sort order: +
+                        Map-reduce partition columns: _col0 (type: string)
+                        Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                        value expressions: _col1 (type: bigint)
+        Reducer 2 
+            Reduce Operator Tree:
+              Group By Operator
+                aggregations: count(VALUE._col0)
+                keys: KEY._col0 (type: string)
+                mode: mergepartial
+                outputColumnNames: _col0, _col1
+                Statistics: Num rows: 2 Data size: 180 Basic stats: COMPLETE Column stats: NONE
+                File Output Operator
+                  compressed: false
+                  Statistics: Num rows: 2 Data size: 180 Basic stats: COMPLETE Column stats: NONE
+                  table:
+                      input format: org.apache.hadoop.mapred.TextInputFormat
+                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+            Execution mode: vectorized
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+PREHOOK: type: QUERY
+PREHOOK: Input: default@count_case_groupby
+#### A masked pattern was here ####
+POSTHOOK: query: SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@count_case_groupby
+#### A masked pattern was here ####
+key1	1
+key2	1
+key3	0
+key4	1
+key5	0

http://git-wip-us.apache.org/repos/asf/hive/blob/26728a8a/ql/src/test/results/clientpositive/vector_when_case_null.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/vector_when_case_null.q.out b/ql/src/test/results/clientpositive/vector_when_case_null.q.out
new file mode 100644
index 0000000..16b2d33
--- /dev/null
+++ b/ql/src/test/results/clientpositive/vector_when_case_null.q.out
@@ -0,0 +1,89 @@
+PREHOOK: query: -- SORT_QUERY_RESULTS
+
+create table count_case_groupby (key string, bool boolean) STORED AS orc
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@count_case_groupby
+POSTHOOK: query: -- SORT_QUERY_RESULTS
+
+create table count_case_groupby (key string, bool boolean) STORED AS orc
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@count_case_groupby
+PREHOOK: query: insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL)
+PREHOOK: type: QUERY
+PREHOOK: Input: default@values__tmp__table__1
+PREHOOK: Output: default@count_case_groupby
+POSTHOOK: query: insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL)
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@values__tmp__table__1
+POSTHOOK: Output: default@count_case_groupby
+POSTHOOK: Lineage: count_case_groupby.bool EXPRESSION [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col2, type:string, comment:), ]
+POSTHOOK: Lineage: count_case_groupby.key SIMPLE [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col1, type:string, comment:), ]
+PREHOOK: query: explain
+SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+PREHOOK: type: QUERY
+POSTHOOK: query: explain
+SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: count_case_groupby
+            Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+            Select Operator
+              expressions: key (type: string), CASE WHEN (bool) THEN (1) WHEN ((not bool)) THEN (0) ELSE (null) END (type: int)
+              outputColumnNames: _col0, _col1
+              Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+              Group By Operator
+                aggregations: count(_col1)
+                keys: _col0 (type: string)
+                mode: hash
+                outputColumnNames: _col0, _col1
+                Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                Reduce Output Operator
+                  key expressions: _col0 (type: string)
+                  sort order: +
+                  Map-reduce partition columns: _col0 (type: string)
+                  Statistics: Num rows: 5 Data size: 452 Basic stats: COMPLETE Column stats: NONE
+                  value expressions: _col1 (type: bigint)
+      Reduce Operator Tree:
+        Group By Operator
+          aggregations: count(VALUE._col0)
+          keys: KEY._col0 (type: string)
+          mode: mergepartial
+          outputColumnNames: _col0, _col1
+          Statistics: Num rows: 2 Data size: 180 Basic stats: COMPLETE Column stats: NONE
+          File Output Operator
+            compressed: false
+            Statistics: Num rows: 2 Data size: 180 Basic stats: COMPLETE Column stats: NONE
+            table:
+                input format: org.apache.hadoop.mapred.TextInputFormat
+                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+PREHOOK: type: QUERY
+PREHOOK: Input: default@count_case_groupby
+#### A masked pattern was here ####
+POSTHOOK: query: SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@count_case_groupby
+#### A masked pattern was here ####
+key1	1
+key2	1
+key3	0
+key4	1
+key5	0

