hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12348) Byte array comparison optimization for SIMD
Date Tue, 12 Apr 2016 03:45:25 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236536#comment-15236536
] 

Hive QA commented on HIVE-12348:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771988/HIVE-12348.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 9974 tests executed
*Failed tests:*
{noformat}
TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_group_by
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_unionDistinct_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union_fast_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_count_distinct
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_varchar_simple
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_part_project
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_rcfile_columnar
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dyn_part_max
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7552/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7552/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7552/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771988 - PreCommit-HIVE-TRUNK-Build

> Byte array comparison optimization for SIMD
> -------------------------------------------
>
>                 Key: HIVE-12348
>                 URL: https://issues.apache.org/jira/browse/HIVE-12348
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Minor
>         Attachments: HIVE-12348.patch
>
>
> The current byte comparison implementation in Hive is basic. It handles a byte (1 byte)
at a time, so it's slow.
> There's FastByteComparisons class in Hadoop with sun.misc.Unsafe class. (https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/io/FastByteComparisons.java)
It handles a long integer (8 bytes) at a time, so it's faster.
> Java 8 has String.compare, String.equalTo intrinsics with AVX2 and SSE4.2. (http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/038dd2875b94)
It handles 128~256 bits (16~32 bytes) at a time, so it's much faster.
> However, Unsafe.getLong and String.compare intrinsic needs additional data copies, so
the actual performance increase is smaller than "1 byte : 32 bytes" comparison.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message