hive-issues mailing list archives

From "Hive QA (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-22731) Probe MapJoin hashtables for row level filtering
Date Thu, 23 Jan 2020 03:52:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-22731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021742#comment-17021742 ]

Hive QA commented on HIVE-22731:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12991544/HIVE-22731.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/20287/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20287/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20287/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and
output '+ date '+%Y-%m-%d %T.%3N'
2020-01-23 03:50:44.414
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-20287/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2020-01-23 03:50:44.416
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 05cabc8 HIVE-22666: Introduce TopNKey operator for PTF Reduce Sink (Krisztian
Kasa, reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 05cabc8 HIVE-22666: Introduce TopNKey operator for PTF Reduce Sink (Krisztian
Kasa, reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2020-01-23 03:50:45.092
+ rm -rf ../yetus_PreCommit-HIVE-Build-20287
+ mkdir ../yetus_PreCommit-HIVE-Build-20287
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-20287
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-20287/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
Trying to apply the patch with -p0
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: does not exist in index
error: a/itests/src/test/resources/testconfiguration.properties: does not exist in index
error: a/llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java:
does not exist in index
error: a/llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/ColumnVectorProducer.java:
does not exist in index
error: a/llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java:
does not exist in index
error: a/llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcEncodedDataConsumer.java:
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java:
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastBytesHashTable.java:
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMap.java:
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMultiSet.java:
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashSet.java:
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java:
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/hashtable/VectorMapJoinHashTable.java:
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedHashTable.java:
does not exist in index
Trying to apply the patch with -p1
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java:19
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java'
with conflicts.
Going to apply patch with: git apply -p1
/data/hiveptest/working/scratch/build.patch:825: trailing whitespace.
        Map 1 
/data/hiveptest/working/scratch/build.patch:905: trailing whitespace.
        Map 2 
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java:19
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java'
with conflicts.
U ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java
warning: 2 lines add whitespace errors.
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-20287
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12991544 - PreCommit-HIVE-Build

> Probe MapJoin hashtables for row level filtering
> ------------------------------------------------
>
>                 Key: HIVE-22731
>                 URL: https://issues.apache.org/jira/browse/HIVE-22731
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive, llap
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-22731.1.patch, HIVE-22731.2.patch, HIVE-22731.WIP.patch, decode_time_bars.pdf
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, RecordReaders such as ORC support filtering only at coarse-grained levels, namely:
File, Stripe (64 to 256 MB), and Row group (10k rows). They skip a set of rows only
if they can guarantee that none of the rows in it can pass the filter (usually given as a searchable
argument).
> However, a significant amount of time can be spent decoding rows with multiple columns
that are not even used in the final result. See the figure, where "original" is what happens today
and in "LazyDecode" we skip decoding rows that do not match the key.
> To enable more fine-grained filtering in the particular case of a MapJoin, we could
use the key HashTable created from the smaller table to skip deserializing the columns of rows
in the larger table that do not match any key, and thus save CPU time.
> This Jira investigates this direction.
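
The idea in the description above can be illustrated with a minimal Java sketch. It is not taken from the attached patch and does not use Hive's classes: ProbeFilterSketch, EncodedRow, decodePayload, buildKeySet and probeAndDecode are all hypothetical names, and a plain java.util.HashSet stands in for the MapJoin hash table built from the smaller table. The sketch assumes the join key of a large-table row can be read cheaply, while the remaining columns are expensive to decode.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ProbeFilterSketch {

    /** Hypothetical encoded row: the key is cheap to read, the payload is costly to decode. */
    static final class EncodedRow {
        final long key;
        final String encodedPayload;

        EncodedRow(long key, String encodedPayload) {
            this.key = key;
            this.encodedPayload = encodedPayload;
        }
    }

    /** Counts how many payloads were actually decoded. */
    static int payloadDecodes = 0;

    /** Stand-in for the expensive decoding of the non-key columns. */
    static String decodePayload(EncodedRow row) {
        payloadDecodes++;
        return row.encodedPayload.toUpperCase();
    }

    /** Builds the probe structure from the join keys of the smaller table. */
    static Set<Long> buildKeySet(long[] smallTableKeys) {
        Set<Long> keys = new HashSet<>();
        for (long k : smallTableKeys) {
            keys.add(k);
        }
        return keys;
    }

    /** Probes the key set for every large-table row and decodes only the matching rows. */
    static List<String> probeAndDecode(List<EncodedRow> bigTable, Set<Long> keys) {
        List<String> out = new ArrayList<>();
        for (EncodedRow row : bigTable) {
            if (!keys.contains(row.key)) {
                continue; // no join partner: skip decoding the remaining columns
            }
            out.add(row.key + ":" + decodePayload(row));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<Long> keys = buildKeySet(new long[] {2L, 4L});
        List<EncodedRow> big = new ArrayList<>();
        for (long k = 1; k <= 5; k++) {
            big.add(new EncodedRow(k, "row" + k));
        }
        List<String> joined = probeAndDecode(big, keys);
        System.out.println(joined + " decodes=" + payloadDecodes);
    }
}
```

In this toy run, five large-table rows are probed but only the two whose keys appear in the smaller table have their payload decoded. That is the CPU saving the description targets, at the cost of one hash lookup per row.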



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
