hive-issues mailing list archives

From "Hive QA (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-22221) Llap external client - Need to reduce LlapBaseInputFormat#getSplits() footprint
Date Sun, 22 Sep 2019 12:58:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-22221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935303#comment-16935303 ]

Hive QA commented on HIVE-22221:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12981003/HIVE-22221.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18681/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18681/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18681/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and
output '+ date '+%Y-%m-%d %T.%3N'
2019-09-22 12:56:37.348
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-18681/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-09-22 12:56:37.350
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 25f0fb4 HIVE-20113: Shuffle avoidance: Disable 1-1 edges for sorted shuffle
(Vineet Garg, Gopal V reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
Removing ${project.basedir}/
Removing itests/${project.basedir}/
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 25f0fb4 HIVE-20113: Shuffle avoidance: Disable 1-1 edges for sorted shuffle
(Vineet Garg, Gopal V reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-09-22 12:56:38.268
+ rm -rf ../yetus_PreCommit-HIVE-Build-18681
+ mkdir ../yetus_PreCommit-HIVE-Build-18681
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-18681
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-18681/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
Going to apply patch with: git apply -p0
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: executing: [/tmp/protoc2293711469475654133.exe, --version]
libprotoc 2.5.0
protoc-jar: executing: [/tmp/protoc2293711469475654133.exe, -I/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore,
--java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/target/generated-sources,
/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process
(process-resource-bundles) on project hive-pre-upgrade: Execution process-resource-bundles
of goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process failed. ConcurrentModificationException
-> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following
articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hive-pre-upgrade
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-18681
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12981003 - PreCommit-HIVE-Build

> Llap external client - Need to reduce LlapBaseInputFormat#getSplits() footprint  
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-22221
>                 URL: https://issues.apache.org/jira/browse/HIVE-22221
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, UDF
>            Reporter: Shubham Chaurasia
>            Assignee: Shubham Chaurasia
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-22221.1.patch, HIVE-22221.2.patch, HIVE-22221.3.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> While querying through the LLAP external client, LlapBaseInputFormat#getSplits() invokes the get_splits() UDTF (GenericUDTFGetSplits) under the hood.
> GenericUDTFGetSplits returns LlapInputSplit objects, in which planBytes[] occupies around 90% of the split size.
> Depending on data size/partitions and plan, an LlapInputSplit can grow up to 1 MB, with planBytes[] being common to all the splits and occupying more than 850 KB. This sometimes causes OOM on HS2, depending on the HS2 heap size.
> This can be resolved by separating the common parts from the actual splits and reassembling them on the client side.
> We can also provide an option by which the client can say that it does not want them reassembled and can take control of reassembly into its own hands.
> Splits can be broken up as follows:
> 1) schema split
> 2) plan split
> 3) actual split 1
> 4) actual split 2, and so on.
> This greatly reduces the memory used on the server side (in my case from 5 GB for ~5000 splits to around 15 MB), and hence the data transferred. It also eliminates the OOM on the HS2 side.
> cc [~jdere] [~sankarh] [~thejas]
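
The factoring described in the issue can be illustrated with a small, language-neutral sketch. It is not Hive's implementation: the class and function names (FullSplit, SchemaSplit, PlanSplit, ActualSplit, factor_splits, reassemble, payload_size) are hypothetical, and the byte sizes in the usage note are made up. The sketch only shows the idea of emitting the schema and plan once, followed by the per-split fragments, and letting the client rebuild full splits if it wants to.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class FullSplit:
    """Self-contained split: carries the shared schema and plan itself."""
    schema: bytes
    plan: bytes
    fragment: bytes


@dataclass(frozen=True)
class SchemaSplit:
    """Hypothetical wrapper for the schema, sent once."""
    schema: bytes


@dataclass(frozen=True)
class PlanSplit:
    """Hypothetical wrapper for the serialized plan, sent once."""
    plan: bytes


@dataclass(frozen=True)
class ActualSplit:
    """Hypothetical wrapper for the part that differs per split."""
    fragment: bytes


def factor_splits(schema: bytes, plan: bytes, fragments: Sequence[bytes]) -> list:
    """Server side: emit schema split, plan split, then one actual split each."""
    return [SchemaSplit(schema), PlanSplit(plan)] + [ActualSplit(f) for f in fragments]


def reassemble(parts: Sequence[object]) -> List[FullSplit]:
    """Client side: rebuild full splits from the factored sequence."""
    schema_part, plan_part, *actual = parts
    if not isinstance(schema_part, SchemaSplit) or not isinstance(plan_part, PlanSplit):
        raise ValueError("expected a schema split followed by a plan split")
    return [FullSplit(schema_part.schema, plan_part.plan, a.fragment) for a in actual]


def payload_size(parts: Sequence[object]) -> int:
    """Total number of bytes carried by all fields of all parts."""
    return sum(len(value) for part in parts for value in vars(part).values())
```

With N splits, a plan of P bytes, a schema of S bytes and fragments of F bytes each, the factored sequence carries S + P + N*F bytes, whereas N self-contained splits carry N*(S + P + F). A client that prefers to manage reassembly itself would simply consume the output of factor_splits() directly and skip reassemble().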



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
