hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11325) Infinite loop in HiveHFileOutputFormat
Date Tue, 15 Nov 2016 09:29:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666644#comment-15666644
] 

Hive QA commented on HIVE-11325:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746327/HIVE-11325.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2125/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2125/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2125/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and
output '+ date '+%Y-%m-%d %T.%3N'
2016-11-15 09:27:42.966
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2125/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-11-15 09:27:42.969
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 4427eab HIVE-13931: Add support for HikariCP and replace BoneCP usage with
HikariCP (Prasanth Jayachandran reviewed by Sushanth Sowmyan)
+ git clean -f -d
Removing data/files/parquet_non_dictionary_types.txt
Removing ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java
Removing ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/
Removing ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestVectorizedColumnReader.java
Removing ql/src/test/queries/clientpositive/parquet_types_non_dictionary_encoding_vectorization.q
Removing ql/src/test/queries/clientpositive/parquet_types_vectorization.q
Removing ql/src/test/results/clientpositive/parquet_types_non_dictionary_encoding_vectorization.q.out
Removing ql/src/test/results/clientpositive/parquet_types_vectorization.q.out
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 4427eab HIVE-13931: Add support for HikariCP and replace BoneCP usage with
HikariCP (Prasanth Jayachandran reviewed by Sushanth Sowmyan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-11-15 09:27:43.905
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p0
patching file hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java
patching file hbase-handler/src/test/queries/negative/generatehfiles_bad_family_path.q
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven
ANTLR Parser Generator  Version 3.5.2
Output file /data/hiveptest/working/apache-github-source-source/metastore/target/generated-sources/antlr3/org/apache/hadoop/hive/metastore/parser/FilterParser.java
does not exist: must build /data/hiveptest/working/apache-github-source-source/metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hive/metastore/parser/Filter.g
DataNucleus Enhancer (version 4.1.6) for API "JDO"
DataNucleus Enhancer : Classpath
>>  /usr/share/maven/boot/plexus-classworlds-2.x.jar
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDatabase
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MFieldSchema
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MType
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTable
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MConstraint
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MSerDeInfo
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MOrder
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MColumnDescriptor
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MStringList
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MStorageDescriptor
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MPartition
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MIndex
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MRole
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MRoleMap
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MGlobalPrivilege
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDBPrivilege
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTablePrivilege
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MPartitionPrivilege
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTableColumnPrivilege
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MPartitionColumnPrivilege
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MPartitionEvent
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MMasterKey
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDelegationToken
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTableColumnStatistics
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MVersionTable
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MResourceUri
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MFunction
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MNotificationLog
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MNotificationNextId
DataNucleus Enhancer completed with success for 30 classes. Timings : input=196 ms, enhance=210
ms, total=406 ms. Consult the log for full details
ANTLR Parser Generator  Version 3.5.2
Output file /data/hiveptest/working/apache-github-source-source/ql/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveLexer.java
does not exist: must build /data/hiveptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
org/apache/hadoop/hive/ql/parse/HiveLexer.g
Output file /data/hiveptest/working/apache-github-source-source/ql/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParser.java
does not exist: must build /data/hiveptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
org/apache/hadoop/hive/ql/parse/HiveParser.g
Generating vector expression code
Generating vector expression test code
[ERROR] COMPILATION ERROR : 
[ERROR] /data/hiveptest/working/apache-github-source-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java:[164,24]
cannot find symbol
  symbol:   method isDir()
  location: variable srcDir of type org.apache.hadoop.fs.Path
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile
(default-compile) on project hive-hbase-handler: Compilation failure
[ERROR] /data/hiveptest/working/apache-github-source-source/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java:[164,24]
cannot find symbol
[ERROR] symbol:   method isDir()
[ERROR] location: variable srcDir of type org.apache.hadoop.fs.Path
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following
articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hive-hbase-handler
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746327 - PreCommit-HIVE-Build

> Infinite loop in HiveHFileOutputFormat
> --------------------------------------
>
>                 Key: HIVE-11325
>                 URL: https://issues.apache.org/jira/browse/HIVE-11325
>             Project: Hive
>          Issue Type: Bug
>          Components: HBase Handler
>    Affects Versions: 1.0.0
>            Reporter: Harsh J
>         Attachments: HIVE-11325.patch
>
>
> No idea why {{hbase_handler_bulk.q}} does not catch this if its being run regularly in
Hive builds, but here's the gist of the issue:
> The condition at https://github.com/apache/hive/blob/master/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java#L152-L164
indicates that we will infinitely loop until we find a file whose last path component (the
name) is equal to the column family name.
> In execution, however, the iteration enters an actual infinite loop cause the file we
end up considering as the srcDir name, is actually the region file, whose name will never
match the family name.
> This is an example of the IPC the listing loop of a 100% progress task gets stuck in:
> {code}
> 2015-07-21 10:32:20,662 TRACE [main] org.apache.hadoop.ipc.ProtobufRpcEngine: 1: Call
-> cdh54.vm/172.16.29.132:8020: getListing {src: "/user/hive/warehouse/hbase_test/_temporary/1/_temporary/attempt_1436935612068_0011_m_000000_0/family/97112ac1c09548ae87bd85af072d2e8c"
startAfter: "" needLocation: false}
> 2015-07-21 10:32:20,662 DEBUG [IPC Parameter Sending Thread #1] org.apache.hadoop.ipc.Client:
IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020 from hive sending #510346
> 2015-07-21 10:32:20,662 DEBUG [IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020
from hive] org.apache.hadoop.ipc.Client: IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020
from hive got value #510346
> 2015-07-21 10:32:20,662 DEBUG [main] org.apache.hadoop.ipc.ProtobufRpcEngine: Call: getListing
took 0ms
> 2015-07-21 10:32:20,662 TRACE [main] org.apache.hadoop.ipc.ProtobufRpcEngine: 1: Response
<- cdh54.vm/172.16.29.132:8020: getListing {dirList { partialListing { fileType: IS_FILE
path: "" length: 863 permission { perm: 4600 } owner: "hive" group: "hive" modification_time:
1437454718130 access_time: 1437454717973 block_replication: 1 blocksize: 134217728 fileId:
33960 childrenNum: 0 storagePolicy: 0 } remainingEntries: 0 }}
> {code}
> The path we are getting out of the listing results is {{/user/hive/warehouse/hbase_test/_temporary/1/_temporary/attempt_1436935612068_0011_m_000000_0/family/97112ac1c09548ae87bd85af072d2e8c}},
but instead of checking the path's parent {{family}} we're instead looping infinitely over
its hashed filename {{97112ac1c09548ae87bd85af072d2e8c}} cause it does not match {{family}}.
> It stays in the infinite loop therefore, until the MR framework kills it away due to
an idle task timeout (and then since the subsequent task attempts fail outright, the job fails).
> While doing a {{getPath().getParent()}} will resolve that, is that infinite loop even
necessary? Especially given the fact that we throw exceptions if there are no entries or there
is more than one entry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message