hive-dev mailing list archives

From Apache Jenkins Server <>
Subject Hive-trunk-hadoop2 - Build # 300 - Still Failing
Date Mon, 22 Jul 2013 01:03:36 GMT
Changes for Build #267
[hashutosh] HIVE-4781 : Adding new data files for tests. Missed in original commit.

Changes for Build #268
[navis] HIVE-2517 : Support group by on struct type (Ashutosh Chauhan via Navis)

[hashutosh] HIVE-4406 : Missing / or /<dbname> in hs2 jdbc uri switches mode to embedded
mode(Anandha Ranganathan via Ashutosh Chauhan)

[hashutosh] HIVE-4430 : Semantic analysis fails in presence of certain literals in on clause
(Kevin Wilfong via Ashutosh Chauhan)

[hashutosh] HIVE-4757 : LazyTimestamp goes into irretrievable NULL mode once inited with NULL
once (Gopal V via Ashutosh Chauhan)

[hashutosh] HIVE-4785 : Implement isCaseSensitive for Hive JDBC driver (Robert Roland via
Ashutosh Chauhan)

Changes for Build #269
[navis] HIVE-4436 : hive.exec.parallel=true doesn't work on hadoop-2
 (Gopal V via Navis)

Changes for Build #270
[hashutosh] HIVE-4689 : For outerjoins, joinEmitInterval might make wrong result (Navis via
Ashutosh Chauhan)

[hashutosh] HIVE-3253 : ArrayIndexOutOfBounds exception for deeply nested structs (Thejas
Nair via Ashutosh Chauhan)

Changes for Build #271

Changes for Build #272

Changes for Build #273
[hashutosh] HIVE-4089 : javax.jdo : jdo2-api dependency not in Maven Central (Navis via Ashutosh
Chauhan)

[ecapriolo] HIVE-4804 parallel order by fails for small datasets (Navis via egc)

Submitted by:	Navis
Reviewed by:	Edward Capriolo

Changes for Build #274

Changes for Build #275
[hashutosh] HIVE-4811 : (Slightly) break up the SemanticAnalyzer monstrosity (Gunther Hagleitner
via Ashutosh Chauhan)

[hashutosh] HIVE-4814 : Adjust WebHCat e2e tests until HIVE4703 is addressed (Eugene Koifman
via Ashutosh Chauhan)

Changes for Build #276
[hashutosh] HIVE-4251 : Indices can't be built on tables whose schema info comes from SerDe
(Mark Wagner via Ashutosh Chauhan)

[hashutosh] HIVE-4805 : Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors
(Ivan Veselovsky via Ashutosh Chauhan)

Changes for Build #277
[hashutosh] HIVE-4733 : HiveLockObjectData is not compared properly (Navis via Ashutosh Chauhan)

[ecapriolo] HIVE-3475 INLINE UDTF does not convert types properly (Igor Kabiljo and Navis
Ryu via egc)

Submitted by:	Navis Ryu and Igor Kabiljo
Reviewed by:	Edward Capriolo

Changes for Build #278
[hashutosh] HIVE-4802 : Fix url check for missing / or /<db> after hostname in jdbc uri
(Thejas Nair via Ashutosh Chauhan)

Changes for Build #279
[hashutosh] HIVE-3810 : HiveHistory.log need to replace \r with space before writing Entry.value
to historyfile (Mark Grover via Ashutosh Chauhan)

Changes for Build #280
[hashutosh] HIVE-4819 : Comments in CommonJoinOperator for aliasTag is not valid (Navis via
Ashutosh Chauhan)

[hashutosh] HIVE-4813 : Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr
(Ivan Veselovsky via Ashutosh Chauhan)

[hashutosh] HIVE-4580 : Change DDLTask to report errors using canonical error messages rather
than http status codes (Eugene Koifman via Ashutosh Chauhan)

[hashutosh] HIVE-4796 : Increase coverage of package org.apache.hadoop.hive.common.metrics
(Ivan Veselovsky via Ashutosh Chauhan)

[navis] HIVE-4812 : Logical explain plan (Gunther Hagleitner V via Navis)

Changes for Build #281
[hashutosh] HIVE-4833 : Fix eclipse template classpath to include the correct jdo lib (Yin
Huai via Ashutosh Chauhan)

[hashutosh] HIVE-4830 : Test clientnegative/nested_complex_neg.q got broken due to 4580 (Vikram
Dixit via Ashutosh Chauhan)

[hashutosh] HIVE-4810 [jira] Refactor exec package
(Gunther Hagleitner via Ashutosh Chauhan)


The exec package contains both operators and classes used to execute the job. Moving the latter
into a sub package makes the package slightly more manageable and will make it easier to provide
a tez-based implementation.

Test Plan: Refactoring

Reviewers: ashutoshc

Reviewed By: ashutoshc

Differential Revision:

[hashutosh] HIVE-4829 : TestWebHCatE2e checkstyle violation causes all tests to fail (Eugene
Koifman via Ashutosh Chauhan)

Changes for Build #282
[hashutosh] HIVE-3691 : TestDynamicSerDe failed with IBM JDK (Bing Li & Renata Ghisloti
via Ashutosh Chauhan)

[hashutosh] HIVE-4807 : Hive metastore hangs (Sarvesh Sakalanaga via Ashutosh Chauhan)

Changes for Build #283

Changes for Build #284

Changes for Build #285
[hashutosh] HIVE-4840 : Fix eclipse template classpath to include the BoneCP lib (Yin Huai
via Ashutosh Chauhan)

Changes for Build #286
[navis] HIVE-4290 : Build profiles: Partial builds for quicker dev (Gunther Hagleitner via
Navis)

[navis] HIVE-4658 : Make KW_OUTER optional in outer joins (Edward Capriolo via Navis)

Changes for Build #287

Changes for Build #288

Changes for Build #289
[hashutosh] HIVE-4852 : -Dbuild.profile=core fails (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4854 : testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop
2 (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4853 : junit timeout needs to be updated (Gunther Hagleitner via Ashutosh
Chauhan)

[hashutosh] HIVE-4721 : Fix TestCliDriver.ptf_npath.q on 0.23 (Gunther Hagleitner via Ashutosh
Chauhan)

Changes for Build #290
[ecapriolo] HIVE-3603 Enable client-side caching for scans on HBase (Navis Ryu via EGC)

Submitted by:	Navis Ryu
Reviewed by:	Edward Capriolo

Changes for Build #291
[hashutosh] HIVE-4845 : Correctness issue with MapJoins using the null safe operator (Brock
Noland via Ashutosh Chauhan)

Changes for Build #292
[daijy] HIVE-4820 : should set default values for HIVE_HOME and HCAT_PREFIX
that work with default build tree structure (Eugene Koifman via Jianyong Dai)

Changes for Build #293
[brock] HIVE-4865 - HiveLockObjects: Unlocking retries/times out when query contains ":" (Gunther
Hagleitner via Brock Noland)

Changes for Build #294
[hashutosh] HIVE-2206 [jira] add a new optimizer for query correlation discovery and optimization
(Yin Huai via Ashutosh Chauhan)

update test results

This issue proposes a new logical optimizer called Correlation Optimizer, which is used to
merge correlated MapReduce jobs (MR jobs) into a single MR job. The idea is based on YSmart.
The paper and slides of YSmart are linked at the bottom.

Since Hive translates queries in a sentence-by-sentence fashion, it generates a MapReduce job
for every operation that may need to shuffle data (e.g. join and aggregation operations).
However, such operations may involve the correlations explained below and can then be executed
in a single MR job.

	Input Correlation: Multiple MR jobs have input correlation (IC) if their input relation sets
are not disjoint;
	Transit Correlation: Multiple MR jobs have transit correlation (TC) if they have not only
input correlation, but also the same partition key;
	Job Flow Correlation: An MR has job flow correlation (JFC) with one of its child nodes if
it has the same partition key as that child node.
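
As a rough illustration of the three definitions above, here is a minimal Python sketch. The
MRJob class, its fields, the function names, and the table and column names are all
hypothetical and are not part of Hive; a job is modeled only by its input relations, its
partition key, and its child jobs.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class MRJob:
    """Hypothetical model of one MapReduce job (not a Hive class)."""
    name: str
    inputs: FrozenSet[str]           # names of the input relations
    partition_key: Tuple[str, ...]   # columns the shuffle partitions on
    children: List["MRJob"] = field(default_factory=list)


def has_input_correlation(a: MRJob, b: MRJob) -> bool:
    # IC: the input relation sets are not disjoint.
    return not a.inputs.isdisjoint(b.inputs)


def has_transit_correlation(a: MRJob, b: MRJob) -> bool:
    # TC: input correlation plus the same partition key.
    return has_input_correlation(a, b) and a.partition_key == b.partition_key


def has_job_flow_correlation(job: MRJob, child: MRJob) -> bool:
    # JFC: the job has the same partition key as one of its child nodes.
    is_child = any(c is child for c in job.children)
    return is_child and job.partition_key == child.partition_key


# Made-up example: two aggregations over overlapping inputs feed one join.
agg1 = MRJob("agg1", frozenset({"lineitem"}), ("l_partkey",))
agg2 = MRJob("agg2", frozenset({"lineitem", "part"}), ("l_partkey",))
join = MRJob("join", frozenset({"agg1_out", "agg2_out"}), ("l_partkey",), [agg1, agg2])
```

In this made-up plan, agg1 and agg2 share the input lineitem and the partition key, so they
have both IC and TC, and join has JFC with each of them.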

The current implementation of the correlation optimizer only detects correlations among MR jobs
for reduce-side join operators and reduce-side aggregation operators (not map-only aggregation).
A query will be optimized if it satisfies the following conditions.

	There exists an MR job for a reduce-side join operator or reduce-side aggregation operator that
has JFC with all of its parent MR jobs (TCs will also be exploited if JFC exists);
	All input tables of those correlated MR jobs are original input tables (not intermediate tables
generated by sub-queries); and
	No self join is involved in those correlated MR jobs.
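
The three conditions can be read as a single eligibility check. The sketch below is only an
illustration under simplifying assumptions, not the optimizer's actual code: the Job class and
is_candidate function are hypothetical, JFC is reduced to "same partition key as every parent
job", and a self join is approximated as one job reading the same table twice.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Job:
    """Hypothetical stand-in for one MR job in a plan (not a Hive class)."""
    operator: str                    # "join" or "groupby", both reduce-side
    partition_key: Tuple[str, ...]
    scanned_tables: List[str]        # tables this job reads directly
    parents: List["Job"] = field(default_factory=list)


def is_candidate(job: Job, original_tables: Set[str]) -> bool:
    # Condition 1: a reduce-side join or aggregation that has JFC with
    # all of its parent MR jobs (assumed here: at least one parent exists).
    if job.operator not in ("join", "groupby") or not job.parents:
        return False
    if any(p.partition_key != job.partition_key for p in job.parents):
        return False
    for j in [job] + job.parents:
        # Condition 2: only original input tables, no intermediate tables.
        if any(t not in original_tables for t in j.scanned_tables):
            return False
        # Condition 3: no self join (assumed: same table read twice by one job).
        if len(set(j.scanned_tables)) != len(j.scanned_tables):
            return False
    return True
```

For example, a join whose two parent aggregations each scan a distinct original table and use
the same partition key passes the check, while the same plan fails once one parent scans a
table that is not in original_tables.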

The correlation optimizer is implemented as a logical optimizer. The main reasons are that it
only needs to manipulate the query plan tree and that it can leverage the existing component
for generating MR jobs.

The current implementation can serve as a framework for correlation-related optimizations. I
think that it is better than adding individual optimizers.

Several pieces of work could improve this optimizer in the future. Here are three:

	Support queries that only involve TC;
	Support queries in which the input tables of correlated MR jobs involve intermediate tables;
	Optimize queries involving self join.

Paper and presentation of YSmart.

Test Plan: EMPTY

Reviewers: JIRA, ashutoshc

Reviewed By: ashutoshc

CC: brock

Differential Revision:

[ecapriolo] HIVE-4873 Sort candidate functions in case of UDFArgumentException (Xuefu Zhang
via egc)

Submitted by:	Xuefu Zhang
Reviewed by:	Edward Capriolo

Changes for Build #295
[ecapriolo] HIVE-4675 Create new parallel unit test environment (Brock Noland via egc)

Submitted by: Brock Noland	
Reviewed by: Edward Capriolo

Changes for Build #296

Changes for Build #297
[gates] Enable parallel execution of various E2E tests (deepeshk via gates)

[hashutosh] HIVE-4730 : Join on more than 2^31 records on single reducer failed (wrong results)
(Navis via Ashutosh Chauhan)

[brock] HIVE-4818: SequenceId in operator is not thread safe (Edward Capriolo via Brock Noland)

[brock] HIVE-4874 Identical methods PTFDeserializer.addOIPropertiestoSerDePropsMap(), PTFTranslator.addOIPropertiestoSerDePropsMap()
(Edward Capriolo via Brock Noland)

Changes for Build #298
[omalley] HIVE-4724 Better detection of non-ORC files in the ORC reader (omalley)

Changes for Build #299
[hashutosh] HIVE-4877 : In ExecReducer, remove tag from the row which will be passed to the
first Operator at the Reduce-side (Yin Huai via Ashutosh Chauhan)

Changes for Build #300
[ecapriolo] Serde-reported partition cols should not be persisted in metastore (Travis Crawford
via egc)

Submitted by:	Travis Crawford
Reviewed by:	Edward Capriolo

[brock] HIVE-4858: Sort 'show grant' result to improve usability and testability (Xuefu Zhang
via Brock Noland)

No tests ran.

The Apache Jenkins build system has built Hive-trunk-hadoop2 (build #300)

Status: Still Failing

Check console output at to view the