
Apache Drill 1.10.0 Release Notes


Release date: March 15, 2017


Today, we're happy to announce the availability of Drill 1.10.0. You can download it from the Apache Drill download page.


New Features and Improvements


This release of Drill provides the following new features and improvements:

  • Support for the CREATE TEMPORARY TABLE AS (CTTAS) command.
  • A JDBC connection option that improves fault tolerance when connecting directly to a Drill node from a client.
  • The Web Console displays the Drill version and additional query profile statistics.
  • Drill implicitly interprets the INT96 timestamp data type in Parquet files.
  • Support for Kerberos authentication between the client and drillbit.
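As a rough illustration of the first two items, the sketch below builds a direct-connection JDBC URL that lists more than one drillbit, plus a CTTAS statement. It only assembles strings and does not contact a cluster. The host names, table name, and file path are hypothetical, and the `tries` parameter name is recalled from the Drill JDBC documentation rather than stated in these notes, so check it against the driver version you use.

```java
import java.util.Arrays;
import java.util.List;

public class DrillDirectConnectSketch {

    // Builds a direct-connection URL of the form
    // jdbc:drill:drillbit=<host:port>[,<host:port>...];tries=<n>
    // ASSUMPTION: "tries" is the connection option referred to above
    // (see DRILL-5098); verify the name in the Drill JDBC documentation.
    public static String directUrl(List<String> drillbits, int tries) {
        if (drillbits.isEmpty()) {
            throw new IllegalArgumentException("at least one drillbit is required");
        }
        return "jdbc:drill:drillbit=" + String.join(",", drillbits) + ";tries=" + tries;
    }

    // Assembles a CREATE TEMPORARY TABLE AS (CTTAS) statement.
    // The table name and query are supplied by the caller and are not validated here.
    public static String cttas(String tempTable, String selectQuery) {
        return "CREATE TEMPORARY TABLE " + tempTable + " AS " + selectQuery;
    }

    public static void main(String[] args) {
        // Hypothetical hosts; 31010 is used here as an example user port.
        String url = directUrl(
                Arrays.asList("node1.example.com:31010", "node2.example.com:31010"), 3);
        System.out.println(url);

        // Hypothetical table name and source file.
        System.out.println(cttas("tmp_orders", "SELECT * FROM dfs.`/data/orders.json`"));
    }
}
```

With the Drill JDBC driver on the classpath, the URL string would typically be passed to `java.sql.DriverManager.getConnection`, and the CTTAS string executed through a `java.sql.Statement`.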

The following sections list additional bug fixes and improvements:


Sub-task

  • [DRILL-4272] - When sort runs out of memory and query fails, resources are seemingly not freed
  • [DRILL-4301] - OOM : Unable to allocate sv2 for 1000 records, and not enough batchGroups to spill.
  • [DRILL-4730] - Update JDBC DatabaseMetaData implementation to use new Metadata APIs
  • [DRILL-5008] - Refactor, document and simplify ExternalSortBatch
  • [DRILL-5011] - External Sort Batch memory use depends on record width
  • [DRILL-5014] - ExternalSortBatch cache size, spill count differs from config setting
  • [DRILL-5017] - Config param drill.exec.sort.external.batch.size is not used
  • [DRILL-5019] - ExternalSortBatch spills all batches to disk even if even one spills
  • [DRILL-5020] - ExternalSortBatch has inconsistent notions of the memory limit
  • [DRILL-5022] - ExternalSortBatch sets two different limits for "copier" memory
  • [DRILL-5023] - ExternalSortBatch does not spill fully, throws off spill calculations
  • [DRILL-5025] - ExternalSortBatch provides weak control over spill file size
  • [DRILL-5026] - ExternalSortBatch uses two memory allocators; one will do
  • [DRILL-5027] - ExternalSortBatch is inefficient: rewrites data unnecessarily
  • [DRILL-5055] - External Sort does not delete spill file if error occurs during close
  • [DRILL-5062] - External sort refers to the deprecated HDFS fs.default.name param
  • [DRILL-5066] - External sort attempts to retry sv2 memory alloc, even if can never succeed
  • [DRILL-5210] - External Sort BatchGroup leaks memory if an OOM occurs during read
  • [DRILL-5262] - NPE in managed external sort while spilling to disk
  • [DRILL-5264] - Managed External Sort fails with OOM
  • [DRILL-5267] - Managed external sort spills too often with Parquet data
  • [DRILL-5285] - Provide detailed, accurate estimate of size consumed by a record batch
  • [DRILL-5294] - Managed External Sort throws an OOM during the merge and spill phase

Bug

  • [DRILL-1808] - Large compilation unit tests fails due to high memory allocation
  • [DRILL-2293] - CTAS does not clean up when it fails
  • [DRILL-3562] - Query fails when using flatten on JSON data where some documents have an empty array
  • [DRILL-4578] - "children" missing from results of full scan over JSON data
  • [DRILL-4764] - Parquet file with INT_16, etc. logical types not supported by simple SELECT
  • [DRILL-4812] - Wildcard queries fail on Windows
  • [DRILL-4850] - TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run
  • [DRILL-4872] - NPE from CTAS partitioned by a projected casted null
  • [DRILL-4919] - Fix select count(1) / count(*) on csv with header
  • [DRILL-4938] - Report UserException when constant expression reduction fails
  • [DRILL-4963] - Issues when overloading Drill native functions with dynamic UDFs
  • [DRILL-4982] - Hive Queries degrade when queries switch between different formats
  • [DRILL-4994] - Prepared statement stopped working between 1.8.0 client and < 1.8.0 server
  • [DRILL-4995] - Allow lazy init when dynamic UDF support is disabled
  • [DRILL-4996] - Parquet Date auto-correction is not working in auto-partitioned parquet files generated by drill-1.6
  • [DRILL-5005] - Potential issues with external sort info in query profile
  • [DRILL-5015] - As per documentation, when issuing a list of drillbits in the connection string, we always attempt to connect only to the first one
  • [DRILL-5032] - Drill query on hive parquet table failed with OutOfMemoryError: Java heap space
  • [DRILL-5034] - Select timestamp from hive generated parquet always return in UTC
  • [DRILL-5039] - NPE - CTAS PARTITION BY (<char-type-column>)
  • [DRILL-5040] - Interrupted CTAS should not succeed & should not create physical file on disk
  • [DRILL-5044] - Fix retry logic to handle VersionMismatchException by not deleting jars in remote UDFs area
  • [DRILL-5048] - Fix type mismatch error in case statement with null timestamp
  • [DRILL-5050] - C++ client library has symbol resolution issues when loaded by a process that already uses boost::asio
  • [DRILL-5051] - Fix incorrect result returned in nest query with offset specified
  • [DRILL-5070] - Code gen: create methods in fixed order to allow test verification
  • [DRILL-5081] - Excessive info level logging introduced in DRILL-4203
  • [DRILL-5086] - ClassCastException when filter pushdown is used with a bigint or float column and metadata caching.
  • [DRILL-5088] - Error when reading DBRef column
  • [DRILL-5091] - JDBC unit test fail on Java 8
  • [DRILL-5094] - Assure Comparator to be transitive
  • [DRILL-5097] - Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works
  • [DRILL-5104] - Foreman sets external sort memory allocation even for a physical plan
  • [DRILL-5112] - Unit tests derived from PopUnitTestBase fail in IDE due to config errors
  • [DRILL-5113] - Upgrade Maven RAT plugin to avoid annoying XML errors
  • [DRILL-5117] - Compile error when query a json file with 1000+columns
  • [DRILL-5119] - Update MapR version to 5.2.0.40963-mapr
  • [DRILL-5121] - A memory leak is observed when exact case is not specified for a column in a filter condition
  • [DRILL-5127] - Revert the fix for DRILL-4831
  • [DRILL-5157] - Multiple Snappy versions on class path; causes unit test failures
  • [DRILL-5159] - ProjectMergeRule in Drill should operate on RelNodes with same convention trait.
  • [DRILL-5164] - Equi-join query results in CompileException when inputs have large number of columns
  • [DRILL-5167] - C++ connector does not set escape string for metadata search pattern
  • [DRILL-5190] - Display planning and queued time for a query in its profile page
  • [DRILL-5196] - Could not run a single MongoDB unit test case through command line or IDE
  • [DRILL-5207] - Improve Parquet scan pipelining
  • [DRILL-5208] - Finding path to java executable should be deterministic
  • [DRILL-5218] - Support Disabling Heartbeats in C++ Client
  • [DRILL-5224] - CTTAS: fix errors connected with system path delimiters (Windows)
  • [DRILL-5230] - Translation of millisecond duration into hours is incorrect
  • [DRILL-5238] - CTTAS: unable to resolve temporary table if workspace is indicated without schema
  • [DRILL-5242] - The UI breaks when trying to render profiles having unknown metrics
  • [DRILL-5243] - Fix TestContextFunctions.sessionIdUDFWithinSameSession unit test
  • [DRILL-5252] - A condition returns always true
  • [DRILL-5263] - Prevent left NLJoin with non scalar subqueries
  • [DRILL-5266] - Parquet Reader produces "low density" record batches - bits vs. bytes
  • [DRILL-5273] - CompliantTextReader exhausts 4 GB memory when reading 5000 small files
  • [DRILL-5274] - Exception thrown in Drillbit shutdown in UDF cleanup code
  • [DRILL-5275] - Sort spill serialization is slow due to repeated buffer allocations
  • [DRILL-5284] - Roll-up of final fixes for managed sort
  • [DRILL-5287] - Provide option to skip updates of ephemeral state changes in Zookeeper
  • [DRILL-5293] - Poor performance of Hash Table due to same hash value as distribution below
  • [DRILL-5304] - Queries fail intermittently when there is skew in data distribution
  • [DRILL-5313] - C++ client build failure on linux
  • [DRILL-5326] - Unit tests failures related to the SERVER_METADTA

Improvement

  • [DRILL-4217] - Query parquet file treat INT_16 & INT_8 as INT32
  • [DRILL-4280] - Kerberos Authentication
  • [DRILL-4373] - Drill and Hive have incompatible timestamp representations in parquet
  • [DRILL-4604] - Generate warning on Web UI if drillbits version mismatch is detected
  • [DRILL-4864] - Add ANSI format for date/time functions
  • [DRILL-4956] - Temporary tables support
  • [DRILL-4980] - Upgrading of the approach of parquet date correctness status detection
  • [DRILL-4987] - Use ImpersonationUtil in RemoteFunctionRegistry
  • [DRILL-5043] - Function that returns a unique id per session/connection similar to MySQL's CONNECTION_ID()
  • [DRILL-5052] - Option to debug generated Java code using an IDE
  • [DRILL-5056] - UserException does not write full message to log
  • [DRILL-5065] - Optimize count(*) queries on MapR-DB JSON Tables
  • [DRILL-5080] - Create a memory-managed version of the External Sort operator
  • [DRILL-5085] - Add / update description for dynamic UDFs directories in drill-env.sh and drill-module.conf
  • [DRILL-5098] - Improving fault tolerance for connection between client and foreman node.
  • [DRILL-5108] - Reduce output from Maven git-commit-id-plugin
  • [DRILL-5116] - Enable generated code debugging in each Drill operator
  • [DRILL-5123] - Write query profile after sending final response to client to improve latency
  • [DRILL-5126] - Provide simplified, unified "cluster fixture" for tests
  • [DRILL-5172] - Display elapsed time for queries in the UI
  • [DRILL-5195] - Publish Operator and MajorFragment Stats in Profile page
  • [DRILL-5215] - CTTAS: disallow temp tables in view expansion logic
  • [DRILL-5221] - cancel message is delayed until queryid or data is received
  • [DRILL-5254] - Enhance default reduction factors in optimizer
  • [DRILL-5255] - Unit tests fail due to CTTAS temporary name space checks
  • [DRILL-5257] - Provide option to save query profiles sync, async or not at all
  • [DRILL-5258] - Allow "extended" mock tables access from SQL queries
  • [DRILL-5259] - Allow listing a user-defined number of profiles
  • [DRILL-5260] - Refinements to new "Cluster Fixture" test framework
  • [DRILL-5270] - Improve loading of profiles listing in the WebUI
  • [DRILL-5290] - Provide an option to build operator table once for built-in static functions and reuse it across queries.
  • [DRILL-5301] - Add server metadata API

New Feature

  • [DRILL-4935] - Allow drillbits to advertise a configurable host address to Zookeeper
  • [DRILL-4979] - Make dataport configurable

Task