drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5978) Upgrade Hive libraries to 2.1.1 version.
Date Tue, 06 Feb 2018 08:17:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353548#comment-16353548 ]

ASF GitHub Bot commented on DRILL-5978:
---------------------------------------

Github user sohami commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1111#discussion_r166213941
  
    --- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveUtilities.java ---
    @@ -507,5 +509,51 @@ public static boolean hasHeaderOrFooter(HiveTableWithColumnCache table) {
         int skipFooter = retrieveIntProperty(tableProperties, serdeConstants.FOOTER_COUNT, -1);
         return skipHeader > 0 || skipFooter > 0;
       }
    +
    +  /**
    +   * This method checks whether the schema evolution properties are set in job conf for the input format. If they
    +   * aren't set, method sets the column names and types from table/partition properties or storage descriptor.
    +   * @param job the job to update
    +   * @param properties table or partition properties
    +   * @param isAcidTable true if the table is transactional, false otherwise
    +   * @param sd storage descriptor
    +   */
    +  public static void setColumnTypes(JobConf job, Properties properties, boolean isAcidTable, StorageDescriptor sd) {
    +
    +    // No work is needed, if schema evolution is used
    +    if (Utilities.isSchemaEvolutionEnabled(job, isAcidTable) && job.get(IOConstants.SCHEMA_EVOLUTION_COLUMNS) != null &&
    +        job.get(IOConstants.SCHEMA_EVOLUTION_COLUMNS_TYPES) != null) {
    +      return;
    +    }
    +
    +    String colNames;
    +    String colTypes;
    +
    +    // Try to get get column names and types from table or partition properties. If they are absent there, get columns
    +    // data from storage descriptor of the table
    +    if (properties.containsKey(serdeConstants.LIST_COLUMNS) && properties.containsKey(serdeConstants.LIST_COLUMN_TYPES)) {
    +      colNames = job.get(serdeConstants.LIST_COLUMNS);
    +      colTypes = job.get(serdeConstants.LIST_COLUMN_TYPES);
    +    } else {
    +      StringBuilder colNamesBuilder = new StringBuilder();
    +      StringBuilder colTypesBuilder = new StringBuilder();
    +      boolean isFirst = true;
    +      for(FieldSchema col: sd.getCols()) {
    +        if (isFirst) {
    +          isFirst = false;
    +        } else {
    +          colNamesBuilder.append(',');
    +          colTypesBuilder.append(',');
    +        }
    +        colNamesBuilder.append(col.getName());
    +        colTypesBuilder.append(col.getType());
    +      }
    +      colNames = colNamesBuilder.toString();
    +      colTypes = colTypesBuilder.toString();
    --- End diff --
    
    How about changing the loop as below:
    
    ```
    final StringBuilder colNamesBuilder = new StringBuilder();
    final StringBuilder colTypesBuilder = new StringBuilder();
    
    for (FieldSchema col : sd.getCols()) {
      colNamesBuilder.append(col.getName());
      colTypesBuilder.append(col.getType());
      colNamesBuilder.append(',');
      colTypesBuilder.append(',');
    }
    colNames = colNamesBuilder.substring(0, colNamesBuilder.length() - 1);
    colTypes = colTypesBuilder.substring(0, colTypesBuilder.length() - 1);
    ```
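
One caveat with the trailing-comma version: if `sd.getCols()` returns an empty list, both builders stay empty and `substring(0, -1)` throws `StringIndexOutOfBoundsException`. A possible alternative, sketched below under the assumption that Java 8 streams are acceptable in this module, is to let `Collectors.joining` place the separators. The `Col` class and the `joinNames`/`joinTypes` helpers are hypothetical stand-ins so the sketch compiles on its own; in `HiveUtilities` the element type would be Hive's `FieldSchema`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class ColumnJoinSketch {

  /** Hypothetical stand-in for Hive's FieldSchema (only the two getters used here). */
  public static final class Col {
    private final String name;
    private final String type;

    public Col(String name, String type) {
      this.name = name;
      this.type = type;
    }

    public String getName() { return name; }
    public String getType() { return type; }
  }

  /** Comma-separated column names; an empty list yields an empty string. */
  public static String joinNames(List<Col> cols) {
    return cols.stream().map(Col::getName).collect(Collectors.joining(","));
  }

  /** Comma-separated column types; an empty list yields an empty string. */
  public static String joinTypes(List<Col> cols) {
    return cols.stream().map(Col::getType).collect(Collectors.joining(","));
  }

  public static void main(String[] args) {
    List<Col> cols = Arrays.asList(new Col("id", "int"), new Col("name", "string"));
    System.out.println(joinNames(cols));  // id,name
    System.out.println(joinTypes(cols));  // int,string
    // No exception for a table without columns, just an empty string.
    System.out.println(joinNames(Collections.<Col>emptyList()).isEmpty());  // true
  }
}
```

This keeps the loop body free of first-element bookkeeping and avoids trimming a trailing separator afterwards. Whether an empty column list can actually reach this code path is not something the diff shows, so the guard may be purely defensive.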



> Upgrade Hive libraries to 2.1.1 version.
> ----------------------------------------
>
>                 Key: DRILL-5978
>                 URL: https://issues.apache.org/jira/browse/DRILL-5978
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive
>    Affects Versions: 1.11.0
>            Reporter: Vitalii Diravka
>            Assignee: Vitalii Diravka
>            Priority: Major
>              Labels: doc-impacting
>             Fix For: 1.13.0
>
>
> Currently Drill uses [Hive version 1.2.1 libraries|https://github.com/apache/drill/blob/master/pom.xml#L53] to run queries on Hive. This library version works with both Hive 1.x and Hive 2.x, but some Hive 2.x features are broken (for example, ORC transactional tables). To fix that, the drill-hive library version should be updated to 2.1 or newer.
>
> Tasks to be done:
> - resolve dependency conflicts;
> - investigate backward compatibility of the newer drill-hive library with older Hive versions (1.x);
> - update the drill-hive version for the [MapR|https://github.com/apache/drill/blob/master/pom.xml#L1777] profile too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
