hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization
Date Sat, 15 Mar 2014 10:53:42 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936136#comment-13936136
] 

Hive QA commented on HIVE-6455:
-------------------------------



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634884/HIVE-6455.17.patch

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1828/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1828/console

Messages:
{noformat}
**** This message was trimmed, see log for full details ****
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/hwi/src/test/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hwi ---
[INFO] Executing tasks

main:
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/tmp
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/warehouse
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/tmp/conf
     [copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-hwi ---
[INFO] Compiling 2 source files to /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-hwi ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-hwi ---
[INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/hive-hwi-0.14.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-hwi ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-hwi ---
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/hive-hwi-0.14.0-SNAPSHOT.jar
to /data/hive-ptest/working/maven/org/apache/hive/hive-hwi/0.14.0-SNAPSHOT/hive-hwi-0.14.0-SNAPSHOT.jar
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/hwi/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-hwi/0.14.0-SNAPSHOT/hive-hwi-0.14.0-SNAPSHOT.pom
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building Hive ODBC 0.14.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-odbc ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/odbc (includes = [datanucleus.log,
derby.log], excludes = [])
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-odbc ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-odbc ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-odbc ---
[INFO] Executing tasks

main:
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/odbc/target/tmp
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/odbc/target/warehouse
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/odbc/target/tmp/conf
     [copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/odbc/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-odbc ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-odbc ---
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/odbc/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-odbc/0.14.0-SNAPSHOT/hive-odbc-0.14.0-SNAPSHOT.pom
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building Hive Shims Aggregator 0.14.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-shims-aggregator ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/shims (includes = [datanucleus.log,
derby.log], excludes = [])
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-shims-aggregator ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-shims-aggregator ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-shims-aggregator ---
[INFO] Executing tasks

main:
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/shims/target/tmp
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/shims/target/warehouse
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/shims/target/tmp/conf
     [copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/shims/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-shims-aggregator
---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-shims-aggregator ---
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/shims/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-shims-aggregator/0.14.0-SNAPSHOT/hive-shims-aggregator-0.14.0-SNAPSHOT.pom
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building Hive TestUtils 0.14.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-testutils ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/testutils (includes = [datanucleus.log,
derby.log], excludes = [])
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-testutils ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-testutils ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/testutils/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-testutils ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-testutils ---
[INFO] Compiling 2 source files to /data/hive-ptest/working/apache-svn-trunk-source/testutils/target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hive-testutils
---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/testutils/src/test/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-testutils ---
[INFO] Executing tasks

main:
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/testutils/target/tmp
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/testutils/target/warehouse
    [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/testutils/target/tmp/conf
     [copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/testutils/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-testutils ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-testutils ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-testutils ---
[INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/testutils/target/hive-testutils-0.14.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-testutils ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-testutils ---
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/testutils/target/hive-testutils-0.14.0-SNAPSHOT.jar
to /data/hive-ptest/working/maven/org/apache/hive/hive-testutils/0.14.0-SNAPSHOT/hive-testutils-0.14.0-SNAPSHOT.jar
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/testutils/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-testutils/0.14.0-SNAPSHOT/hive-testutils-0.14.0-SNAPSHOT.pom
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building Hive Packaging 0.14.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
Downloading: http://repository.apache.org/snapshots/org/apache/hive/hcatalog/hive-hcatalog-hbase-storage-handler/0.14.0-SNAPSHOT/maven-metadata.xml
Downloading: http://repository.apache.org/snapshots/org/apache/hive/hcatalog/hive-hcatalog-hbase-storage-handler/0.14.0-SNAPSHOT/hive-hcatalog-hbase-storage-handler-0.14.0-SNAPSHOT.pom
[WARNING] The POM for org.apache.hive.hcatalog:hive-hcatalog-hbase-storage-handler:jar:0.14.0-SNAPSHOT
is missing, no dependency information available
Downloading: http://repository.apache.org/snapshots/org/apache/hive/hcatalog/hive-hcatalog-hbase-storage-handler/0.14.0-SNAPSHOT/hive-hcatalog-hbase-storage-handler-0.14.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Hive .............................................. SUCCESS [8.636s]
[INFO] Hive Ant Utilities ................................ SUCCESS [5.854s]
[INFO] Hive Shims Common ................................. SUCCESS [3.686s]
[INFO] Hive Shims 0.20 ................................... SUCCESS [2.962s]
[INFO] Hive Shims Secure Common .......................... SUCCESS [4.126s]
[INFO] Hive Shims 0.20S .................................. SUCCESS [2.592s]
[INFO] Hive Shims 0.23 ................................... SUCCESS [7.946s]
[INFO] Hive Shims ........................................ SUCCESS [1.282s]
[INFO] Hive Common ....................................... SUCCESS [7.701s]
[INFO] Hive Serde ........................................ SUCCESS [11.063s]
[INFO] Hive Metastore .................................... SUCCESS [33.965s]
[INFO] Hive Query Language ............................... SUCCESS [1:12.280s]
[INFO] Hive Service ...................................... SUCCESS [7.842s]
[INFO] Hive JDBC ......................................... SUCCESS [2.920s]
[INFO] Hive Beeline ...................................... SUCCESS [2.915s]
[INFO] Hive CLI .......................................... SUCCESS [1.766s]
[INFO] Hive Contrib ...................................... SUCCESS [2.467s]
[INFO] Hive HBase Handler ................................ SUCCESS [2.575s]
[INFO] Hive HCatalog ..................................... SUCCESS [0.554s]
[INFO] Hive HCatalog Core ................................ SUCCESS [1.992s]
[INFO] Hive HCatalog Pig Adapter ......................... SUCCESS [2.549s]
[INFO] Hive HCatalog Server Extensions ................... SUCCESS [1.493s]
[INFO] Hive HCatalog Webhcat Java Client ................. SUCCESS [1.135s]
[INFO] Hive HCatalog Webhcat ............................. SUCCESS [9.615s]
[INFO] Hive HWI .......................................... SUCCESS [1.289s]
[INFO] Hive ODBC ......................................... SUCCESS [0.890s]
[INFO] Hive Shims Aggregator ............................. SUCCESS [0.129s]
[INFO] Hive TestUtils .................................... SUCCESS [0.616s]
[INFO] Hive Packaging .................................... FAILURE [1.646s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3:29.195s
[INFO] Finished at: Sat Mar 15 06:52:37 EDT 2014
[INFO] Final Memory: 74M/458M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project hive-packaging: Could not resolve dependencies for
project org.apache.hive:hive-packaging:pom:0.14.0-SNAPSHOT: Could not find artifact org.apache.hive.hcatalog:hive-hcatalog-hbase-storage-handler:jar:0.14.0-SNAPSHOT
in apache.snapshots (http://repository.apache.org/snapshots) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following
articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hive-packaging
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634884

> Scalable dynamic partitioning and bucketing optimization
> --------------------------------------------------------
>
>                 Key: HIVE-6455
>                 URL: https://issues.apache.org/jira/browse/HIVE-6455
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: optimization
>         Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.10.patch, HIVE-6455.10.patch,
HIVE-6455.11.patch, HIVE-6455.12.patch, HIVE-6455.13.patch, HIVE-6455.13.patch, HIVE-6455.14.patch,
HIVE-6455.15.patch, HIVE-6455.16.patch, HIVE-6455.17.patch, HIVE-6455.2.patch, HIVE-6455.3.patch,
HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, HIVE-6455.6.patch, HIVE-6455.7.patch,
HIVE-6455.8.patch, HIVE-6455.9.patch, HIVE-6455.9.patch
>
>
> The current implementation of dynamic partition works by keeping at least one record
writer open per dynamic partition directory. In case of bucketing there can be multispray
file writers which further adds up to the number of open record writers. The record writers
of column oriented file format (like ORC, RCFile etc.) keeps some sort of in-memory buffers
(value buffer or compression buffers) open all the time to buffer up the rows and compress
them before flushing it to disk. Since these buffers are maintained per column basis the amount
of constant memory that will required at runtime increases as the number of partitions and
number of columns per partition increases. This often leads to OutOfMemory (OOM) exception
in mappers or reducers depending on the number of open record writers. Users often tune the
JVM heapsize (runtime memory) to get over such OOM issues. 
> With this optimization, the dynamic partition columns and bucketing columns (in case
of bucketed tables) are sorted before being fed to the reducers. Since the partitioning and
bucketing columns are sorted, each reducers can keep only one record writer open at any time
thereby reducing the memory pressure on the reducers. This optimization is highly scalable
as the number of partition and number of columns per partition increases at the cost of sorting
the columns.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message