impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3227: generate test TPC data sets during data load
Date Wed, 27 Jul 2016 15:52:50 GMT
Hello Jim Apple, Internal Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3761

to look at the new patch set (#7).

Change subject: IMPALA-3227: generate test TPC data sets during data load
......................................................................

IMPALA-3227: generate test TPC data sets during data load

The generated data is identical to the pregenerated tpch.tar.gz
and tpcds.tar.gz data that was used previously and were not
publically accessible.

This adds a "preload" hook to bin/load-data.py that can execute custom
logic for each data set. This is used to call the TPC-H and TPC-DS data
generation utilities that are already available in the Impala toolchain.

Testing:
Ran private test job with loading from snapshot disabled and without
the tpch/tpcds tarballs available.

Change-Id: Ieccfbd7d8d4a91bffddbe35abb7f5572e71a71cf
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M bin/load-data.py
A testdata/datasets/tpcds/preload
A testdata/datasets/tpch/preload
5 files changed, 101 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/3761/7
-- 
To view, visit http://gerrit.cloudera.org:8080/3761
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ieccfbd7d8d4a91bffddbe35abb7f5572e71a71cf
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <jbapple@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>

Mime
View raw message