impala-dev mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3227: generate test TPC data sets during data load
Date Tue, 26 Jul 2016 18:28:58 GMT
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-3227: generate test TPC data sets during data load
......................................................................

IMPALA-3227: generate test TPC data sets during data load

The generated data is identical to the pregenerated tpch.tar.gz
and tpcds.tar.gz data that was used previously and was not
publicly accessible.

This adds a "preload" hook to bin/load-data.py that can execute custom
logic for each data set. This is used to call the TPC-H and TPC-DS data
generation utilities that are already available in the Impala toolchain.
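
As a rough illustration of the hook mechanism described above, the sketch
below looks for a "preload" file in a data set's directory and runs it if
present. The function name run_preload_hook, the datasets_dir parameter, and
the choice to invoke the hook with no arguments are assumptions made for
illustration only; the actual logic in bin/load-data.py may differ.

```python
import os
import subprocess


def run_preload_hook(datasets_dir, dataset):
    """Run <datasets_dir>/<dataset>/preload if present.

    Hypothetical helper: returns True if a hook was run, False if the
    data set has no preload hook.
    """
    hook = os.path.join(datasets_dir, dataset, "preload")
    if not os.path.isfile(hook):
        # No custom logic for this data set.
        return False
    if not os.access(hook, os.X_OK):
        raise RuntimeError("preload hook is not executable: %s" % hook)
    # check_call raises CalledProcessError on a non-zero exit status,
    # so a failed data generation step aborts the load.
    subprocess.check_call([hook], cwd=os.path.dirname(hook))
    return True
```

Under these assumptions, a call such as
run_preload_hook("testdata/datasets", "tpch") would execute
testdata/datasets/tpch/preload before the data set is loaded, and data sets
without a preload file would be loaded unchanged.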

Testing:
Ran private test job with loading from snapshot disabled and without
the tpch/tpcds tarballs available.

Change-Id: Ieccfbd7d8d4a91bffddbe35abb7f5572e71a71cf
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M bin/load-data.py
A testdata/datasets/tpcds/preload
A testdata/datasets/tpch/preload
5 files changed, 96 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/3761/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3761
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ieccfbd7d8d4a91bffddbe35abb7f5572e71a71cf
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
