hive-user mailing list archives

From "j.barrett Strausser" <j.barrett.straus...@gmail.com>
Subject Tablesample doubling
Date Tue, 30 Jul 2013 01:51:35 GMT
Hello All,

Why does TABLESAMPLE(N ROWS) produce output with 2*N rows?


I have the following script:

DROP TABLE IF EXISTS sparse_features_small;

CREATE TABLE sparse_features_small
ROW FORMAT DELIMITED
        FIELDS TERMINATED BY ','
        LINES TERMINATED BY '\n'
AS

SELECT
        *
FROM
        sparse_features
TABLESAMPLE(50000 ROWS)


After I execute this by sourcing the file, I can then execute:







-- 


https://github.com/bearrito
@deepbearrito
